Patents by Inventor Anthony Aue

Anthony Aue has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240412011
    Abstract: Systems and methods are provided for implementing URL embeddings for aligning parallel documents that are corresponding web pages in at least two different languages. A computing system uses a pre-trained model of an AI system to calculate URL embeddings for each URL among a plurality of URLs. The system identifies, based on closeness of the points represented by the URL embeddings, a set of candidate parallel URLs by analyzing the URL embeddings for the plurality of URLs or for a second plurality of URLs that has been partitioned into a cluster, using a clustering algorithm. A set of parallel URLs, associated with the parallel documents, is selected from the identified set of candidate parallel URLs. Document text and/or parallel sentences are extracted from web documents associated with the set of parallel URLs to train a machine translation model for translating between two or more languages.
    Type: Application
    Filed: June 9, 2023
    Publication date: December 12, 2024
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Hieu Trong Hoang, Marcin Junczys-Dowmunt, Anthony Aue
  • Patent number: 9614969
    Abstract: The disclosure pertains to a communication system for effecting a voice or video call between at least a source user speaking a source language and a target user speaking a target language. A translation procedure is performed on call audio of the call to generate an audio translation of the source user's speech in the target language for outputting to the target user. A notification is outputted to the target user to notify the target user of a change in the behavior of the translation procedure, the change relating to the generation of the translation.
    Type: Grant
    Filed: February 13, 2015
    Date of Patent: April 4, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Anthony Aue, Arul A. Menezes, Jonas Nils Lindblom, Fredrik Furesjö, Pierre P. N. Greborio
  • Patent number: 9400787
    Abstract: The claimed subject matter provides a system and/or method for segmenting a multi-language text. An exemplary method comprises determining an initial probability distribution for sentences in the multi-language text, the initial probability distribution indicating the likelihood of each sentence being in each of a set of languages. A probability of language transitions across sentences may be learned based on the initial probability distribution. Additionally, a highest probability language sequence of sentences in the multi-language text may be determined based on a combination of the probability of language transitions and the prior probability distribution provided by an initial model. Further, web documents are annotated at a sentence by sentence level such that each sentence of a web document is labeled in a given language according to the highest probability language determined.
    Type: Grant
    Filed: November 6, 2013
    Date of Patent: July 26, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Anthony Aue
  • Patent number: 9223780
    Abstract: The described implementations relate to processing of electronic data. One implementation is manifested as a system that can include a cache module and at least one processing device configured to execute the cache module. The cache module can be configured to store data items in slots of a cache structure, receive a request for an individual data item that maps to an individual slot of the cache structure, and, when the individual slot of the cache structure is not available, return without further processing the request. For example, the request can be received from a calling application or thread that can proceed without blocking irrespective of whether the request is fulfilled by the cache module.
    Type: Grant
    Filed: December 19, 2012
    Date of Patent: December 29, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Anthony Aue, Arul A. Menezes
  • Publication number: 20150350451
    Abstract: The disclosure pertains to a communication system for effecting a voice or video call between at least a source user speaking a source language and a target user speaking a target language. A translation procedure is performed on call audio of the call to generate an audio translation of the source user's speech in the target language for outputting to the target user. A notification is outputted to the target user to notify the target user of a change in the behaviour of the translation procedure, the change relating to the generation of the translation.
    Type: Application
    Filed: February 13, 2015
    Publication date: December 3, 2015
    Inventors: Anthony Aue, Arul A. Menezes, Jonas Nils Lindblom, Fredrik Furesjö, Pierre P. N. Greborio
  • Publication number: 20150347399
    Abstract: Call audio of a call between a source user speaking a source language and a target user speaking a target language is received from a remote source user device of a source user via a communication network of a communication system, the call audio comprising speech of the source user in the source language. An automatic speech recognition procedure is performed on the call audio. A translation of the source user's speech is generated in the target language using the results of the speech recognition procedure. A translated synthetic speech audio version of the source user's speech is mixed with the source user's call audio and/or with translated audio of the target user's speech in the source language. The mixed audio signal is transmitted to a remote target user device of the target user via the communication network for outputting to at least the target user during the call.
    Type: Application
    Filed: February 11, 2015
    Publication date: December 3, 2015
    Inventors: Anthony Aue, Arul A. Menezes, Jonas Nils Lindblom, Fredrik Furesjö, Pierre P. N. Greborio
  • Patent number: 9183197
    Abstract: Automated language translation often involves language translation resources of significant size (e.g., 50-gigabyte phrase tables) and significant computational power exceeding the capabilities of many mobile devices. Remotely accessible servers capable of near-realtime, automated translation may be inaccessible or prohibitively costly while traveling abroad. Presented herein are adaptations of language translation techniques for offline mobile devices involving reducing the size and raising the efficiency of the language modeling resources. A word index may be provided that stores respective string representations of the words of a language, and maps respective words to a location (e.g., address or offset) of respective word representations within the word index. Language translation resources (e.g., phrase tables) may then specify logical relationships using the word index addresses of the involved words, rather than the string equivalents.
    Type: Grant
    Filed: December 14, 2012
    Date of Patent: November 10, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ibrahim Eden, Christopher Quirk, Anthony Aue, Michel Galley, Frederik Schaffalitzky
  • Publication number: 20140172407
    Abstract: Automated language translation often involves language translation resources of significant size (e.g., 50-gigabyte phrase tables) and significant computational power exceeding the capabilities of many mobile devices. Remotely accessible servers capable of near-realtime, automated translation may be inaccessible or prohibitively costly while traveling abroad. Presented herein are adaptations of language translation techniques for offline mobile devices involving reducing the size and raising the efficiency of the language modeling resources. A word index may be provided that stores respective string representations of the words of a language, and maps respective words to a location (e.g., address or offset) of respective word representations within the word index. Language translation resources (e.g., phrase tables) may then specify logical relationships using the word index addresses of the involved words, rather than the string equivalents.
    Type: Application
    Filed: December 14, 2012
    Publication date: June 19, 2014
    Applicant: Microsoft Corporation
    Inventors: Ibrahim Eden, Christopher Quirk, Anthony Aue, Michel Galley, Frederik Schaffalitzky
  • Publication number: 20140173200
    Abstract: The described implementations relate to processing of electronic data. One implementation is manifested as a system that can include a cache module and at least one processing device configured to execute the cache module. The cache module can be configured to store data items in slots of a cache structure, receive a request for an individual data item that maps to an individual slot of the cache structure, and, when the individual slot of the cache structure is not available, return without further processing the request. For example, the request can be received from a calling application or thread that can proceed without blocking irrespective of whether the request is fulfilled by the cache module.
    Type: Application
    Filed: December 19, 2012
    Publication date: June 19, 2014
    Applicant: Microsoft Corporation
    Inventors: Anthony Aue, Arul A. Menezes
  • Publication number: 20140067365
    Abstract: The claimed subject matter provides a system and/or method for segmenting a multi-language text. An exemplary method comprises determining an initial probability distribution for sentences in the multi-language text, the initial probability distribution indicating the likelihood of each sentence being in each of a set of languages. A probability of language transitions across sentences may be learned based on the initial probability distribution. Additionally, a highest probability language sequence of sentences in the multi-language text may be determined based on a combination of the probability of language transitions and the prior probability distribution provided by an initial model. Further, web documents are annotated at a sentence by sentence level such that each sentence of a web document is labeled in a given language according to the highest probability language determined.
    Type: Application
    Filed: November 6, 2013
    Publication date: March 6, 2014
    Applicant: Microsoft Corporation
    Inventor: Anthony Aue
  • Patent number: 8600730
    Abstract: A system and method for segmenting a multi-language text is provided. An exemplary method comprises determining an initial probability distribution for sentences in the multi-language text, the initial probability distribution indicating the likelihood of each sentence being in each of a set of languages. A probability of language transitions across sentences may be learned based on the initial probability distribution. Additionally, a highest probability language sequence of sentences in the multi-language text may be determined based on a combination of the probability of language transitions and the prior probability distribution provided by an initial model.
    Type: Grant
    Filed: February 8, 2011
    Date of Patent: December 3, 2013
    Assignee: Microsoft Corporation
    Inventor: Anthony Aue
  • Publication number: 20130103695
    Abstract: Various technologies described herein pertain to detecting machine translated content. Documents in a document pair are mutual lingual translations of each other. Further, document level features of the documents in the document pair can be identified. The document level features can correlate with translation quality between the documents in the document pair. Moreover, statistical classification can be used to detect whether the document pair is generated through machine translation based at least in part upon the document level features. Further, a first document can be a machine translation of a second document in the document pair or a disparate document when generated through machine translation.
    Type: Application
    Filed: October 21, 2011
    Publication date: April 25, 2013
    Applicant: Microsoft Corporation
    Inventors: Spencer Taylor Rarrick, William Duncan Lewis, Christopher Brian Quirk, Anthony Aue
  • Patent number: 8271869
    Abstract: Technology is described for identifying language translations for source documents. The method includes finding source documents containing links to target documents and the link anchors of the links have language indicating text. A first tuple set can be generated for paired source documents and target documents with an expected target language for a target document. The first tuple set can be annotated with primary languages for the source documents and target documents to form a second tuple set where primary languages of the source documents and target documents are different. Further, a third tuple set can be generated using the second tuple set using a count of the number of times source documents and target documents occur in the first tuple set. Tuples can be removed from the third tuple set where a count ratio between source document count and target document count is less than a reference ratio.
    Type: Grant
    Filed: October 8, 2010
    Date of Patent: September 18, 2012
    Assignee: Microsoft Corporation
    Inventor: Anthony Aue
  • Publication number: 20120203540
    Abstract: The claimed subject matter provides a system and/or method for segmenting a multi-language text. An exemplary method comprises determining an initial probability distribution for sentences in the multi-language text, the initial probability distribution indicating the likelihood of each sentence being in each of a set of languages. A probability of language transitions across sentences may be learned based on the initial probability distribution. Additionally, a highest probability language sequence of sentences in the multi-language text may be determined based on a combination of the probability of language transitions and the prior probability distribution provided by an initial model.
    Type: Application
    Filed: February 8, 2011
    Publication date: August 9, 2012
    Applicant: Microsoft Corporation
    Inventor: Anthony Aue
  • Publication number: 20120089898
    Abstract: Technology is described for identifying language translations for source documents. The method includes finding source documents containing links to target documents and the link anchors of the links have language indicating text. A first tuple set can be generated for paired source documents and target documents with an expected target language for a target document. The first tuple set can be annotated with primary languages for the source documents and target documents to form a second tuple set where primary languages of the source documents and target documents are different. Further, a third tuple set can be generated using the second tuple set using a count of the number of times source documents and target documents occur in the first tuple set. Tuples can be removed from the third tuple set where a count ratio between source document count and target document count is less than a reference ratio.
    Type: Application
    Filed: October 8, 2010
    Publication date: April 12, 2012
    Applicant: Microsoft Corporation
    Inventor: Anthony Aue
  • Patent number: 7835902
    Abstract: A computer-implemented system and method for assessing the editorial quality of a textual unit (document, paragraph or sentence) is provided. The method includes generating a plurality of training-time feature vectors by automatically extracting features from first and last versions of training documents. The method also includes training a machine-learned classifier based on the plurality of training-time feature vectors. A run-time feature vector is generated for the textual unit to be assessed by automatically extracting features from the textual unit. The run-time feature vector is evaluated using the machine-learned classifier to provide an assessment of the editorial quality of the textual unit.
    Type: Grant
    Filed: October 20, 2004
    Date of Patent: November 16, 2010
    Assignee: Microsoft Corporation
    Inventors: Michael Gamon, Anthony Aue
  • Patent number: 7788087
    Abstract: The present invention provides a system for identifying, extracting, clustering and analyzing sentiment-bearing text. In one embodiment, the invention implements a pipeline capable of accessing raw text and presenting it in a highly usable and intuitive way.
    Type: Grant
    Filed: April 14, 2005
    Date of Patent: August 31, 2010
    Assignee: Microsoft Corporation
    Inventors: Simon H. Corston-Oliver, Anthony Aue, Eric K. Ringger, Michael Gamon
  • Patent number: 7788086
    Abstract: The present invention provides a system for identifying, extracting, clustering and analyzing sentiment-bearing text. In one embodiment, the invention implements a pipeline capable of accessing raw text and presenting it in a highly usable and intuitive way.
    Type: Grant
    Filed: April 14, 2005
    Date of Patent: August 31, 2010
    Assignee: Microsoft Corporation
    Inventors: Simon H. Corston-Oliver, Anthony Aue, Eric K. Ringger, Michael Gamon
  • Patent number: 7593843
    Abstract: A method of decoding an input semantic structure to generate an output semantic structure. A set of transfer mappings are provided. A score is calculated for at least one transfer mapping in the set of transfer mappings using a statistical model. At least one transfer mapping is selected based on the score and used to construct the output semantic structure.
    Type: Grant
    Filed: March 30, 2004
    Date of Patent: September 22, 2009
    Assignee: Microsoft Corporation
    Inventors: Anthony Aue, Eric K. Ringger, Christopher B. Quirk, Arul A. Menezes, Robert C. Moore
  • Publication number: 20060200342
    Abstract: The present invention provides a system for identifying, extracting, clustering and analyzing sentiment-bearing text. In one embodiment, the invention implements a pipeline capable of accessing raw text and presenting it in a highly usable and intuitive way.
    Type: Application
    Filed: April 14, 2005
    Publication date: September 7, 2006
    Applicant: Microsoft Corporation
    Inventors: Simon Corston-Oliver, Anthony Aue, Eric Ringger, Michael Gamon
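
Illustrative note: the abstract of Patent 8600730 (shared with the related filings 9400787, 20140067365 and 20120203540) describes combining a per-sentence language probability distribution with language-transition probabilities to find the highest-probability language sequence for a multi-language text. The sketch below shows one way such a combination could work, assuming a Viterbi-style dynamic program over log probabilities. The function name `segment_languages`, the fixed `stay_prob` transition model, and the toy probabilities are hypothetical choices made for illustration; the abstracts describe learning the transition probabilities, and nothing here is taken from the patents' actual implementation.

```python
import math


def segment_languages(sentence_probs, languages, stay_prob=0.9):
    """Return the most probable language label for each sentence.

    sentence_probs: list of dicts mapping language -> P(language | sentence),
        standing in for the "initial probability distribution".
    stay_prob: assumed probability that two consecutive sentences share a
        language (fixed here for simplicity; hypothetical, not learned).
    """
    if not sentence_probs:
        return []

    tiny = 1e-12  # floor that keeps math.log away from zero
    switch_prob = (1.0 - stay_prob) / max(len(languages) - 1, 1)

    def log_p(p):
        return math.log(max(p, tiny))

    def log_trans(prev, cur):
        return log_p(stay_prob if prev == cur else switch_prob)

    # best[i][lang]: best log score of any label sequence that ends in
    # `lang` at sentence i.
    best = [{lang: log_p(sentence_probs[0].get(lang, 0.0)) for lang in languages}]
    back = []  # back[i][lang]: best predecessor of `lang` at sentence i + 1

    for probs in sentence_probs[1:]:
        scores, pointers = {}, {}
        for cur in languages:
            prev = max(languages, key=lambda p: best[-1][p] + log_trans(p, cur))
            scores[cur] = (
                best[-1][prev] + log_trans(prev, cur) + log_p(probs.get(cur, 0.0))
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Trace the best path backwards from the best final label.
    path = [max(languages, key=lambda lang: best[-1][lang])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy input: the second sentence looks slightly more French on its own,
# but its English neighbours pull the sequence-level label to English.
toy_probs = [
    {"en": 0.95, "fr": 0.05},
    {"en": 0.45, "fr": 0.55},
    {"en": 0.80, "fr": 0.20},
    {"en": 0.10, "fr": 0.90},
    {"en": 0.20, "fr": 0.80},
]
labels = segment_languages(toy_probs, ["en", "fr"])
print(labels)
```

The point of scoring whole sequences rather than sentences in isolation is that a short or ambiguous sentence can borrow evidence from its neighbours, which matches the abstract's emphasis on transition probabilities across sentences.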