Patents by Inventor Anthony Aue
Anthony Aue has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240412011
Abstract: Systems and methods are provided for implementing URL embeddings for aligning parallel documents that are corresponding web pages in at least two different languages. A computing system uses a pre-trained model of an AI system to calculate URL embeddings for each URL among a plurality of URLs. The system identifies, based on closeness of the points represented by the URL embeddings, a set of candidate parallel URLs by analyzing the URL embeddings for the plurality of URLs or for a second plurality of URLs that has been partitioned into a cluster, using a clustering algorithm. A set of parallel URLs, associated with the parallel documents, is selected from the identified set of candidate parallel URLs. Document text and/or parallel sentences are extracted from web documents associated with the set of parallel URLs to train a machine translation model for translating between two or more languages.
Type: Application
Filed: June 9, 2023
Publication date: December 12, 2024
Applicant: Microsoft Technology Licensing, LLC
Inventors: Hieu Trong Hoang, Marcin Junczys-Dowmunt, Anthony Aue
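The candidate-selection step in the abstract above (pairing URLs whose embeddings lie close together) can be pictured with a minimal sketch. This is an illustration only, not the method claimed in the application: it assumes the embeddings have already been computed by some model and are passed in as plain vectors, it uses cosine similarity as the closeness measure, and the names `cosine`, `candidate_parallel_urls`, and `min_similarity` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def candidate_parallel_urls(src_embeddings, tgt_embeddings, min_similarity=0.8):
    """Pair each source URL with its closest target URL, if close enough.

    src_embeddings / tgt_embeddings: dict mapping URL -> embedding vector.
    min_similarity: assumed cut-off below which no candidate is proposed.
    Returns a list of (source_url, target_url, similarity) tuples.
    """
    candidates = []
    for src_url, src_vec in src_embeddings.items():
        best_url, best_sim = None, min_similarity
        for tgt_url, tgt_vec in tgt_embeddings.items():
            sim = cosine(src_vec, tgt_vec)
            if sim >= best_sim:
                best_url, best_sim = tgt_url, sim
        if best_url is not None:
            candidates.append((src_url, best_url, best_sim))
    return candidates
```

The brute-force comparison of every source URL with every target URL is only practical for small sets; the abstract's mention of partitioning URLs into clusters suggests one way such a search space might be reduced.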
-
Patent number: 9614969
Abstract: The disclosure pertains to a communication system for effecting a voice or video call between at least a source user speaking a source language and a target user speaking a target language. A translation procedure is performed on call audio of the call to generate an audio translation of the source user's speech in the target language for outputting to the target user. A notification is outputted to the target user to notify the target user of a change in the behavior of the translation procedure, the change relating to the generation of the translation.
Type: Grant
Filed: February 13, 2015
Date of Patent: April 4, 2017
Assignee: Microsoft Technology Licensing, LLC
Inventors: Anthony Aue, Arul A. Menezes, Jonas Nils Lindblom, Fredrik Furesjö, Pierre P. N. Greborio
-
Patent number: 9400787
Abstract: The claimed subject matter provides a system and/or method for segmenting a multi-language text. An exemplary method comprises determining an initial probability distribution for sentences in the multi-language text, the initial probability distribution indicating the likelihood of each sentence being in each of a set of languages. A probability of language transitions across sentences may be learned based on the initial probability distribution. Additionally, a highest probability language sequence of sentences in the multi-language text may be determined based on a combination of the probability of language transitions and the prior probability distribution provided by an initial model. Further, web documents are annotated at a sentence by sentence level such that each sentence of a web document is labeled in a given language according to the highest probability language determined.
Type: Grant
Filed: November 6, 2013
Date of Patent: July 26, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventor: Anthony Aue
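Finding the highest-probability language sequence from per-sentence probabilities and language-transition probabilities, as the abstract above describes, resembles standard Viterbi decoding. The following is a minimal sketch under stated assumptions, not the patented method: the per-sentence probabilities are taken as given, the transition probability is a single hand-set value (`stay_prob`) rather than one learned from the data, and the function name `best_language_sequence` is hypothetical.

```python
import math


def best_language_sequence(sentence_probs, languages, stay_prob=0.9):
    """Return the most probable language label for each sentence (Viterbi).

    sentence_probs: list of dicts mapping language -> P(language | sentence).
    languages: list of candidate language codes.
    stay_prob: assumed probability that adjacent sentences share a language.
    """
    if not sentence_probs:
        return []
    # Remaining probability mass is split evenly over the other languages.
    switch_prob = (1.0 - stay_prob) / max(len(languages) - 1, 1)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    def transition(prev, cur):
        return log(stay_prob if prev == cur else switch_prob)

    # scores[lang] = best log-probability of any labeling ending in lang.
    scores = {lang: log(sentence_probs[0].get(lang, 0.0)) for lang in languages}
    backpointers = []
    for probs in sentence_probs[1:]:
        new_scores, pointers = {}, {}
        for cur in languages:
            best_prev = max(languages, key=lambda prev: scores[prev] + transition(prev, cur))
            new_scores[cur] = (
                scores[best_prev] + transition(best_prev, cur) + log(probs.get(cur, 0.0))
            )
            pointers[cur] = best_prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final language to recover the full sequence.
    path = [max(languages, key=lambda lang: scores[lang])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With a high `stay_prob`, a single sentence whose own probabilities only weakly favor another language is pulled toward the language of its neighbors, which is the practical benefit of decoding the sequence jointly rather than labeling each sentence in isolation.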
-
Patent number: 9223780
Abstract: The described implementations relate to processing of electronic data. One implementation is manifested as a system that can include a cache module and at least one processing device configured to execute the cache module. The cache module can be configured to store data items in slots of a cache structure, receive a request for an individual data item that maps to an individual slot of the cache structure, and, when the individual slot of the cache structure is not available, return without further processing the request. For example, the request can be received from a calling application or thread that can proceed without blocking irrespective of whether the request is fulfilled by the cache module.
Type: Grant
Filed: December 19, 2012
Date of Patent: December 29, 2015
Assignee: Microsoft Technology Licensing, LLC
Inventors: Anthony Aue, Arul A. Menezes
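The behavior described in the abstract above, where a request returns immediately instead of waiting when its slot is unavailable, can be sketched as follows. This is a hedged illustration rather than the patented design: it assumes "not available" means the slot is currently locked by another thread, and the class name `NonBlockingCache` and its methods are hypothetical.

```python
import threading


class NonBlockingCache:
    """Fixed-size cache whose operations give up rather than wait for a busy slot."""

    def __init__(self, num_slots=1024):
        self._locks = [threading.Lock() for _ in range(num_slots)]
        self._slots = [None] * num_slots  # each slot holds (key, value) or None

    def _index(self, key):
        return hash(key) % len(self._slots)

    def get(self, key):
        """Return the cached value, or None on a miss or when the slot is busy."""
        i = self._index(key)
        if not self._locks[i].acquire(blocking=False):
            return None  # slot in use by another thread: return without waiting
        try:
            entry = self._slots[i]
            if entry is not None and entry[0] == key:
                return entry[1]
            return None
        finally:
            self._locks[i].release()

    def put(self, key, value):
        """Store the value; return False (storing nothing) if the slot is busy."""
        i = self._index(key)
        if not self._locks[i].acquire(blocking=False):
            return False
        try:
            self._slots[i] = (key, value)
            return True
        finally:
            self._locks[i].release()
```

Because a busy slot is reported the same way as a miss, the caller simply recomputes the value and carries on, so the cache can never become a point where threads queue up.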
-
Publication number: 20150350451
Abstract: The disclosure pertains to a communication system for effecting a voice or video call between at least a source user speaking a source language and a target user speaking a target language. A translation procedure is performed on call audio of the call to generate an audio translation of the source user's speech in the target language for outputting to the target user. A notification is outputted to the target user to notify the target user of a change in the behaviour of the translation procedure, the change relating to the generation of the translation.
Type: Application
Filed: February 13, 2015
Publication date: December 3, 2015
Inventors: Anthony Aue, Arul A. Menezes, Jonas Nils Lindblom, Fredrik Furesjö, Pierre P.N. Greborio
-
Publication number: 20150347399
Abstract: Call audio of a call between a source user speaking a source language and a target user speaking a target language is received from a remote source user device of a source user via a communication network of a communication system, the call audio comprising speech of the source user in the source language. An automatic speech recognition procedure is performed on the call audio. A translation of the source user's speech is generated in the target language using the results of the speech recognition procedure. A translated synthetic speech audio version of the source user's speech is mixed with the source user's call audio and/or with translated audio of the target user's speech in the source language. The mixed audio signal is transmitted to a remote target user device of the target user via the communication network for outputting to at least the target user during the call.
Type: Application
Filed: February 11, 2015
Publication date: December 3, 2015
Inventors: Anthony Aue, Arul A. Menezes, Jonas Nils Lindblom, Fredrik Furesjö, Pierre P.N. Greborio
-
Patent number: 9183197
Abstract: Automated language translation often involves language translation resources of significant size (e.g., 50-gigabyte phrase tables) and significant computational power exceeding the capabilities of many mobile devices. Remotely accessible servers capable of near-realtime, automated translation may be inaccessible or prohibitively costly while traveling abroad. Presented herein are adaptations of language translation techniques for offline mobile devices involving reducing the size and raising the efficiency of the language modeling resources. A word index may be provided that stores respective string representations of the words of a language, and maps respective words to a location (e.g., address or offset) of respective word representations within the word index. Language translation resources (e.g., phrase tables) may then specify logical relationships using the word index addresses of the involved words, rather than the string equivalents.
Type: Grant
Filed: December 14, 2012
Date of Patent: November 10, 2015
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Ibrahim Eden, Christopher Quirk, Anthony Aue, Michel Galley, Frederik Schaffalitzky
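The word-index idea in the abstract above, storing each word's string once and letting phrase tables refer to words by offset, can be illustrated with a small sketch. It is an assumption-laden toy, not the patented data structure: words are packed as NUL-terminated UTF-8 in a single byte buffer, and the names `WordIndex`, `encode_phrase`, and `decode_phrase` are hypothetical.

```python
class WordIndex:
    """Stores each distinct word once and identifies it by its byte offset."""

    def __init__(self):
        self._blob = bytearray()  # all words, each NUL-terminated UTF-8
        self._offsets = {}        # word -> offset of its first byte in the blob

    def add(self, word):
        """Return the offset for a word, appending it to the blob if it is new."""
        if word not in self._offsets:
            self._offsets[word] = len(self._blob)
            self._blob += word.encode("utf-8") + b"\x00"
        return self._offsets[word]

    def lookup(self, offset):
        """Recover the word string stored at the given offset."""
        end = self._blob.index(b"\x00", offset)
        return self._blob[offset:end].decode("utf-8")


def encode_phrase(index, phrase):
    """Represent a whitespace-separated phrase as a tuple of word offsets."""
    return tuple(index.add(word) for word in phrase.split())


def decode_phrase(index, offsets):
    """Turn a tuple of word offsets back into a phrase string."""
    return " ".join(index.lookup(offset) for offset in offsets)
```

A phrase table built this way, for example `{encode_phrase(idx, "good morning"): encode_phrase(idx, "guten Morgen")}`, holds only small integers, so a word that appears in thousands of phrases costs its string storage just once.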
-
Publication number: 20140172407
Abstract: Automated language translation often involves language translation resources of significant size (e.g., 50-gigabyte phrase tables) and significant computational power exceeding the capabilities of many mobile devices. Remotely accessible servers capable of near-realtime, automated translation may be inaccessible or prohibitively costly while traveling abroad. Presented herein are adaptations of language translation techniques for offline mobile devices involving reducing the size and raising the efficiency of the language modeling resources. A word index may be provided that stores respective string representations of the words of a language, and maps respective words to a location (e.g., address or offset) of respective word representations within the word index. Language translation resources (e.g., phrase tables) may then specify logical relationships using the word index addresses of the involved words, rather than the string equivalents.
Type: Application
Filed: December 14, 2012
Publication date: June 19, 2014
Applicant: Microsoft Corporation
Inventors: Ibrahim Eden, Christopher Quirk, Anthony Aue, Michel Galley, Frederik Schaffalitzky
-
Publication number: 20140173200
Abstract: The described implementations relate to processing of electronic data. One implementation is manifested as a system that can include a cache module and at least one processing device configured to execute the cache module. The cache module can be configured to store data items in slots of a cache structure, receive a request for an individual data item that maps to an individual slot of the cache structure, and, when the individual slot of the cache structure is not available, return without further processing the request. For example, the request can be received from a calling application or thread that can proceed without blocking irrespective of whether the request is fulfilled by the cache module.
Type: Application
Filed: December 19, 2012
Publication date: June 19, 2014
Applicant: Microsoft Corporation
Inventors: Anthony Aue, Arul A. Menezes
-
Publication number: 20140067365
Abstract: The claimed subject matter provides a system and/or method for segmenting a multi-language text. An exemplary method comprises determining an initial probability distribution for sentences in the multi-language text, the initial probability distribution indicating the likelihood of each sentence being in each of a set of languages. A probability of language transitions across sentences may be learned based on the initial probability distribution. Additionally, a highest probability language sequence of sentences in the multi-language text may be determined based on a combination of the probability of language transitions and the prior probability distribution provided by an initial model. Further, web documents are annotated at a sentence by sentence level such that each sentence of a web document is labeled in a given language according to the highest probability language determined.
Type: Application
Filed: November 6, 2013
Publication date: March 6, 2014
Applicant: MICROSOFT CORPORATION
Inventor: Anthony Aue
-
Patent number: 8600730
Abstract: A system and method for segmenting a multi-language text is provided. An exemplary method comprises determining an initial probability distribution for sentences in the multi-language text, the initial probability distribution indicating the likelihood of each sentence being in each of a set of languages. A probability of language transitions across sentences may be learned based on the initial probability distribution. Additionally, a highest probability language sequence of sentences in the multi-language text may be determined based on a combination of the probability of language transitions and the prior probability distribution provided by an initial model.
Type: Grant
Filed: February 8, 2011
Date of Patent: December 3, 2013
Assignee: Microsoft Corporation
Inventor: Anthony Aue
-
Publication number: 20130103695
Abstract: Various technologies described herein pertain to detecting machine translated content. Documents in a document pair are mutual lingual translations of each other. Further, document level features of the documents in the document pair can be identified. The document level features can correlate with translation quality between the documents in the document pair. Moreover, statistical classification can be used to detect whether the document pair is generated through machine translation based at least in part upon the document level features. Further, a first document can be a machine translation of a second document in the document pair or a disparate document when generated through machine translation.
Type: Application
Filed: October 21, 2011
Publication date: April 25, 2013
Applicant: Microsoft Corporation
Inventors: Spencer Taylor Rarrick, William Duncan Lewis, Christopher Brian Quirk, Anthony Aue
-
Patent number: 8271869
Abstract: Technology is described for identifying language translations for source documents. The method includes finding source documents containing links to target documents and the link anchors of the links have language indicating text. A first tuple set can be generated for paired source documents and target documents with an expected target language for a target document. The first tuple set can be annotated with primary languages for the source documents and target documents to form a second tuple set where primary languages of the source documents and target documents are different. Further, a third tuple set can be generated using the second tuple set using a count of the number of times source documents and target documents occur in the first tuple set. Tuples can be removed from the third tuple set where a count ratio between source document count and target document count is less than a reference ratio.
Type: Grant
Filed: October 8, 2010
Date of Patent: September 18, 2012
Assignee: Microsoft Corporation
Inventor: Anthony Aue
-
Publication number: 20120203540
Abstract: The claimed subject matter provides a system and/or method for segmenting a multi-language text. An exemplary method comprises determining an initial probability distribution for sentences in the multi-language text, the initial probability distribution indicating the likelihood of each sentence being in each of a set of languages. A probability of language transitions across sentences may be learned based on the initial probability distribution. Additionally, a highest probability language sequence of sentences in the multi-language text may be determined based on a combination of the probability of language transitions and the prior probability distribution provided by an initial model.
Type: Application
Filed: February 8, 2011
Publication date: August 9, 2012
Applicant: Microsoft Corporation
Inventor: Anthony Aue
-
Publication number: 20120089898
Abstract: Technology is described for identifying language translations for source documents. The method includes finding source documents containing links to target documents and the link anchors of the links have language indicating text. A first tuple set can be generated for paired source documents and target documents with an expected target language for a target document. The first tuple set can be annotated with primary languages for the source documents and target documents to form a second tuple set where primary languages of the source documents and target documents are different. Further, a third tuple set can be generated using the second tuple set using a count of the number of times source documents and target documents occur in the first tuple set. Tuples can be removed from the third tuple set where a count ratio between source document count and target document count is less than a reference ratio.
Type: Application
Filed: October 8, 2010
Publication date: April 12, 2012
Applicant: Microsoft Corporation
Inventor: Anthony Aue
-
Patent number: 7835902
Abstract: A computer-implemented system and method for assessing the editorial quality of a textual unit (document, paragraph or sentence) is provided. The method includes generating a plurality of training-time feature vectors by automatically extracting features from first and last versions of training documents. The method also includes training a machine-learned classifier based on the plurality of training-time feature vectors. A run-time feature vector is generated for the textual unit to be assessed by automatically extracting features from the textual unit. The run-time feature vector is evaluated using the machine-learned classifier to provide an assessment of the editorial quality of the textual unit.
Type: Grant
Filed: October 20, 2004
Date of Patent: November 16, 2010
Assignee: Microsoft Corporation
Inventors: Michael Gamon, Anthony Aue
-
Patent number: 7788087
Abstract: The present invention provides a system for identifying, extracting, clustering and analyzing sentiment-bearing text. In one embodiment, the invention implements a pipeline capable of accessing raw text and presenting it in a highly usable and intuitive way.
Type: Grant
Filed: April 14, 2005
Date of Patent: August 31, 2010
Assignee: Microsoft Corporation
Inventors: Simon H. Corston-Oliver, Anthony Aue, Eric K. Ringger, Michael Gamon
-
Patent number: 7788086
Abstract: The present invention provides a system for identifying, extracting, clustering and analyzing sentiment-bearing text. In one embodiment, the invention implements a pipeline capable of accessing raw text and presenting it in a highly usable and intuitive way.
Type: Grant
Filed: April 14, 2005
Date of Patent: August 31, 2010
Assignee: Microsoft Corporation
Inventors: Simon H. Corston-Oliver, Anthony Aue, Eric K. Ringger, Michael Gamon
-
Patent number: 7593843
Abstract: A method of decoding an input semantic structure to generate an output semantic structure. A set of transfer mappings are provided. A score is calculated for at least one transfer mapping in the set of transfer mappings using a statistical model. At least one transfer mapping is selected based on the score and used to construct the output semantic structure.
Type: Grant
Filed: March 30, 2004
Date of Patent: September 22, 2009
Assignee: Microsoft Corporation
Inventors: Anthony Aue, Eric K. Ringger, Christopher B. Quirk, Arul A. Menezes, Robert C. Moore
-
Publication number: 20060200342
Abstract: The present invention provides a system for identifying, extracting, clustering and analyzing sentiment-bearing text. In one embodiment, the invention implements a pipeline capable of accessing raw text and presenting it in a highly usable and intuitive way.
Type: Application
Filed: April 14, 2005
Publication date: September 7, 2006
Applicant: Microsoft Corporation
Inventors: Simon Corston-Oliver, Anthony Aue, Eric Ringger, Michael Gamon