Patents by Inventor Alejandro Acero

Alejandro Acero has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20180330723
    Abstract: Systems and processes for operating a digital assistant are provided. In an example process, low-latency operation of a digital assistant is provided. In this example, natural language processing, task flow processing, dialogue flow processing, speech synthesis, or any combination thereof can be at least partially performed while awaiting detection of a speech end-point condition. Upon detection of a speech end-point condition, results obtained from performing the operations can be presented to the user. In another example, robust operation of a digital assistant is provided. In this example, task flow processing by the digital assistant can include selecting a candidate task flow from a plurality of candidate task flows based on determined task flow scores. The task flow scores can be based on speech recognition confidence scores, intent confidence scores, flow parameter scores, or any combination thereof. The selected candidate task flow is executed and corresponding results presented to the user.
    Type: Application
    Filed: August 17, 2017
    Publication date: November 15, 2018
    Inventors: Alejandro ACERO, Hepeng ZHANG
  • Patent number: 10055686
    Abstract: A deep structured semantic module (DSSM) is described herein which uses a model that is discriminatively trained based on click-through data, e.g., such that a conditional likelihood of clicked documents, given respective queries, is maximized, and a conditional likelihood of non-clicked documents, given the queries, is reduced. In operation, after training is complete, the DSSM maps an input item into an output item expressed in a semantic space, using the trained model. To facilitate training and runtime operation, a dimensionality-reduction module (DRM) can reduce the dimensionality of the input item that is fed to the DSSM. A search engine may use the above-summarized functionality to convert a query and a plurality of documents into the common semantic space, and then determine the similarity between the query and documents in the semantic space. The search engine may then rank the documents based, at least in part, on the similarity measures.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: August 21, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alejandro Acero, Larry P. Heck
  • Patent number: 9984678
    Abstract: Various technologies described herein pertain to adapting a speech recognizer to input speech data. A first linear transform can be selected from a first set of linear transforms based on a value of a first variability source corresponding to the input speech data, and a second linear transform can be selected from a second set of linear transforms based on a value of a second variability source corresponding to the input speech data. The linear transforms in the first and second sets can compensate for the first variability source and the second variability source, respectively. Moreover, the first linear transform can be applied to the input speech data to generate intermediate transformed speech data, and the second linear transform can be applied to the intermediate transformed speech data to generate transformed speech data. Further, speech can be recognized based on the transformed speech data to obtain a result.
    Type: Grant
    Filed: March 23, 2012
    Date of Patent: May 29, 2018
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Michael Lewis Seltzer, Alejandro Acero
  • Patent number: 9928296
    Abstract: One or more techniques and/or systems are disclosed for creating an expanded or improved lexicon for use in search-based semantic tagging. A set of first documents can be identified using a set of first lexicon elements as queries, and one or more first document patterns can be extracted from the set of first documents. The document patterns can be used to find one or more second documents in a query log that comprise the document patterns, which are associated with query terms used to return the second documents. The query terms for the second documents can be extracted and used to expand the lexicon. Elements within the lexicon may be weighted based upon relevance to different query domains, for example.
    Type: Grant
    Filed: December 16, 2010
    Date of Patent: March 27, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xiao Li, Jingjing Liu, Alejandro Acero, Ye-Yi Wang
  • Patent number: 9786284
    Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: October 10, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
  • Patent number: 9684741
    Abstract: A query may be applied against search engines that respectively return a set of search results relating to various items discovered in the searched data sets. However, presenting numerous and varied search results may be difficult on mobile devices with small displays and limited computational resources. Instead, search results may be associated with search domains representing various information types (e.g., contacts, public figures, places, projects, movies, music, and books) and presented by grouping search results with associated query domains, e.g., in a tabbed user interface. The query may be received through an input device associated with a particular input domain, and may be transitioned to the query domain of a particular search engine (e.g., by recognizing phonemes of a voice query using an acoustic model; matching phonemes with query terms according to a pronunciation model; and generating a recognition result according to a vocabulary of an n-gram language model).
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: June 20, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xiao Li, Patrick Nguyen, Geoffrey Zweig, Alejandro Acero
  • Patent number: 9519859
    Abstract: A deep structured semantic module (DSSM) is described herein which uses a model that is discriminatively trained based on click-through data, e.g., such that a conditional likelihood of clicked documents, given respective queries, is maximized, and a conditional likelihood of non-clicked documents, given the queries, is reduced. In operation, after training is complete, the DSSM maps an input item into an output item expressed in a semantic space, using the trained model. To facilitate training and runtime operation, a dimensionality-reduction module (DRM) can reduce the dimensionality of the input item that is fed to the DSSM. A search engine may use the above-summarized functionality to convert a query and a plurality of documents into the common semantic space, and then determine the similarity between the query and documents in the semantic space. The search engine may then rank the documents based, at least in part, on the similarity measures.
    Type: Grant
    Filed: September 6, 2013
    Date of Patent: December 13, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alejandro Acero, Larry P. Heck
  • Publication number: 20160321321
    Abstract: A deep structured semantic module (DSSM) is described herein which uses a model that is discriminatively trained based on click-through data, e.g., such that a conditional likelihood of clicked documents, given respective queries, is maximized, and a conditional likelihood of non-clicked documents, given the queries, is reduced. In operation, after training is complete, the DSSM maps an input item into an output item expressed in a semantic space, using the trained model. To facilitate training and runtime operation, a dimensionality-reduction module (DRM) can reduce the dimensionality of the input item that is fed to the DSSM. A search engine may use the above-summarized functionality to convert a query and a plurality of documents into the common semantic space, and then determine the similarity between the query and documents in the semantic space. The search engine may then rank the documents based, at least in part, on the similarity measures.
    Type: Application
    Filed: July 12, 2016
    Publication date: November 3, 2016
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alejandro Acero, Larry P. Heck
  • Patent number: 9390371
    Abstract: A method is disclosed herein that includes an act of causing a processor to access a deep-structured, layered or hierarchical model, called a deep convex network, retained in a computer-readable medium, wherein the deep-structured model comprises a plurality of layers with weights assigned thereto. This layered model can produce the output serving as the scores to combine with transition probabilities between states in a hidden Markov model and language model scores to form a full speech recognizer. Batch-based, convex optimization is performed to learn a portion of the deep convex network's weights, rendering it appropriate for parallel computation to accomplish the training. The method can further include the act of jointly substantially optimizing the weights, the transition probabilities, and the language model scores of the deep-structured model using the optimization criterion based on a sequence rather than a set of unrelated frames.
    Type: Grant
    Filed: June 17, 2013
    Date of Patent: July 12, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Li Deng, Dong Yu, Alejandro Acero
  • Patent number: 9264807
    Abstract: A multichannel acoustic echo reduction system is described herein. The system includes an acoustic echo canceller (AEC) component having a fixed filter for each respective combination of loudspeaker and microphone signals and having an adaptive filter for each microphone signal. For each microphone signal, the AEC component modifies the microphone signal to reduce contributions from the outputs of the loudspeakers based at least in part on the respective adaptive filter associated with the microphone signal and the set of fixed filters associated with the respective microphone signal.
    Type: Grant
    Filed: January 23, 2013
    Date of Patent: February 16, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ivan Jelev Tashev, Alejandro Acero, Nilesh Madhu
  • Patent number: 9218412
    Abstract: A database having listings rather than long documents is searched using a term frequency-inverse document frequency (Tf/Idf) algorithm.
    Type: Grant
    Filed: May 10, 2007
    Date of Patent: December 22, 2015
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero, Geoffrey G. Zweig
  • Patent number: 9054764
    Abstract: A novel beamforming post-processor technique with enhanced noise suppression capability. The present beamforming post-processor technique is a non-linear post-processing technique for sensor arrays (e.g., microphone arrays) which improves the directivity and signal separation capabilities. The technique works in so-called instantaneous direction of arrival space, estimates the probability for sound coming from a given incident angle or look-up direction and applies a time-varying, gain based, spatio-temporal filter for suppressing sounds coming from directions other than the sound source direction, resulting in minimal artifacts and musical noise.
    Type: Grant
    Filed: July 20, 2011
    Date of Patent: June 9, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ivan Tashev, Alejandro Acero
  • Patent number: 9009039
    Abstract: Technologies are described herein for noise adaptive training to achieve robust automatic speech recognition. Through the use of these technologies, a noise adaptive training (NAT) approach may use both clean and corrupted speech for training. The NAT approach may normalize the environmental distortion as part of the model training. A set of underlying “pseudo-clean” model parameters may be estimated directly. This may be done without point estimation of clean speech features as an intermediate step. The pseudo-clean model parameters learned from the NAT technique may be used with a Vector Taylor Series (VTS) adaptation. Such adaptation may support decoding noisy utterances during the operating phase of an automatic voice recognition system.
    Type: Grant
    Filed: June 12, 2009
    Date of Patent: April 14, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Lewis Seltzer, James Garnet Droppo, Ozlem Kalinli, Alejandro Acero
  • Publication number: 20150074027
    Abstract: A deep structured semantic module (DSSM) is described herein which uses a model that is discriminatively trained based on click-through data, e.g., such that a conditional likelihood of clicked documents, given respective queries, is maximized, and a conditional likelihood of non-clicked documents, given the queries, is reduced. In operation, after training is complete, the DSSM maps an input item into an output item expressed in a semantic space, using the trained model. To facilitate training and runtime operation, a dimensionality-reduction module (DRM) can reduce the dimensionality of the input item that is fed to the DSSM. A search engine may use the above-summarized functionality to convert a query and a plurality of documents into the common semantic space, and then determine the similarity between the query and documents in the semantic space. The search engine may then rank the documents based, at least in part, on the similarity measures.
    Type: Application
    Filed: September 6, 2013
    Publication date: March 12, 2015
    Applicant: Microsoft Corporation
    Inventors: Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alejandro Acero, Larry P. Heck
  • Patent number: 8965765
    Abstract: Described is a technology by which a structured model of repetition is used to determine the words spoken by a user, and/or a corresponding database entry, based in part on a prior utterance. For a repeated utterance, a joint probability analysis is performed on (at least some of) the corresponding word sequences (as recognized by one or more recognizers) and associated acoustic data. For example, a generative probabilistic model, or a maximum entropy model may be used in the analysis. The second utterance may be a repetition of the first utterance using the exact words, or another structural transformation thereof relative to the first utterance, such as an extension that adds one or more words, a truncation that removes one or more words, or a whole or partial spelling of one or more words.
    Type: Grant
    Filed: September 19, 2008
    Date of Patent: February 24, 2015
    Assignee: Microsoft Corporation
    Inventors: Geoffrey G. Zweig, Xiao Li, Dan Bohus, Alejandro Acero, Eric J. Horvitz
  • Patent number: 8942978
    Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
    Type: Grant
    Filed: July 14, 2011
    Date of Patent: January 27, 2015
    Assignee: Microsoft Corporation
    Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero
  • Publication number: 20140358525
    Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.
    Type: Application
    Filed: August 14, 2014
    Publication date: December 4, 2014
    Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
  • Patent number: 8818797
    Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.
    Type: Grant
    Filed: December 23, 2010
    Date of Patent: August 26, 2014
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
  • Patent number: 8818002
    Abstract: A novel adaptive beamforming technique with enhanced noise suppression capability. The technique incorporates the sound-source presence probability into an adaptive blocking matrix. In one embodiment the sound-source presence probability is estimated based on the instantaneous direction of arrival of the input signals and voice activity detection. The technique guarantees robustness to steering vector errors without imposing ad hoc constraints on the adaptive filter coefficients. It can provide good suppression performance for both directional interference signals and isotropic ambient noise.
    Type: Grant
    Filed: July 21, 2011
    Date of Patent: August 26, 2014
    Assignee: Microsoft Corp.
    Inventors: Ivan Tashev, Alejandro Acero, Byung-Jun Yoon
  • Publication number: 20140229158
    Abstract: A system is described herein which uses a neural network having an input layer that accepts an input vector and a feature vector. The input vector represents at least part of input information, such as, but not limited to, a word or phrase in a sequence of input words. The feature vector provides supplemental information pertaining to the input information. The neural network produces an output vector based on the input vector and the feature vector. In one implementation, the neural network is a recurrent neural network. Also described herein are various applications of the system, including a machine translation application.
    Type: Application
    Filed: February 10, 2013
    Publication date: August 14, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: Geoffrey G. Zweig, Tomas Mikolov, Alejandro Acero
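
The last abstract above (publication 20140229158) describes a neural network whose input layer accepts both an input vector and a supplemental feature vector, and notes that in one implementation the network is recurrent. The following is a minimal illustrative sketch of that idea, not the method claimed in the application. The class name `FeatureAugmentedRNN`, the layer sizes, the random weights, and the choice of tanh and softmax are all assumptions made for illustration.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights; a real system would learn these by training."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def one_hot(index, size):
    """Encode a word id as a one-hot input vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


class FeatureAugmentedRNN:
    """Hypothetical recurrent net fed an input vector plus a feature vector."""

    def __init__(self, vocab_size, feature_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.w_in = random_matrix(hidden_size, vocab_size, rng)      # input -> hidden
        self.w_feat = random_matrix(hidden_size, feature_size, rng)  # feature -> hidden
        self.w_rec = random_matrix(hidden_size, hidden_size, rng)    # hidden -> hidden
        self.w_out = random_matrix(vocab_size, hidden_size, rng)     # hidden -> output
        self.hidden = [0.0] * hidden_size

    def step(self, input_vec, feature_vec):
        """Consume one input vector and feature vector; return an output vector."""
        pre_activation = [
            a + b + c
            for a, b, c in zip(
                matvec(self.w_in, input_vec),
                matvec(self.w_feat, feature_vec),
                matvec(self.w_rec, self.hidden),
            )
        ]
        self.hidden = [math.tanh(p) for p in pre_activation]
        return softmax(matvec(self.w_out, self.hidden))


# Feed a short word-id sequence with a fixed (assumed) supplemental feature vector.
net = FeatureAugmentedRNN(vocab_size=5, feature_size=3, hidden_size=4)
features = [0.2, 0.5, 0.3]
for word_id in [1, 3, 0]:
    output = net.step(one_hot(word_id, 5), features)
print(output)
```

Here the feature vector is simply added into the hidden layer alongside the input and the previous hidden state; the abstract does not say how the two vectors are combined, so other arrangements are equally consistent with it.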