Patents by Inventor Alejandro Acero

Alejandro Acero has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11862151
    Abstract: Systems and processes for operating a digital assistant are provided. In an example process, low-latency operation of a digital assistant is provided. In this example, natural language processing, task flow processing, dialogue flow processing, speech synthesis, or any combination thereof can be at least partially performed while awaiting detection of a speech end-point condition. Upon detection of a speech end-point condition, results obtained from performing the operations can be presented to the user. In another example, robust operation of a digital assistant is provided. In this example, task flow processing by the digital assistant can include selecting a candidate task flow from a plurality of candidate task flows based on determined task flow scores. The task flow scores can be based on speech recognition confidence scores, intent confidence scores, flow parameter scores, or any combination thereof. The selected candidate task flow is executed and corresponding results presented to the user.
    Type: Grant
    Filed: November 16, 2022
    Date of Patent: January 2, 2024
    Assignee: Apple Inc.
    Inventors: Alejandro Acero, Hepeng Zhang
  • Publication number: 20230072481
    Abstract: Systems and processes for operating a digital assistant are provided. In an example process, low-latency operation of a digital assistant is provided. In this example, natural language processing, task flow processing, dialogue flow processing, speech synthesis, or any combination thereof can be at least partially performed while awaiting detection of a speech end-point condition. Upon detection of a speech end-point condition, results obtained from performing the operations can be presented to the user. In another example, robust operation of a digital assistant is provided. In this example, task flow processing by the digital assistant can include selecting a candidate task flow from a plurality of candidate task flows based on determined task flow scores. The task flow scores can be based on speech recognition confidence scores, intent confidence scores, flow parameter scores, or any combination thereof. The selected candidate task flow is executed and corresponding results presented to the user.
    Type: Application
    Filed: November 16, 2022
    Publication date: March 9, 2023
    Inventors: Alejandro Acero, Hepeng Zhang
  • Patent number: 11538469
    Abstract: Systems and processes for operating a digital assistant are provided. In an example process, low-latency operation of a digital assistant is provided. In this example, natural language processing, task flow processing, dialogue flow processing, speech synthesis, or any combination thereof can be at least partially performed while awaiting detection of a speech end-point condition. Upon detection of a speech end-point condition, results obtained from performing the operations can be presented to the user. In another example, robust operation of a digital assistant is provided. In this example, task flow processing by the digital assistant can include selecting a candidate task flow from a plurality of candidate task flows based on determined task flow scores. The task flow scores can be based on speech recognition confidence scores, intent confidence scores, flow parameter scores, or any combination thereof. The selected candidate task flow is executed and corresponding results presented to the user.
    Type: Grant
    Filed: April 27, 2022
    Date of Patent: December 27, 2022
    Assignee: Apple Inc.
    Inventors: Alejandro Acero, Hepeng Zhang
  • Publication number: 20220254339
    Abstract: Systems and processes for operating a digital assistant are provided. In an example process, low-latency operation of a digital assistant is provided. In this example, natural language processing, task flow processing, dialogue flow processing, speech synthesis, or any combination thereof can be at least partially performed while awaiting detection of a speech end-point condition. Upon detection of a speech end-point condition, results obtained from performing the operations can be presented to the user. In another example, robust operation of a digital assistant is provided. In this example, task flow processing by the digital assistant can include selecting a candidate task flow from a plurality of candidate task flows based on determined task flow scores. The task flow scores can be based on speech recognition confidence scores, intent confidence scores, flow parameter scores, or any combination thereof. The selected candidate task flow is executed and corresponding results presented to the user.
    Type: Application
    Filed: April 27, 2022
    Publication date: August 11, 2022
    Inventors: Alejandro Acero, Hepeng Zhang
  • Patent number: 11380310
    Abstract: Systems and processes for operating a digital assistant are provided. In an example process, low-latency operation of a digital assistant is provided. In this example, natural language processing, task flow processing, dialogue flow processing, speech synthesis, or any combination thereof can be at least partially performed while awaiting detection of a speech end-point condition. Upon detection of a speech end-point condition, results obtained from performing the operations can be presented to the user. In another example, robust operation of a digital assistant is provided. In this example, task flow processing by the digital assistant can include selecting a candidate task flow from a plurality of candidate task flows based on determined task flow scores. The task flow scores can be based on speech recognition confidence scores, intent confidence scores, flow parameter scores, or any combination thereof. The selected candidate task flow is executed and corresponding results presented to the user.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: July 5, 2022
    Assignee: Apple Inc.
    Inventors: Alejandro Acero, Hepeng Zhang
  • Publication number: 20200380966
    Abstract: Systems and processes for operating a digital assistant are provided. In an example process, low-latency operation of a digital assistant is provided. In this example, natural language processing, task flow processing, dialogue flow processing, speech synthesis, or any combination thereof can be at least partially performed while awaiting detection of a speech end-point condition. Upon detection of a speech end-point condition, results obtained from performing the operations can be presented to the user. In another example, robust operation of a digital assistant is provided. In this example, task flow processing by the digital assistant can include selecting a candidate task flow from a plurality of candidate task flows based on determined task flow scores. The task flow scores can be based on speech recognition confidence scores, intent confidence scores, flow parameter scores, or any combination thereof. The selected candidate task flow is executed and corresponding results presented to the user.
    Type: Application
    Filed: August 20, 2020
    Publication date: December 3, 2020
    Inventors: Alejandro Acero, Hepeng Zhang
  • Patent number: 10789945
    Abstract: Systems and processes for operating a digital assistant are provided. In an example process, low-latency operation of a digital assistant is provided. In this example, natural language processing, task flow processing, dialogue flow processing, speech synthesis, or any combination thereof can be at least partially performed while awaiting detection of a speech end-point condition. Upon detection of a speech end-point condition, results obtained from performing the operations can be presented to the user. In another example, robust operation of a digital assistant is provided. In this example, task flow processing by the digital assistant can include selecting a candidate task flow from a plurality of candidate task flows based on determined task flow scores. The task flow scores can be based on speech recognition confidence scores, intent confidence scores, flow parameter scores, or any combination thereof. The selected candidate task flow is executed and corresponding results presented to the user.
    Type: Grant
    Filed: August 17, 2017
    Date of Patent: September 29, 2020
    Assignee: Apple Inc.
    Inventors: Alejandro Acero, Hepeng Zhang
  • Publication number: 20180330723
    Abstract: Systems and processes for operating a digital assistant are provided. In an example process, low-latency operation of a digital assistant is provided. In this example, natural language processing, task flow processing, dialogue flow processing, speech synthesis, or any combination thereof can be at least partially performed while awaiting detection of a speech end-point condition. Upon detection of a speech end-point condition, results obtained from performing the operations can be presented to the user. In another example, robust operation of a digital assistant is provided. In this example, task flow processing by the digital assistant can include selecting a candidate task flow from a plurality of candidate task flows based on determined task flow scores. The task flow scores can be based on speech recognition confidence scores, intent confidence scores, flow parameter scores, or any combination thereof. The selected candidate task flow is executed and corresponding results presented to the user.
    Type: Application
    Filed: August 17, 2017
    Publication date: November 15, 2018
    Inventors: Alejandro Acero, Hepeng Zhang
  • Patent number: 10055686
    Abstract: A deep structured semantic module (DSSM) is described herein which uses a model that is discriminatively trained based on click-through data, e.g., such that a conditional likelihood of clicked documents, given respective queries, is maximized, and a conditional likelihood of non-clicked documents, given the queries, is reduced. In operation, after training is complete, the DSSM maps an input item into an output item expressed in a semantic space, using the trained model. To facilitate training and runtime operation, a dimensionality-reduction module (DRM) can reduce the dimensionality of the input item that is fed to the DSSM. A search engine may use the above-summarized functionality to convert a query and a plurality of documents into the common semantic space, and then determine the similarity between the query and documents in the semantic space. The search engine may then rank the documents based, at least in part, on the similarity measures.
    Type: Grant
    Filed: July 12, 2016
    Date of Patent: August 21, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alejandro Acero, Larry P. Heck
  • Patent number: 9984678
    Abstract: Various technologies described herein pertain to adapting a speech recognizer to input speech data. A first linear transform can be selected from a first set of linear transforms based on a value of a first variability source corresponding to the input speech data, and a second linear transform can be selected from a second set of linear transforms based on a value of a second variability source corresponding to the input speech data. The linear transforms in the first and second sets can compensate for the first variability source and the second variability source, respectively. Moreover, the first linear transform can be applied to the input speech data to generate intermediate transformed speech data, and the second linear transform can be applied to the intermediate transformed speech data to generate transformed speech data. Further, speech can be recognized based on the transformed speech data to obtain a result.
    Type: Grant
    Filed: March 23, 2012
    Date of Patent: May 29, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Lewis Seltzer, Alejandro Acero
  • Patent number: 9928296
    Abstract: One or more techniques and/or systems are disclosed for creating an expanded or improved lexicon for use in search-based semantic tagging. A set of first documents can be identified using a set of first lexicon elements as queries, and one or more first document patterns can be extracted from the set of first documents. The document patterns can be used to find one or more second documents in a query log that comprise the document patterns, which are associated with query terms used to return the second documents. The query terms for the second documents can be extracted and used to expand the lexicon. Elements within the lexicon may be weighted based upon relevance to different query domains, for example.
    Type: Grant
    Filed: December 16, 2010
    Date of Patent: March 27, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xiao Li, Jingjing Liu, Alejandro Acero, Ye-Yi Wang
  • Patent number: 9786284
    Abstract: This document describes various techniques for dual-band speech encoding. In some embodiments, a first type of speech feature is received from a remote entity, an estimate of a second type of speech feature is determined based on the first type of speech feature, the estimate of the second type of speech feature is provided to a speech recognizer, speech-recognition results based on the estimate of the second type of speech feature are received from the speech recognizer, and the speech-recognition results are transmitted to the remote entity.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: October 10, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Alejandro Acero, James G. Droppo, III, Michael L. Seltzer
  • Patent number: 9684741
    Abstract: A query may be applied against search engines that respectively return a set of search results relating to various items discovered in the searched data sets. However, presenting numerous and varied search results may be difficult on mobile devices with small displays and limited computational resources. Instead, search results may be associated with search domains representing various information types (e.g., contacts, public figures, places, projects, movies, music, and books) and presented by grouping search results with associated query domains, e.g., in a tabbed user interface. The query may be received through an input device associated with a particular input domain, and may be transitioned to the query domain of a particular search engine (e.g., by recognizing phonemes of a voice query using an acoustic model; matching phonemes with query terms according to a pronunciation model; and generating a recognition result according to a vocabulary of an n-gram language model).
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: June 20, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xiao Li, Patrick Nguyen, Geoffrey Zweig, Alejandro Acero
  • Patent number: 9519859
    Abstract: A deep structured semantic module (DSSM) is described herein which uses a model that is discriminatively trained based on click-through data, e.g., such that a conditional likelihood of clicked documents, given respective queries, is maximized, and a conditional likelihood of non-clicked documents, given the queries, is reduced. In operation, after training is complete, the DSSM maps an input item into an output item expressed in a semantic space, using the trained model. To facilitate training and runtime operation, a dimensionality-reduction module (DRM) can reduce the dimensionality of the input item that is fed to the DSSM. A search engine may use the above-summarized functionality to convert a query and a plurality of documents into the common semantic space, and then determine the similarity between the query and documents in the semantic space. The search engine may then rank the documents based, at least in part, on the similarity measures.
    Type: Grant
    Filed: September 6, 2013
    Date of Patent: December 13, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alejandro Acero, Larry P. Heck
  • Publication number: 20160321321
    Abstract: A deep structured semantic module (DSSM) is described herein which uses a model that is discriminatively trained based on click-through data, e.g., such that a conditional likelihood of clicked documents, given respective queries, is maximized, and a conditional likelihood of non-clicked documents, given the queries, is reduced. In operation, after training is complete, the DSSM maps an input item into an output item expressed in a semantic space, using the trained model. To facilitate training and runtime operation, a dimensionality-reduction module (DRM) can reduce the dimensionality of the input item that is fed to the DSSM. A search engine may use the above-summarized functionality to convert a query and a plurality of documents into the common semantic space, and then determine the similarity between the query and documents in the semantic space. The search engine may then rank the documents based, at least in part, on the similarity measures.
    Type: Application
    Filed: July 12, 2016
    Publication date: November 3, 2016
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alejandro Acero, Larry P. Heck
  • Patent number: 9390371
    Abstract: A method is disclosed herein that includes an act of causing a processor to access a deep-structured, layered or hierarchical model, called a deep convex network, retained in a computer-readable medium, wherein the deep-structured model comprises a plurality of layers with weights assigned thereto. This layered model can produce the output serving as the scores to combine with transition probabilities between states in a hidden Markov model and language model scores to form a full speech recognizer. Batch-based, convex optimization is performed to learn a portion of the deep convex network's weights, rendering it appropriate for parallel computation to accomplish the training. The method can further include the act of jointly substantially optimizing the weights, the transition probabilities, and the language model scores of the deep-structured model using the optimization criterion based on a sequence rather than a set of unrelated frames.
    Type: Grant
    Filed: June 17, 2013
    Date of Patent: July 12, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Li Deng, Dong Yu, Alejandro Acero
  • Patent number: 9264807
    Abstract: A multichannel acoustic echo reduction system is described herein. The system includes an acoustic echo canceller (AEC) component having a fixed filter for each respective combination of loudspeaker and microphone signals and having an adaptive filter for each microphone signal. For each microphone signal, the AEC component modifies the microphone signal to reduce contributions from the outputs of the loudspeakers based at least in part on the respective adaptive filter associated with the microphone signal and the set of fixed filters associated with the respective microphone signal.
    Type: Grant
    Filed: January 23, 2013
    Date of Patent: February 16, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ivan Jelev Tashev, Alejandro Acero, Nilesh Madhu
  • Patent number: 9218412
    Abstract: A database having listings rather than long documents is searched using a term frequency-inverse document frequency (Tf/Idf) algorithm.
    Type: Grant
    Filed: May 10, 2007
    Date of Patent: December 22, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alejandro Acero, Geoffrey G. Zweig
  • Patent number: 9054764
    Abstract: A novel beamforming post-processor technique with enhanced noise suppression capability. The present beamforming post-processor technique is a non-linear post-processing technique for sensor arrays (e.g., microphone arrays) which improves the directivity and signal separation capabilities. The technique works in so-called instantaneous direction of arrival space, estimates the probability for sound coming from a given incident angle or look-up direction and applies a time-varying, gain based, spatio-temporal filter for suppressing sounds coming from directions other than the sound source direction, resulting in minimal artifacts and musical noise.
    Type: Grant
    Filed: July 20, 2011
    Date of Patent: June 9, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Ivan Tashev, Alejandro Acero
  • Patent number: 9009039
    Abstract: Technologies are described herein for noise adaptive training to achieve robust automatic speech recognition. Through the use of these technologies, a noise adaptive training (NAT) approach may use both clean and corrupted speech for training. The NAT approach may normalize the environmental distortion as part of the model training. A set of underlying “pseudo-clean” model parameters may be estimated directly. This may be done without point estimation of clean speech features as an intermediate step. The pseudo-clean model parameters learned from the NAT technique may be used with a Vector Taylor Series (VTS) adaptation. Such adaptation may support decoding noisy utterances during the operating phase of an automatic voice recognition system.
    Type: Grant
    Filed: June 12, 2009
    Date of Patent: April 14, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Lewis Seltzer, James Garnet Droppo, Ozlem Kalinli, Alejandro Acero
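
Illustrative sketch for patent 9218412 (Tf/Idf search over listings): the abstract for that entry only states that a database of short listings is searched with a term frequency-inverse document frequency algorithm. The Python below is a generic textbook TF-IDF ranking with cosine similarity, not the method claimed in the patent. The function names, the whitespace tokenizer, the smoothed IDF formula, and the sample listings are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for short listings.
    return text.lower().split()


def build_index(listings):
    """Return (idf, vectors) for a list of short listing strings."""
    docs = [Counter(tokenize(listing)) for listing in listings]
    n = len(docs)
    # Document frequency: number of listings that contain each term.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    # Smoothed IDF (an assumed variant) so that no term gets zero weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = [{term: tf * idf[term] for term, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    # Cosine similarity between two sparse term-weight dictionaries.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, listings, idf, vectors):
    """Rank listings by TF-IDF cosine similarity to the query, best first."""
    counts = Counter(tokenize(query))
    # Query terms that never occur in any listing are ignored.
    query_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = [(cosine(query_vec, vec), listing) for vec, listing in zip(vectors, listings)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical listings, invented for this example.
    sample = ["Pine Street Pizza", "Pine Street Dental Clinic", "Lakeside Pizza and Grill"]
    idf, vectors = build_index(sample)
    for score, listing in search("lakeside pizza", sample, idf, vectors):
        print(f"{score:.3f}  {listing}")
```

Because listings are only a few words long, each term occurs at most once or twice per listing, so the ranking in this sketch is driven mostly by the IDF weights and by listing length (through the cosine normalization) rather than by raw term frequency.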