Patents by Inventor Ágoston Weisz
Ágoston Weisz has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12266358
Abstract: Implementations perform, independent of any explicit assistant invocation input(s), automatic speech recognition (ASR) on audio data, that is detected via microphone(s) of an assistant device, to generate ASR text that predicts a spoken utterance that is captured in the audio data. The ASR text is processed and candidate automated assistant action(s) that correspond to the command, if any, are generated. For each of any candidate automated assistant action(s), it is determined whether to (a) cause automatic performance of the automated assistant action responsive to the spoken utterance or, instead, (b) suppress any automatic performance of the automated assistant action responsive to the spoken utterance. Such determination can be made based on processing both (i) action feature(s) for the candidate automated assistant action; and (ii) environment feature(s) that each reflects a corresponding current value for a corresponding dynamic state of an environment of the assistant device.
Type: Grant
Filed: September 1, 2022
Date of Patent: April 1, 2025
Assignee: GOOGLE LLC
Inventors: Konrad Miller, Ágoston Weisz, Herbert Jordan
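The perform-or-suppress decision described in this abstract can be sketched as a rule over the two feature groups. This is an illustrative sketch only: the feature names, thresholds, and rules below are assumptions for clarity, not details taken from the patent, which may use learned models rather than fixed rules.

```python
from dataclasses import dataclass

@dataclass
class ActionFeatures:
    is_destructive: bool   # e.g. the action would delete or change user data
    asr_confidence: float  # confidence of the recognized command

@dataclass
class EnvironmentFeatures:
    people_present: int    # dynamic state of the device's environment
    media_playing: bool

def should_auto_perform(action: ActionFeatures, env: EnvironmentFeatures) -> bool:
    """Decide whether to automatically perform a candidate assistant action
    or suppress it, based on both action and environment features."""
    if action.is_destructive:
        return False  # never auto-perform risky actions without invocation
    if action.asr_confidence < 0.8:
        return False  # recognition too uncertain
    if env.people_present > 1 and env.media_playing:
        return False  # the utterance may not be directed at the assistant
    return True
```

A usage example: a confident, harmless command heard by a lone user would auto-perform, while the same command recognized during a multi-person conversation with media playing would be suppressed.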
-
Publication number: 20250061146
Abstract: Implementations utilize an LLM to respond to queries comprising image data, such as multimodal queries that include both text and image data. A natural language processing system is extended such that when an image is provided, the natural language processing system invokes one or more auxiliary image processing models (e.g., visual query) and/or image search engines. The results, of invoking such model(s) and/or search engine(s), are collected into structured data signals related to the image. These signals form part of the conversation context and are used to extend the text prompt that is sent to the LLM. This allows the LLM to take the context into account when being used to process the user query, thereby enabling generation of an LLM reply that addresses relevant feature(s) of the image.
Type: Application
Filed: August 13, 2024
Publication date: February 20, 2025
Inventors: Olivier Siegenthaler, Ágoston Weisz, Boris Bluntschli, Dan Banica, Kaan Ege Özgün, Daniel Mogoreanu, Filip Sladek
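The prompt-extension step described here can be sketched as folding the structured image signals into the text prompt before it reaches the LLM. The function name, signal keys, and prompt layout below are hypothetical illustrations, not the patent's actual format.

```python
def build_prompt(user_text: str, image_signals: dict) -> str:
    """Extend a text prompt with structured signals derived from an attached
    image (e.g. auxiliary image-model outputs or image-search results), so a
    text-based LLM can answer a multimodal query."""
    lines = ["[image context]"]
    for key, value in sorted(image_signals.items()):
        lines.append(f"{key}: {value}")
    lines.append("[user query]")
    lines.append(user_text)
    return "\n".join(lines)

prompt = build_prompt(
    "What breed is this dog?",
    {"detected_objects": "dog, park bench", "web_match": "golden retriever"},
)
```

With the image signals inlined as context, the LLM can reference features of the image (here, the likely breed) even though it only consumes text.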
-
Publication number: 20250061889
Abstract: A method includes receiving audio data corresponding to a query spoken by a user and processing the audio data to generate multiple candidate hypotheses, each represented by a respective sequence of hypothesized terms. For each candidate hypothesis, the method includes determining whether the sequence of hypothesized terms includes a source phrase from a list of phrase correction pairs. Each phrase correction pair includes a corresponding source phrase that was misrecognized and a corresponding target phrase replacing the source phrase. When the respective sequence of hypothesized terms includes the source phrase, the method includes generating a corresponding additional candidate hypothesis that replaces the source phrase with the corresponding target phrase.
Type: Application
Filed: November 1, 2024
Publication date: February 20, 2025
Applicant: Google LLC
Inventors: Ágoston Weisz, Leonid Velikovich
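The hypothesis-expansion step in this abstract can be sketched directly: for each ASR hypothesis containing a previously misrecognized source phrase, emit an additional hypothesis with the target phrase substituted. The function and data shapes below are illustrative assumptions.

```python
def expand_hypotheses(hypotheses, correction_pairs):
    """Given ASR hypotheses (lists of terms) and phrase correction pairs
    (source-phrase tuple, target-phrase tuple), return the original
    hypotheses plus one additional hypothesis per source-phrase match,
    with the source phrase replaced by the target phrase."""
    extra = []
    for terms in hypotheses:
        for source, target in correction_pairs:
            n = len(source)
            for i in range(len(terms) - n + 1):
                if tuple(terms[i:i + n]) == source:
                    extra.append(terms[:i] + list(target) + terms[i + n:])
    return hypotheses + extra

hyps = [["call", "ann", "smith"]]
pairs = [(("ann",), ("anne",))]
# expand_hypotheses(hyps, pairs) also yields ["call", "anne", "smith"]
```

The expanded set can then be rescored as usual, so the corrected phrasing competes with the raw recognition result rather than silently overwriting it.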
-
Publication number: 20250053751
Abstract: Implementations relate to generating multi-modal response(s) through utilization of large language model(s) (LLM(s)). Processor(s) of a system can: receive natural language (NL) based input, generate a multi-modal response that is responsive to the NL based input, and cause the multi-modal response to be rendered. In some implementations, and in generating the multi-modal response, the processor(s) can process, using a LLM, LLM input (e.g., that includes at least the NL based input) to generate LLM output, and determine, based on the LLM output, textual content for inclusion in the multi-modal response and multimedia content for inclusion in the multi-modal response. In some implementations, the multimedia content can be obtained based on a multimedia content tag that is included in the LLM output and that is indicative of the multimedia content. In various implementations, the multimedia content can be interleaved between segments of the textual content.
Type: Application
Filed: January 16, 2024
Publication date: February 13, 2025
Inventors: Oscar Akerlund, Evgeny Sluzhaev, Golnaz Ghiasi, Thang Luong, Yifeng Lu, Igor Petrovski, Agoston Weisz, Wei Yu, Rakesh Shivanna, Michael Andrew Goodman, Apoorv Kulshreshtha, Yu Du, Amin Ghafouri, Sanil Jain, Dustin Tran, Vikas Peswani, YaGuang Li
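The tag-based interleaving described here can be sketched as splitting the LLM output on multimedia content tags and resolving each tag into media. The tag syntax (`<media:...>`) and fetch callback below are assumptions for illustration; the patent does not specify this format.

```python
import re

TAG = re.compile(r"<media:(?P<query>[^>]+)>")

def render(llm_output: str, fetch_media):
    """Split LLM output on media tags and replace each tag with content
    obtained via the tag's query, interleaving media between text segments."""
    parts = []
    pos = 0
    for m in TAG.finditer(llm_output):
        if m.start() > pos:
            parts.append(("text", llm_output[pos:m.start()]))
        parts.append(("media", fetch_media(m.group("query"))))
        pos = m.end()
    if pos < len(llm_output):
        parts.append(("text", llm_output[pos:]))
    return parts

out = render(
    "Here is the Eiffel Tower: <media:eiffel tower photo> Built in 1889.",
    lambda q: f"[image for '{q}']",
)
```

The resulting list of `("text", ...)` and `("media", ...)` segments preserves the ordering the LLM chose, so multimedia lands between the text segments it belongs with.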
-
Publication number: 20250046296
Abstract: A method, device, and computer-readable storage medium for predicting pronunciation of a text sample. The method includes selecting a predicted text sample corresponding to an audio sample, receiving a correction text sample corresponding to the audio sample, updating an encoding of allowable pronunciations of the correction text sample based on the predicted text sample and the audio sample, the updated encoding of allowable pronunciations of the correction text sample including a pronunciation of the predicted text sample, and predicting a pronunciation of the correction text sample based on the updated encoding of allowable pronunciations of the correction text sample.
Type: Application
Filed: July 31, 2023
Publication date: February 6, 2025
Applicant: GOOGLE LLC
Inventors: Leonid Velikovich, Ágoston Weisz
-
Publication number: 20240427997
Abstract: A method includes obtaining a set of training queries that each specify a corresponding operation to perform and include a corresponding plurality of speech recognition hypotheses that each represent a corresponding candidate transcription of the training query, and a corresponding ground-truth transcription of the training query. For each training query, the method includes processing, using an encoder of a neural semantic parsing (NSP) model, the corresponding plurality of speech recognition hypotheses to generate a corresponding NSP embedding, processing, using a transcription decoder, the corresponding NSP embedding to generate a corresponding predicted transcription, and determining a corresponding first loss based on the corresponding predicted transcription and the corresponding ground-truth transcription.
Type: Application
Filed: June 20, 2023
Publication date: December 26, 2024
Applicant: Google LLC
Inventors: Khalid Salama, Ágoston Weisz
-
Patent number: 12165641
Abstract: A method includes receiving follow-on audio data captured by an assistant-enabled device, the follow-on audio data corresponding to a follow-on query spoken by a user of the assistant-enabled device to a digital assistant subsequent to the user submitting a previous query to the digital assistant. The method also includes processing, using a speech recognizer, the follow-on audio data to generate multiple candidate hypotheses, each candidate hypothesis corresponding to a candidate transcription for the follow-on query and represented by a respective sequence of hypothesized terms. For each corresponding candidate hypothesis among the multiple candidate hypotheses, the method also includes determining a corresponding similarity metric between the previous query and the corresponding candidate hypothesis and determining a transcription of the follow-on query spoken by the user based on the similarity metrics determined for the multiple candidate hypotheses.
Type: Grant
Filed: July 11, 2022
Date of Patent: December 10, 2024
Assignee: Google LLC
Inventors: Patrick Siegler, Aurélien Boffy, Ágoston Weisz
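The rescoring idea in this abstract (prefer the follow-on transcription most similar to the previous query) can be sketched with a simple similarity metric. Jaccard overlap on terms is an illustrative stand-in here, not necessarily the metric the patent uses, and in practice the similarity would be combined with the recognizer's own scores rather than used alone.

```python
def jaccard(a: str, b: str) -> float:
    """Term-set overlap between two queries, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def pick_transcription(previous_query: str, candidates: list[str]) -> str:
    """Choose the candidate hypothesis most similar to the previous query."""
    return max(candidates, key=lambda c: jaccard(previous_query, c))

best = pick_transcription(
    "weather in Vienna today",
    ["whether in piano tomorrow", "weather in Vienna tomorrow"],
)
```

Follow-on queries tend to reuse terms from the preceding turn, so the similarity signal helps the recognizer recover from acoustically plausible but contextually unlikely hypotheses.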
-
Patent number: 12165628
Abstract: Techniques are disclosed that enable determining and/or utilizing a misrecognition of a spoken utterance, where the misrecognition is generated using an automatic speech recognition (ASR) model. Various implementations include determining a recognition based on the spoken utterance and a previous utterance spoken prior to the spoken utterance. Additionally or alternatively, implementations include personalizing an ASR engine for a user based on the spoken utterance and the previous utterance spoken prior to the spoken utterance (e.g., based on audio data capturing the previous utterance and a text representation of the spoken utterance).
Type: Grant
Filed: July 8, 2020
Date of Patent: December 10, 2024
Assignee: GOOGLE LLC
Inventors: Ágoston Weisz, Ignacio Lopez Moreno, Alexandru Dovlecel
-
Patent number: 12154549
Abstract: A method includes receiving audio data corresponding to a query spoken by a user and processing the audio data to generate multiple candidate hypotheses, each represented by a respective sequence of hypothesized terms. For each candidate hypothesis, the method includes determining whether the sequence of hypothesized terms includes a source phrase from a list of phrase correction pairs. Each phrase correction pair includes a corresponding source phrase that was misrecognized and a corresponding target phrase replacing the source phrase. When the respective sequence of hypothesized terms includes the source phrase, the method includes generating a corresponding additional candidate hypothesis that replaces the source phrase with the corresponding target phrase.
Type: Grant
Filed: December 15, 2021
Date of Patent: November 26, 2024
Assignee: Google LLC
Inventors: Ágoston Weisz, Leonid Velikovich
-
Patent number: 12086504
Abstract: Techniques enable an automatic adjustment of a muted response setting of an automated assistant based on a determination of an expectation by a user to hear an audible response to their query, despite the muted setting. Determination of the expectation may be based on historical, empirical data uploaded from multiple users over time for a given response scenario. For example, the system may determine from the historical data that a certain type of query has been associated with a user both repeating their query and increasing a response volume setting within a given timeframe. Metrics may be generated, stored, and invoked in response to attributes associated with identifiable types of queries and query scenarios. Automated response characteristics meant to reduce inefficiencies may be associated with certain queries that can otherwise collectively burden network bandwidth and processing resources.
Type: Grant
Filed: September 18, 2023
Date of Patent: September 10, 2024
Assignee: GOOGLE LLC
Inventors: Michael Schaer, Vitaly Gatsko, Ágoston Weisz
-
Publication number: 20240257799
Abstract: A method includes receiving a biased transcription for a voice command spoken by a user and captured by a user device, the biased transcription biased to include a biasing phrase from a set of biasing phrases specific to the user. The method also includes instructing an application executing on the user device to perform an action specified by the biased transcription for the voice command, and receiving one or more user behavior signals responsive to the application performing the action specified by the biased transcription. The method further includes generating, as output from a confidence model, a confidence score of the biased transcription based on the one or more user behavior signals input to the confidence model and, based on the confidence score output from the confidence model, training a speech recognizer on the biased transcription.
Type: Application
Filed: January 30, 2023
Publication date: August 1, 2024
Applicant: Google LLC
Inventors: Dragan Zivkovic, Agoston Weisz
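The feedback loop in this abstract (behavior signals after the action → confidence score → decision whether to train on the transcription) can be sketched as follows. The signal names, weights, and threshold are illustrative assumptions; the patent describes a trained confidence model rather than a hand-weighted rule.

```python
def confidence_score(signals: dict) -> float:
    """Map user-behavior signals observed after acting on a biased
    transcription to a confidence score in [0, 1]."""
    score = 0.5
    if signals.get("user_engaged_with_result"):
        score += 0.4  # user accepted the outcome: likely correct
    if signals.get("user_undid_action"):
        score -= 0.4  # user reversed the action: likely wrong
    if signals.get("user_repeated_query"):
        score -= 0.3  # user re-asked: the transcription probably missed
    return max(0.0, min(1.0, score))

def should_train_on(transcription: str, signals: dict,
                    threshold: float = 0.7) -> bool:
    """Only keep high-confidence biased transcriptions as training data."""
    return confidence_score(signals) >= threshold
```

Gating the training data this way keeps user-specific biasing phrases from reinforcing their own recognition mistakes.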
-
Publication number: 20240194188
Abstract: A method of using voice query history to improve speech recognition includes receiving audio data corresponding to a current query spoken by a user and processing the audio data to generate a lattice of candidate hypotheses. The method also includes obtaining voice query history data associated with the user that includes n-grams extracted from transcriptions of previous queries spoken by the user, and generating, using a biasing context model configured to receive the voice query history data, a biasing context vector. The biasing context vector indicates a likelihood that each n-gram from the n-grams extracted from the transcriptions of the previous queries spoken by the user will appear in the current query. The method also includes augmenting the lattice of candidate hypotheses based on the biasing context vector and determining a transcription for the current query based on the augmented lattice of candidate hypotheses.
Type: Application
Filed: December 8, 2022
Publication date: June 13, 2024
Inventors: Agoston Weisz, Mikhail Dektiarev
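The history-based biasing described here can be sketched as boosting candidate hypotheses that contain n-grams the user has spoken before. The count-and-bonus scoring below is an illustrative simplification; the patent describes a trained biasing context model producing likelihoods, not raw counts.

```python
from collections import Counter

def history_ngrams(previous_queries, n=2):
    """Extract n-gram counts from transcriptions of previous queries."""
    grams = Counter()
    for q in previous_queries:
        terms = q.lower().split()
        for i in range(len(terms) - n + 1):
            grams[tuple(terms[i:i + n])] += 1
    return grams

def biased_score(base_score, hypothesis, grams, weight=0.1, n=2):
    """Augment a hypothesis score with a bonus for n-grams seen in the
    user's query history."""
    terms = hypothesis.lower().split()
    bonus = sum(grams[tuple(terms[i:i + n])]
                for i in range(len(terms) - n + 1))
    return base_score + weight * bonus

grams = history_ngrams(["play jazz music", "play jazz radio"])
```

Applied across a lattice, the bonus nudges recognition toward phrasings the user actually uses, e.g. favoring "play jazz now" over an acoustically similar but historically unseen alternative.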
-
Patent number: 11947923
Abstract: Implementations relate to managing multimedia content that is obtained by large language model(s) (LLM(s)) and/or generated by other generative model(s). Processor(s) of a system can: receive natural language (NL) based input that requests multimedia content, generate a response that is responsive to the NL based input, and cause the response to be rendered. In some implementations, and in generating the response, the processor(s) can process, using a LLM, LLM input to generate LLM output, and determine, based on the LLM output, at least multimedia content to be included in the response. Further, the processor(s) can evaluate the multimedia content to determine whether it should be included in the response. In response to determining that the multimedia content should not be included in the response, the processor(s) can cause the response, including alternative multimedia content or other textual content, to be rendered.
Type: Grant
Filed: November 27, 2023
Date of Patent: April 2, 2024
Assignee: GOOGLE LLC
Inventors: Sanil Jain, Wei Yu, Ágoston Weisz, Michael Andrew Goodman, Diana Avram, Amin Ghafouri, Golnaz Ghiasi, Igor Petrovski, Khyatti Gupta, Oscar Akerlund, Evgeny Sluzhaev, Rakesh Shivanna, Thang Luong, Komal Singh, Yifeng Lu, Vikas Peswani
-
Patent number: 11907674
Abstract: Implementations relate to generating multi-modal response(s) through utilization of large language model(s) (LLM(s)). Processor(s) of a system can: receive natural language (NL) based input, generate a multi-modal response that is responsive to the NL based input, and cause the multi-modal response to be rendered. In some implementations, and in generating the multi-modal response, the processor(s) can process, using a LLM, LLM input (e.g., that includes at least the NL based input) to generate LLM output, and determine, based on the LLM output, textual content for inclusion in the multi-modal response and multimedia content for inclusion in the multi-modal response. In some implementations, the multimedia content can be obtained based on a multimedia content tag that is included in the LLM output and that is indicative of the multimedia content. In various implementations, the multimedia content can be interleaved between segments of the textual content.
Type: Grant
Filed: September 20, 2023
Date of Patent: February 20, 2024
Assignee: GOOGLE LLC
Inventors: Oscar Akerlund, Evgeny Sluzhaev, Golnaz Ghiasi, Thang Luong, Yifeng Lu, Igor Petrovski, Ágoston Weisz, Wei Yu, Rakesh Shivanna, Michael Andrew Goodman, Apoorv Kulshreshtha, Yu Du, Amin Ghafouri, Sanil Jain, Dustin Tran, Vikas Peswani, YaGuang Li
-
Publication number: 20240046925
Abstract: Implementations perform, independent of any explicit assistant invocation input(s), automatic speech recognition (ASR) on audio data, that is detected via microphone(s) of an assistant device, to generate ASR text that predicts a spoken utterance that is captured in the audio data. The ASR text is processed and candidate automated assistant action(s) that correspond to the command, if any, are generated. For each of any candidate automated assistant action(s), it is determined whether to (a) cause automatic performance of the automated assistant action responsive to the spoken utterance or, instead, (b) suppress any automatic performance of the automated assistant action responsive to the spoken utterance. Such determination can be made based on processing both (i) action feature(s) for the candidate automated assistant action; and (ii) environment feature(s) that each reflects a corresponding current value for a corresponding dynamic state of an environment of the assistant device.
Type: Application
Filed: September 1, 2022
Publication date: February 8, 2024
Inventors: Konrad Miller, Ágoston Weisz, Herbert Jordan
-
Publication number: 20240013782
Abstract: A method includes receiving follow-on audio data captured by an assistant-enabled device, the follow-on audio data corresponding to a follow-on query spoken by a user of the assistant-enabled device to a digital assistant subsequent to the user submitting a previous query to the digital assistant. The method also includes processing, using a speech recognizer, the follow-on audio data to generate multiple candidate hypotheses, each candidate hypothesis corresponding to a candidate transcription for the follow-on query and represented by a respective sequence of hypothesized terms. For each corresponding candidate hypothesis among the multiple candidate hypotheses, the method also includes determining a corresponding similarity metric between the previous query and the corresponding candidate hypothesis and determining a transcription of the follow-on query spoken by the user based on the similarity metrics determined for the multiple candidate hypotheses.
Type: Application
Filed: July 11, 2022
Publication date: January 11, 2024
Applicant: Google LLC
Inventors: Patrick Siegler, Aurélien Boffy, Ágoston Weisz
-
Publication number: 20240004608
Abstract: Techniques enable an automatic adjustment of a muted response setting of an automated assistant based on a determination of an expectation by a user to hear an audible response to their query, despite the muted setting. Determination of the expectation may be based on historical, empirical data uploaded from multiple users over time for a given response scenario. For example, the system may determine from the historical data that a certain type of query has been associated with a user both repeating their query and increasing a response volume setting within a given timeframe. Metrics may be generated, stored, and invoked in response to attributes associated with identifiable types of queries and query scenarios. Automated response characteristics meant to reduce inefficiencies may be associated with certain queries that can otherwise collectively burden network bandwidth and processing resources.
Type: Application
Filed: September 18, 2023
Publication date: January 4, 2024
Inventors: Michael Schaer, Vitaly Gatsko, Ágoston Weisz
-
Publication number: 20230402034
Abstract: Implementations relate to correcting a speech recognition hypothesis based on prior correction(s) made by a user and/or fulfillment data associated with fulfilling a request embodied in the speech recognition hypothesis. A candidate speech recognition hypothesis can be generated in response to the user providing a spoken utterance to an application, such as an automated assistant. When a confidence metric for the candidate speech recognition hypothesis does not satisfy a threshold, one or more terms of the candidate speech recognition hypothesis can be compared to correcting data. The correcting data can indicate whether the user previously corrected any term(s) present in the candidate speech recognition hypothesis and, if so, correct the term(s) accordingly. Fulfillment data generated for the candidate hypothesis and/or for the corrected hypothesis can also be processed to determine whether to utilize the candidate hypothesis or the corrected hypothesis in responding to the user.
Type: Application
Filed: June 16, 2022
Publication date: December 14, 2023
Inventors: Ágoston Weisz, Miroslaw Michalski, Aurélien Boffy
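The confidence-gated correction lookup in this abstract can be sketched as follows. The threshold value and the shape of the correction store (a term-to-term mapping) are illustrative assumptions; the patent also considers phrase-level corrections and fulfillment data, which this sketch omits.

```python
def maybe_correct(hypothesis: str, confidence: float,
                  corrections: dict, threshold: float = 0.8) -> str:
    """If the ASR confidence is below the threshold, substitute any terms
    the user has previously corrected; otherwise keep the hypothesis as-is.

    corrections maps previously misrecognized terms to the user's fixes,
    e.g. {"jon": "juan"}.
    """
    if confidence >= threshold:
        return hypothesis  # confident enough: no correction needed
    terms = [corrections.get(t, t) for t in hypothesis.split()]
    return " ".join(terms)
```

Only applying the correction data when confidence is low keeps the user's past fixes from overriding recognitions that were probably already right.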
-
Patent number: 11789695
Abstract: Techniques enable an automatic adjustment of a muted response setting of an automated assistant based on a determination of an expectation by a user to hear an audible response to their query, despite the muted setting. Determination of the expectation may be based on historical, empirical data uploaded from multiple users over time for a given response scenario. For example, the system may determine from the historical data that a certain type of query has been associated with a user both repeating their query and increasing a response volume setting within a given timeframe. Metrics may be generated, stored, and invoked in response to attributes associated with identifiable types of queries and query scenarios. Automated response characteristics meant to reduce inefficiencies may be associated with certain queries that can otherwise collectively burden network bandwidth and processing resources.
Type: Grant
Filed: October 13, 2022
Date of Patent: October 17, 2023
Assignee: GOOGLE LLC
Inventors: Michael Schaer, Vitaly Gatsko, Ágoston Weisz
-
Publication number: 20230186898
Abstract: A method includes receiving audio data corresponding to a query spoken by a user and processing the audio data to generate multiple candidate hypotheses, each represented by a respective sequence of hypothesized terms. For each candidate hypothesis, the method includes determining whether the sequence of hypothesized terms includes a source phrase from a list of phrase correction pairs. Each phrase correction pair includes a corresponding source phrase that was misrecognized and a corresponding target phrase replacing the source phrase. When the respective sequence of hypothesized terms includes the source phrase, the method includes generating a corresponding additional candidate hypothesis that replaces the source phrase with the corresponding target phrase.
Type: Application
Filed: December 15, 2021
Publication date: June 15, 2023
Applicant: Google LLC
Inventors: Ágoston Weisz, Leonid Velikovich