Patents by Inventor Matthew Sharifi

Matthew Sharifi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11776549
    Abstract: Techniques are described herein for multi-factor audio watermarking. A method includes: receiving audio data; processing the audio data to generate predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, processing the audio data using automatic speech recognition to generate a speech transcription feature; detecting a watermark that is embedded in the audio data; and in response to detecting the watermark: determining that the speech transcription feature corresponds to one of a plurality of stored speech transcription features; and in response to determining that the speech transcription feature corresponds to one of the plurality of stored speech transcription features, suppressing processing of a query included in the audio data.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: October 3, 2023
    Assignee: GOOGLE LLC
    Inventors: Aleks Kracun, Matthew Sharifi
  • Patent number: 11775324
    Abstract: Automated content switching rules may be generated and/or utilized for automatically switching away from certain interactive content during presentation of that interactive content when one or more switch conditions are met. In some instances, automated content switching rules may define one or more non-temporal switch conditions, e.g., based upon reaching certain points or milestones in certain interactive content, that may be used to initiate actions that switch away from the interactive content. In addition, in some instances, automated content switching rules may be used to not only switch away from particular interactive content, but additionally switch to other interactive content, thereby enabling a user to effectively schedule a workflow across different interactive content, applications and/or other computer-related tasks.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: October 3, 2023
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Matthew Sharifi
  • Publication number: 20230298588
    Abstract: A method includes receiving audio data corresponding to an utterance spoken by the user and captured by the user device. The utterance includes a command for a digital assistant to perform an operation. The method also includes determining, using a hotphrase detector configured to detect each trigger word in a set of trigger words associated with a hotphrase, whether any of the trigger words in the set of trigger words are detected in the audio data during the corresponding fixed-duration time window. The method also includes identifying, in the audio corresponding to the utterance, the hotphrase when each other trigger word in the set of trigger words was also detected in the audio data. The method also includes triggering an automated speech recognizer to perform speech recognition on the audio data when the hotphrase is identified in the audio data corresponding to the utterance.
    Type: Application
    Filed: May 25, 2023
    Publication date: September 21, 2023
    Applicant: Google LLC
    Inventors: Victor Carbune, Matthew Sharifi
  • Publication number: 20230298575
    Abstract: A method for detecting freeze words includes receiving audio data that corresponds to an utterance spoken by a user and captured by a user device associated with the user. The method also includes processing, using a speech recognizer, the audio data to determine that the utterance includes a query for a digital assistant to perform an operation. The speech recognizer is configured to trigger endpointing of the utterance after a predetermined duration of non-speech in the audio data. Before the predetermined duration of non-speech, the method includes detecting a freeze word in the audio data. In response to detecting the freeze word in the audio data, the method also includes triggering a hard microphone closing event at the user device. The hard microphone closing event prevents the user device from capturing any audio subsequent to the freeze word.
    Type: Application
    Filed: May 23, 2023
    Publication date: September 21, 2023
    Applicant: Google LLC
    Inventors: Matthew Sharifi, Aleksandar Kracun
  • Publication number: 20230298583
    Abstract: Implementations set forth relate to suggesting an alternate interface modality when an automated assistant and/or a user is expected to not understand a particular interaction between the user and the automated assistant. In some instances, the automated assistant can pre-emptively determine that a forthcoming and/or ongoing interaction between a user and an automated assistant may experience interference. Based on this determination, the automated assistant can provide an indication that the interaction may not be successful and/or that the user should interact with the automated assistant through a different modality. For example, the automated assistant can render a keyboard interface at a portable computing device when the automated assistant determines that an audio interface of the portable computing device is experiencing interference.
    Type: Application
    Filed: May 22, 2023
    Publication date: September 21, 2023
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11762848
    Abstract: Methods, systems, and computer readable media related to generating a combined search query based on search parameters of a current search query of a user and search parameters of one or more previously submitted search quer(ies) of the user that are determined to be of the same line of inquiry as the current search query. Two or more search queries may be determined to share a line of inquiry when it is determined that they are within a threshold level of semantic similarity to one another. Once a shared line of inquiry has been identified and a combined search query generated, users may interact with the search parameters and/or the search results to update the search parameters of the combined search query.
    Type: Grant
    Filed: September 6, 2022
    Date of Patent: September 19, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11765452
    Abstract: Implementations set forth herein relate to an automated assistant that can control a camera according to one or more conditions specified by a user. A condition can be satisfied when, for example, the automated assistant detects a particular environment feature is apparent. In this way, the user can rely on the automated assistant to identify and capture certain moments without necessarily requiring the user to constantly monitor a viewing window of the camera. In some implementations, a condition for the automated assistant to capture media data can be based on application data and/or other contextual data that is associated with the automated assistant. For instance, a relationship between content in a camera viewing window and other content of an application interface can be a condition upon which the automated assistant captures certain media data using a camera.
    Type: Grant
    Filed: January 13, 2023
    Date of Patent: September 19, 2023
    Assignee: GOOGLE LLC
    Inventors: Felix Weissenberger, Balint Miklos, Victor Carbune, Matthew Sharifi, Domenico Carbotta, Ray Chen, Kevin Fu, Bogdan Prisacari, Fo Lee, Mucun Lu, Neha Garg, Jacopo Sannazzaro Natta, Barbara Poblocka, Jae Seo, Matthew Miao, Thomas Qian, Luv Kothari
  • Patent number: 11756544
    Abstract: Implementations described herein receive audio data that captures a spoken utterance, generate, based on processing the audio data, a recognition that corresponds to the spoken utterance, and determine, based on processing the recognition, that the spoken utterance is ambiguous (i.e., is interpretable as requesting performance of a first particular action exclusively and is also interpretable as requesting performance of a second particular action exclusively). In response to determining that the spoken utterance is ambiguous, implementations determine to provide an enhanced clarification prompt that renders output that is in addition to natural language. The enhanced clarification prompt solicits further user interface input for disambiguating between the first particular action and the second particular action.
    Type: Grant
    Filed: December 15, 2020
    Date of Patent: September 12, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11756530
    Abstract: Example embodiments relate to techniques for training artificial neural networks or other machine-learning encoders to accurately predict the pitch of input audio samples in a semitone or otherwise logarithmically-scaled pitch space. An example method may include generating, from a sample of audio data, two training samples by applying two different pitch shifts to the sample of audio training data. This can be done by converting the sample of audio data into the frequency domain and then shifting the transformed data. These known shifts are then compared to the predicted pitches generated by applying the two training samples to the encoder. The encoder is then updated based on the comparison, such that the relative pitch output by the encoder is improved with respect to accuracy. One or more audio samples, labeled with absolute pitch values, can then be used to calibrate the relative pitch values generated by the trained encoder.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: September 12, 2023
    Assignee: Google LLC
    Inventors: Marco Tagliasacchi, Mihajlo Velimirovic, Matthew Sharifi, Dominik Roblek, Christian Frank, Beat Gfeller
  • Patent number: 11749267
    Abstract: A method for adapting hotword recognition includes receiving audio data characterizing a hotword event detected by a first stage hotword detector in streaming audio captured by a user device. The method also includes processing, using a second stage hotword detector, the audio data to determine whether a hotword is detected by the second stage hotword detector in a first segment of the audio data. When the hotword is not detected by the second stage hotword detector, the method includes classifying the first segment of the audio data as containing a negative hotword that caused a false detection of the hotword event in the streaming audio by the first stage hotword detector. Based on the first segment of the audio data classified as containing the negative hotword, the method includes updating the first stage hotword detector to prevent triggering the hotword event in subsequent audio data that contains the negative hotword.
    Type: Grant
    Filed: November 20, 2020
    Date of Patent: September 5, 2023
    Assignee: Google LLC
    Inventors: Aleksandar Kracun, Matthew Sharifi
  • Patent number: 11749284
    Abstract: Implementations are directed to dynamically adapting which assistant on-device model(s) are locally stored at assistant devices of an assistant device group and/or dynamically adapting the assistant processing role(s) of the assistant device(s) of the assistant device group. In some of those implementations, the corresponding on-device model(s) and/or corresponding processing role(s), for each of the assistant devices of the group, is determined based on collectively considering individual processing capabilities of the assistant devices of the group. Implementations are additionally or alternatively directed to cooperatively utilizing assistant devices of a group, and their associated post-adaptation on-device model(s) and/or post-adaptation processing role(s), in cooperatively processing assistant requests that are directed to any one of the assistant devices of the group.
    Type: Grant
    Filed: November 13, 2020
    Date of Patent: September 5, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11748660
    Abstract: Implementations relate to an automated assistant that can automate repeatedly performed procedures. The automation can involve communicating with different users, organizations, and/or other automated assistants. The automated assistant, with prior permission from respective user(s), can detect repeated performance of a particular series of manually initiated computational actions. Based on this determination, the automated assistant can determine automated assistant computational action(s) that can be performed by the automated assistant in order to reduce latency in performing a procedure, reduce quantity and/or size of transmissions in performing the procedure, and/or reduce an amount of client device resources required for performing the procedure. Such actions can include communicating with an additional automated assistant that may be associated with another user and/or organization.
    Type: Grant
    Filed: September 22, 2020
    Date of Patent: September 5, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Publication number: 20230274742
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for obtaining, for each of multiple words or sub-words, audio data corresponding to multiple users speaking the word or sub-word; training, for each of the multiple words or sub-words, a pre-computed hotword model for the word or sub-word based on the audio data for the word or sub-word; receiving a candidate hotword from a computing device; identifying one or more pre-computed hotword models that correspond to the candidate hotword; and providing the identified, pre-computed hotword models to the computing device.
    Type: Application
    Filed: May 8, 2023
    Publication date: August 31, 2023
    Applicant: Google LLC
    Inventor: Matthew Sharifi
  • Patent number: 11741944
    Abstract: A method of training a speech model includes receiving, at a voice-enabled device, a fixed set of training utterances where each training utterance in the fixed set of training utterances includes a transcription paired with a speech representation of the corresponding training utterance. The method also includes sampling noisy audio data from an environment of the voice-enabled device. For each training utterance in the fixed set of training utterances, the method further includes augmenting, using the noisy audio data sampled from the environment of the voice-enabled device, the speech representation of the corresponding training utterance to generate noisy audio samples and pairing each of the noisy audio samples with the corresponding transcription of the corresponding training utterance. The method additionally includes training a speech model on the noisy audio samples generated for each speech representation in the fixed set of training utterances.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: August 29, 2023
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Publication number: 20230267911
    Abstract: In some implementations, a language proficiency of a user of a client device is determined by one or more computers. The one or more computers then determines a text segment for output by a text-to-speech module based on the determined language proficiency of the user. After determining the text segment for output, the one or more computers generates audio data including a synthesized utterance of the text segment. The audio data including the synthesized utterance of the text segment is then provided to the client device for output.
    Type: Application
    Filed: April 28, 2023
    Publication date: August 24, 2023
    Applicant: Google LLC
    Inventors: Matthew Sharifi, Jakob Nicolaus Foerster
  • Patent number: 11734287
    Abstract: Methods, systems, and apparatus for receiving a query image, receiving one or more entities that are associated with the query image, identifying, for one or more of the entities, one or more candidate search queries that are pre-associated with the one or more entities, generating a respective relevance score for each of the candidate search queries, selecting, as a representative search query for the query image, a particular candidate search query based at least on the generated respective relevance scores and providing the representative search query for output in response to receiving the query image.
    Type: Grant
    Filed: February 21, 2022
    Date of Patent: August 22, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, David Petrou, Abhanshu Sharma
  • Patent number: 11727925
    Abstract: Techniques are described herein for cross-device data synchronization based on simultaneous hotword triggers.
    Type: Grant
    Filed: December 8, 2020
    Date of Patent: August 15, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Publication number: 20230252995
    Abstract: Various implementations include determining whether further spoken input is intended to correct at least one word in a candidate text representation of spoken input. Various implementations include receiving audio data capturing spoken input of a user. Various implementations include rendering output based on the candidate text representation to the user. Various implementations include receiving, while the output is being rendered, further audio data capturing the further spoken input. In response to determining the further spoken input is intended to correct the at least one word in the candidate text representation, various implementations include generating a revised text representation of the spoken input by altering at least one word in the candidate text representation based on one or more terms in the further candidate text representation.
    Type: Application
    Filed: February 8, 2022
    Publication date: August 10, 2023
    Inventors: Matthew Sharifi, Victor Carbune, Bogdan Prisacari, Alexander Froemmgen, Milosz Kmieciak, Felix Weissenberger, Daniel Valcarce
  • Publication number: 20230251877
    Abstract: Automated content switching rules may be generated and/or utilized for automatically switching away from certain interactive content during presentation of that interactive content when one or more switch conditions are met. In some instances, automated content switching rules may define one or more non-temporal switch conditions, e.g., based upon reaching certain points or milestones in certain interactive content, that may be used to initiate actions that switch away from the interactive content. In addition, in some instances, automated content switching rules may be used to not only switch away from particular interactive content, but additionally switch to other interactive content, thereby enabling a user to effectively schedule a workflow across different interactive content, applications and/or other computer-related tasks.
    Type: Application
    Filed: February 7, 2022
    Publication date: August 10, 2023
    Inventors: Victor Carbune, Matthew Sharifi
  • Patent number: 11722731
    Abstract: While an assistant-enabled device is playing back media content, a method includes receiving a contextual signal from an environment of the assistant-enabled device and executing an event recognition routine to determine whether the received contextual signal is indicative of an event that conflicts with the playback of the media content from the assistant-enabled device. When the event recognition routine determines that the received contextual signal is indicative of the event that conflicts with the playback of the media content, the method also includes adjusting content playback settings of the assistant-enabled device.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: August 8, 2023
    Assignee: Google LLC
    Inventors: Victor Carbune, Matthew Sharifi
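
Illustrative sketch (not taken from any patent): the abstract of patent 11776549 (multi-factor audio watermarking) describes a sequence of checks: a hotword threshold, a speech transcription feature, a watermark, and a comparison against stored features. The Python below is a minimal sketch of that decision flow only. The `AudioClip` fields stand in for the outputs of a hotword model, a speech recognizer, and a watermark detector; the names `AudioClip`, `normalize`, `should_suppress_query`, the 0.8 threshold, and the use of normalized text as the "speech transcription feature" are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed value; the abstract only says the predicted output "satisfies a threshold".
HOTWORD_THRESHOLD = 0.8


@dataclass
class AudioClip:
    """Hypothetical container for precomputed detector outputs."""
    hotword_probability: float  # stand-in for the hotword model's predicted output
    transcription: str          # stand-in for automatic speech recognition output
    has_watermark: bool         # stand-in for the watermark detector's result


def normalize(text: str) -> str:
    """Assumed transcription feature: lowercased, whitespace-collapsed text."""
    return " ".join(text.lower().split())


def should_suppress_query(clip: AudioClip, stored_features: set[str],
                          threshold: float = HOTWORD_THRESHOLD) -> bool:
    """Return True when the query in the clip should not be processed."""
    # Step 1: the hotword prediction must satisfy the threshold.
    if clip.hotword_probability < threshold:
        return False
    # Step 2: derive the speech transcription feature.
    feature = normalize(clip.transcription)
    # Step 3: a watermark must be detected in the audio.
    if not clip.has_watermark:
        return False
    # Step 4: suppress only if the feature matches a stored feature.
    return feature in stored_features


stored = {normalize("OK Assistant what is the weather")}
clip = AudioClip(0.95, "ok assistant  what is the WEATHER", True)
print(should_suppress_query(clip, stored))
```

In this sketch both factors must agree before a query is suppressed: a watermark alone, or a matching transcription alone, leaves the query to be processed normally.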