Voice Recognition Patents (Class 704/246)
  • Patent number: 11972766
    Abstract: Techniques are described herein for detecting and suppressing commands in media that may trigger another automated assistant. A method includes: determining, for each of a plurality of automated assistant devices in an environment that are each executing at least one automated assistant, an active capability of the automated assistant device; initiating playback of digital media by an automated assistant; in response to initiating playback, processing the digital media to identify an audio segment in the digital media that, upon playback, is expected to trigger activation of at least one automated assistant executing on at least one of the plurality of automated assistant devices in the environment, based on the active capability of the at least one of the plurality of automated assistant devices; and in response to identifying the audio segment in the digital media, modifying the digital media to suppress the activation of the at least one automated assistant.
    Type: Grant
    Filed: January 23, 2023
    Date of Patent: April 30, 2024
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11961095
    Abstract: Systems and methods generate a risk score for an account event. The systems and methods automatically generate a causal model corresponding to a user, wherein the model estimates components of the causal model using event parameters of a previous event undertaken by the user in an account of the user. The systems and methods predict expected behavior of the user during a next event in the account using the causal model. Predicting the expected behavior of the user includes generating expected event parameters of the next event. The systems and methods use a predictive fraud model to generate fraud event parameters. Generation of the fraud event parameters assumes a fraudster is conducting the next event, wherein the fraudster is any person other than the user. The systems and methods generate a risk score of the next event to indicate the relative likelihood the future event is performed by the user.
    Type: Grant
    Filed: June 2, 2021
    Date of Patent: April 16, 2024
    Assignee: GUARDIAN ANALYTICS, INC.
    Inventor: Tom Miltonberger
  • Patent number: 11942095
    Abstract: A computer-implemented method that includes receiving audio data corresponding to an utterance of a voice command captured by a user device. The user device has a plurality of different users. The method includes determining a particular user among the plurality of different users of the user device as a speaker of the utterance based on a comparison between the audio data and corresponding speaker verification data stored on memory hardware for each user of the plurality of different users of the user device. The method further includes, based on determining the particular user among the plurality of different users of the user device as the speaker of the utterance, providing, for output from the user device, a message comprising a speaker identifier associated with the particular user.
    Type: Grant
    Filed: May 1, 2023
    Date of Patent: March 26, 2024
    Assignee: Google LLC
    Inventors: Raziel Alvarez Guevara, Othar Hansson
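    Illustrative sketch (not part of the patent record): the comparison between the audio data and each user's stored speaker verification data could look roughly like the following. The abstract does not say how the comparison is made, so the fixed-length embeddings, the cosine similarity measure, the 0.7 threshold, and the names `cosine_similarity` and `identify_speaker` are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_speaker(utterance_embedding, enrolled, threshold=0.7):
    """Return the identifier of the best-matching enrolled user, or None.

    `enrolled` maps a speaker identifier to that user's stored reference
    embedding (a hypothetical stand-in for "speaker verification data").
    """
    best_id, best_score = None, threshold
    for speaker_id, reference in enrolled.items():
        score = cosine_similarity(utterance_embedding, reference)
        if score >= best_score:
            best_id, best_score = speaker_id, score
    return best_id
```

    The returned identifier plays the role of the speaker identifier that the method places in its output message; a result of None means no enrolled user matched closely enough.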
  • Patent number: 11929078
    Abstract: Certain embodiments of the present disclosure provide techniques for training a user detection model to identify a user of a software application based on voice recognition. The method generally includes receiving a data set including a plurality of voice interactions with users of a software application. For each respective recording in the data set, a spectrogram representation is generated based on the respective recording. A plurality of voice recognition models are trained. Each of the plurality of voice recognition models is trained based on the spectrogram representation for each of the plurality of voice recordings in the data set. The plurality of voice recognition models are deployed to an interactive voice response system.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: March 12, 2024
    Assignee: Intuit, Inc.
    Inventors: Shanshan Tuo, Divya Beeram, Meng Chen, Neo Yuchen, Wan Yu Zhang, Nivethitha Kumar, Kavita Sundar, Tomer Tal
  • Patent number: 11924254
    Abstract: This relates to intelligent automated assistants and, more specifically, to intelligent context sharing and task performance among a collection of devices with intelligent automated assistant capabilities. An example method includes, at a first electronic device participating in a context-sharing group associated with a first location: receiving a user voice input; receiving, from a context collector, an aggregate context of the context-sharing group; providing at least a portion of the aggregate context and data corresponding to the user voice input to a remote device; receiving, from the remote device, a command to perform one or more tasks and a device identifier corresponding to a second electronic device; and transmitting the command to the second electronic device based on the device identifier, wherein the command causes the second electronic device to perform the one or more tasks.
    Type: Grant
    Filed: May 3, 2021
    Date of Patent: March 5, 2024
    Assignee: Apple Inc.
    Inventors: Bryan Hansen, Nikrouz Ghotbi, Yifeng Gui, Xinyuan Huang, Benjamin S. Phipps, Eugene Ray, Mahesh Ramaray Shanbhag, Jaireh Tecarro, Sumit Wattal
  • Patent number: 11922952
    Abstract: Implementations set forth herein relate to an automated assistant that can be customized by a user to provide custom assistant responses to certain assistant queries, which may originate from other users. The user can establish certain custom assistant responses by providing an assistant response request to the automated assistant and/or responding to a request from the automated assistant to establish a particular custom assistant response. In some instances, a user can elect to establish a custom assistant response when the user determines or acknowledges that certain common queries are being submitted to the automated assistant—but the automated assistant is unable to resolve the common query. Establishing such custom assistant responses can therefore condense interactions between other users and the automated assistant.
    Type: Grant
    Filed: February 6, 2023
    Date of Patent: March 5, 2024
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Matthew Sharifi
  • Patent number: 11902759
    Abstract: The present disclosure provides systems and methods for audio signal generation. The method may include obtaining first audio data collected by a bone conduction sensor; and obtaining second audio data collected by an air conduction sensor, the first audio data and the second audio data representing a speech of a user, with differing frequency components. The method may also include generating, based on the first audio data and the second audio data, third audio data, wherein frequency components of the third audio data higher than a first frequency point increase with respect to frequency components of the first audio data higher than the first frequency point. In some embodiments, the method may further include determining, based on the third audio data, target audio data representing the speech of the user with better fidelity than the first audio data and the second audio data.
    Type: Grant
    Filed: January 29, 2022
    Date of Patent: February 13, 2024
    Assignee: SHENZHEN SHOKZ CO., LTD.
    Inventors: Meilin Zhou, Fengyun Liao, Xin Qi
  • Patent number: 11900919
    Abstract: An audio-visual automated speech recognition model for transcribing speech from audio-visual data includes an encoder frontend and a decoder. The encoder includes an attention mechanism configured to receive an audio track of the audio-visual data and a video portion of the audio-visual data. The video portion of the audio-visual data includes a plurality of video face tracks each associated with a face of a respective person. For each video face track of the plurality of video face tracks, the attention mechanism is configured to determine a confidence score indicating a likelihood that the face of the respective person associated with the video face track includes a speaking face of the audio track. The decoder is configured to process the audio track and the video face track of the plurality of video face tracks associated with the highest confidence score to determine a speech recognition result of the audio track.
    Type: Grant
    Filed: March 21, 2023
    Date of Patent: February 13, 2024
    Assignee: Google LLC
    Inventor: Otavio Braga
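    Illustrative sketch (not part of the patent record): one simple way to picture the per-track confidence scoring is dot-product attention between an audio feature and each face-track feature, followed by a softmax and an argmax. The feature vectors, the dot-product scoring, and the names `softmax` and `select_speaking_face` are assumptions; the abstract only states that a confidence score is determined per face track and that the highest-scoring track is passed to the decoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_speaking_face(audio_feature, face_track_features):
    """Score every face track against the audio and pick the most likely speaker.

    Returns (index of the selected track, list of confidence scores).
    """
    raw = [
        sum(a * f for a, f in zip(audio_feature, track))
        for track in face_track_features
    ]
    confidences = softmax(raw)
    best = max(range(len(confidences)), key=confidences.__getitem__)
    return best, confidences
```

    In this sketch only the selected track would be handed to the decoder together with the audio track.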
  • Patent number: 11887618
    Abstract: A call audio mixing processing method is provided. In the method, call audio streams from terminals of call members participating in a call are obtained. Voice analysis is performed on the call audio streams to determine voice activity corresponding to each of the terminals. The voice activity of the terminals indicates activity levels of the call members participating in the call. According to the voice activity of the terminals, respective voice adjustment parameters corresponding to the terminals are determined. According to the respective voice adjustment parameters corresponding to the terminals, the call audio streams of the terminals are adjusted. Further, mixing processing is performed on the adjusted call audio streams to obtain a mixed audio stream.
    Type: Grant
    Filed: April 18, 2022
    Date of Patent: January 30, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang
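    Illustrative sketch (not part of the patent record): the sequence of voice analysis, per-terminal adjustment parameters, adjustment, and mixing can be pictured as below. The mean-absolute-amplitude activity measure, the gain rule with a 0.2 floor, and the function names are assumptions made for illustration; the abstract does not specify how activity or the adjustment parameters are computed.

```python
def voice_activity(samples):
    """Crude activity level: mean absolute amplitude of one terminal's stream."""
    return sum(abs(s) for s in samples) / len(samples) if samples else 0.0


def adjustment_parameters(activities, floor=0.2):
    """Map activity levels to gains in [floor, 1.0], relative to the most active terminal."""
    peak = max(activities, default=0.0)
    if peak == 0.0:
        return [floor] * len(activities)
    return [max(floor, a / peak) for a in activities]


def mix(streams, floor=0.2):
    """Scale each stream by its gain, then sum sample by sample."""
    activities = [voice_activity(s) for s in streams]
    gains = adjustment_parameters(activities, floor)
    length = min(len(s) for s in streams)
    return [sum(g * s[i] for g, s in zip(gains, streams)) for i in range(length)]
```

    With this rule the most active terminal passes through at full level while quieter terminals are attenuated but never muted entirely.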
  • Patent number: 11887587
    Abstract: An apparatus for processing an audio input recording to obtain a processed audio recording according to an embodiment is provided. The apparatus comprises an input interface (110) for receiving a plurality of audio input portions of the audio input recording. Moreover, the apparatus comprises a processor (120) for processing a plurality of audio input portions of the audio input recording to obtain a processed audio recording. The processor (120) is configured to determine, whether or not an audio input portion of the plurality of audio input portions comprises speech. If the processor (120) has detected that the audio input portion comprises speech, the processor (120) is configured to generate the processed audio recording by modifying the audio input portion to obtain a modified audio portion, and by generating the processed audio recording such that the processed audio recording comprises the modified audio portion instead of the audio input portion.
    Type: Grant
    Filed: April 14, 2021
    Date of Patent: January 30, 2024
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jan Rennies-Hochmuth, Danilo Hollosi, Christian Rollwage, Jens-Ekkehart Appell
  • Patent number: 11831409
    Abstract: A system, apparatus, method, and machine readable medium are described for binding verifiable claims. For example, one embodiment of a system comprises: a client device; an authenticator of the client device to securely store authentication data including one or more verifiable claims received from one or more claim providers, each verifiable claim having attributes associated therewith; and claim/attribute processing logic to generate a first verifiable claim binding for a first verifiable claim issued by the claim provider; wherein the authenticator is to transmit a first signature assertion to a first relying party to authenticate with the first relying party, the first signature assertion including an attribute extension containing data associated with the first verifiable claim binding.
    Type: Grant
    Filed: January 10, 2019
    Date of Patent: November 28, 2023
    Assignee: NOK NOK LABS, INC.
    Inventor: Rolf Lindemann
  • Patent number: 11823193
    Abstract: Methods, computer program products, and systems are provided for securely performing electronic transactions over a network. An acoustic wave signal is received. The acoustic wave signal is encoded with at least an identifier of a payer of a transaction and transaction information of the transaction. The identifier of the payer and the transaction information are then obtained from the acoustic wave signal. A bone conduction characteristic of the payer is retrieved based on the identifier of the payer and based on the received bone conduction characteristic of the payer, the payer is verified. Completion of the transaction is allowed based on the transaction information in response to the payer being verified.
    Type: Grant
    Filed: July 9, 2018
    Date of Patent: November 21, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Ting Yin, Dong Chen, Ting Ting BJ Zhan, Xiang Juan Meng, Yin Xia
  • Patent number: 11790920
    Abstract: Playback devices comprising a network interface, an optional speaker(s), and one or more processors are disclosed herein. In some embodiments, the playback device is configured to communicate with a computing system that stores configuration data corresponding to each of a plurality of users. The playback device detects one or more users near the playback device and retrieves user configuration data corresponding to each of the one or more detected users, and thereafter, uses the user configuration data of the one or more detected users to process voice commands, play media content, and/or perform other voice and/or media related functions.
    Type: Grant
    Filed: July 18, 2022
    Date of Patent: October 17, 2023
    Assignee: Sonos, Inc.
    Inventor: Paul Andrew Bates
  • Patent number: 11783824
    Abstract: A speech-processing system may provide access to one or more virtual assistants via an audio-controlled device. A user may leverage a first virtual assistant to translate a natural language command from a first language into a second language, which the device can send to a second virtual assistant for processing. The device may receive a command from a user and send input data representing the command to a first speech-processing system representing the first virtual assistant. The device may receive a response in the form of a first natural language output from the first speech-processing system along with an indication that the first natural language output should be directed to a second speech-processing system representing the second virtual assistant. For example, the command may be in the first language, and the first natural language output may be in the second language, which is understandable by the second speech-processing system.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: October 10, 2023
    Assignee: Amazon Technologies, Inc.
    Inventor: Robert John Mars
  • Patent number: 11769511
    Abstract: Systems and methods for audio processing include capturing first sound data via at least one microphone of a network microphone device (NMD) and determining, via a voice activity detection process, that the first sound data does not include voice activity. The first sound data is stored in a buffer, and the NMD forgoes spatial processing of the first sound data. The NMD can capture second sound data and determine, via the voice activity process, that the second sound data includes voice activity. The NMD spatially processes the sound data to produce filtered sound data. The NMD detects a wake word based on data in the buffer. After detecting the wake word, the NMD may determine an action to be performed based on the data in the buffer.
    Type: Grant
    Filed: November 30, 2022
    Date of Patent: September 26, 2023
    Assignee: Sonos, Inc.
    Inventors: Aaron Jones, Saeed Bagheri Sereshki, Daniele Giacobello
  • Patent number: 11756536
    Abstract: [Object] To provide a highly accurate voice analysis system. [Solution] A voice analysis system 1 includes a first voice analysis terminal 3 and a second voice analysis terminal 5. The first voice analysis terminal 3 includes a first term analysis unit 7 that obtains first conversation information, a first conversation storage unit 9 that stores the first conversation information, a first analysis unit 11 that analyzes the first conversation information, a presentation storage unit 13, a related term storage unit 15, a display unit 17, a topic word storage unit 19, and a conversation information reception unit 25 that receives second conversation information from the second voice analysis terminal 5. The second voice analysis terminal 5 includes a second term analysis unit 21 that obtains the second conversation information and a second conversation storage unit 23.
    Type: Grant
    Filed: December 15, 2020
    Date of Patent: September 12, 2023
    Assignee: Interactive Solutions Corp.
    Inventor: Kiyoshi Sekine
  • Patent number: 11756554
    Abstract: An attribute identification technology that can reject an attribute identification result if the reliability thereof is low is provided. An attribute identification device includes: a posteriori probability calculation unit 110 that calculates, from input speech, a posteriori probability sequence {q(c, i)} which is a sequence of the posteriori probabilities q(c, i) that a frame i of the input speech is a class c; a reliability calculation unit 120 that calculates, from the posteriori probability sequence {q(c, i)}, reliability r(c) indicating the extent to which the class c is a correct attribute identification result; and an attribute identification result generating unit 130 that generates an attribute identification result L of the input speech from the posteriori probability sequence {q(c, i)} and the reliability r(c).
    Type: Grant
    Filed: August 23, 2021
    Date of Patent: September 12, 2023
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Hosana Kamiyama, Satoshi Kobashikawa, Atsushi Ando
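    Illustrative sketch (not part of the patent record): the flow from the posterior sequence {q(c, i)} to a reliability r(c) and then to a result L that may be rejected can be pictured as follows. The abstract does not give the reliability formula, so the choice here (the fraction of frames whose top class agrees with the overall best class), the 0.6 threshold, and the name `identify_attribute` are assumptions.

```python
def identify_attribute(posteriors, threshold=0.6):
    """Return (label or None, reliability) for a sequence of per-frame posteriors.

    `posteriors` is a list with one dict per frame i, mapping class c to q(c, i).
    The label is None when the result is rejected as unreliable.
    """
    n = len(posteriors)
    classes = list(posteriors[0].keys())

    # Candidate class: highest average posterior over all frames.
    average = {c: sum(frame[c] for frame in posteriors) / n for c in classes}
    best = max(average, key=average.get)

    # Assumed reliability r(c): share of frames that individually agree with `best`.
    agreeing = sum(1 for frame in posteriors if max(frame, key=frame.get) == best)
    reliability = agreeing / n

    return (best if reliability >= threshold else None), reliability
```

    Returning None corresponds to rejecting the identification result when its reliability is low.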
  • Patent number: 11748845
    Abstract: Systems, processes, and techniques to automatically detect and enlarge a speaking one of a plurality of participants on one side of a video conference. In at least one embodiment, the speaking participant is identified using one or more heuristics and/or one or more neural networks.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: September 5, 2023
    Assignee: NVIDIA Corporation
    Inventors: Akarsh Umesh Zingade, Jianyuan Min, Shuye Han, Rochelle Pereira
  • Patent number: 11735188
    Abstract: A system and method may identify a fraud ring based on call or interaction data by analyzing, by a computer processor, interaction data including audio recordings to identify clusters of interactions which are suspected of involving fraud, each cluster including the same speaker; analyzing, by the computer processor, the clusters, in combination with metadata associated with the interaction data, to identify fraud rings, each fraud ring describing a plurality of different speakers, each fraud ring defined by a set of speakers and a set of metadata corresponding to interactions including that speaker; and for each fraud ring, creating a relevance value defining the relative relevance of the fraud ring.
    Type: Grant
    Filed: September 12, 2022
    Date of Patent: August 22, 2023
    Assignee: Nice Ltd.
    Inventors: Matan Keret, Anat Malin, Natan Katz, Shunit Metz, Sigal Lev, Jeremy Hoyland
  • Patent number: 11727128
    Abstract: A method, apparatus and computer program product are disclosed to provide for the selective establishment and use of secure communication channels to facilitate the exchange of data objects containing potentially sensitive information in a network environment. In some example implementations, upon detection that the processing of a network entity request implicates the exchange of non-public information amongst one or more other network entities, one or more secure communication channels are established between a secure transfer system and the relevant network entities such that non-public information neither passes to nor resides on system components associated with non-secure network entities.
    Type: Grant
    Filed: August 20, 2021
    Date of Patent: August 15, 2023
    Assignee: PAYMENTUS CORPORATION
    Inventor: Dushyant Sharma
  • Patent number: 11721342
    Abstract: A method comprising detecting an activation of an intelligent assistant on an electronic device, waking up the intelligent assistant from a sleep mode in response to the activation, and determining an amount of vocabulary the intelligent assistant acts upon during a listening mode based on a type of the activation.
    Type: Grant
    Filed: September 22, 2022
    Date of Patent: August 8, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jeffrey C. Olson, Henry N. Holtzman, Jean-David Hsu, Jeffrey A. Morgan
  • Patent number: 11720622
    Abstract: Machine learning multiple features of an item depicted in images. Upon accessing multiple images that depict the item, a neural network is used to machine train on the plurality of images to generate embedding vectors for each of multiple features of the item. For each of multiple features of the item depicted in the images, in each iteration of the machine learning, the embedding vector is converted into a probability vector that represents probabilities that the feature has respective values. That probability vector is then compared with a value vector representing the actual value of that feature in the depicted item, and an error between the two vectors is determined. That error is used to adjust parameters of the neural network used to generate the embedding vector, allowing for the next iteration in the generation of the embedding vectors. These iterative changes continue thereby training the neural network.
    Type: Grant
    Filed: June 9, 2022
    Date of Patent: August 8, 2023
    Inventors: Oren Barkan, Noam Razin, Noam Koenigstein, Roy Hirsch, Nir Nice
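    Illustrative sketch (not part of the patent record): a single iteration of the described comparison, for one feature, might look like this. The abstract does not name the conversion or the error measure, so the softmax conversion, the one-hot value vector, the cross-entropy error, and the function names are assumptions.

```python
import math


def to_probabilities(embedding):
    """Convert an embedding vector into a probability vector with a softmax."""
    peak = max(embedding)
    exps = [math.exp(x - peak) for x in embedding]
    total = sum(exps)
    return [e / total for e in exps]


def error(probabilities, value_vector, eps=1e-12):
    """Cross-entropy between the predicted probabilities and the one-hot value vector."""
    return -sum(v * math.log(p + eps) for p, v in zip(probabilities, value_vector))


def error_gradient(probabilities, value_vector):
    """Gradient of the cross-entropy with respect to the pre-softmax embedding."""
    return [p - v for p, v in zip(probabilities, value_vector)]
```

    The gradient is the quantity that would be propagated back to adjust the network parameters before the next iteration.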
  • Patent number: 11721326
    Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or non-personal request. Moreover, and in response to determining that the speaker is not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device responsive to the utterance, or can cause an action to be performed by the user device responsive to the utterance.
    Type: Grant
    Filed: January 26, 2022
    Date of Patent: August 8, 2023
    Assignee: GOOGLE LLC
    Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
  • Patent number: 11699441
    Abstract: The present disclosure describes techniques for dynamically determining when information is to be output to a user, as well as what information is to be output to a user. A natural language processing system may receive, from a first device, first data representing information to be output at a first point during a skill session. The natural language processing system may also receive, from a second device, second data representing a natural language input. The natural language processing system may determine a skill component is to execute with respect to the natural language input. The natural language processing system may send, to the skill component, second data representing the natural language input. The natural language processing system may receive, from the skill component, an indication that an ongoing first skill session with the second device has reached the first point.
    Type: Grant
    Filed: January 11, 2022
    Date of Patent: July 11, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Mark Conrad Kockerbeck, Muhammad Yahia, Jordan Michael Hughes, Kevin Boehm, Rohit Sauhta
  • Patent number: 11675885
    Abstract: In a system and method for audio analysis in a cloud-based computerized system, a real-time authentication (RTA) manager micro-service may send an audio packet to a voice processor micro-service. The voice processor may extract features of the audio. The RTA manager may obtain the extracted features from the voice processor; calculate, based on the extracted features, a quality grade of the audio packet, and send the extracted features to an at least one voice biometrics engine if the quality grade is above a threshold. Each of the at least one voice biometrics engines may be configured to generate a voiceprint of the audio packet, based on the extracted features of the audio packet and to perform at least one of: authenticate a speaker, detect fraudsters, and enrich a previously stored voiceprint of the speaker with the voiceprint of the audio packet.
    Type: Grant
    Filed: September 27, 2021
    Date of Patent: June 13, 2023
    Assignee: Nice Ltd.
    Inventors: Matan Keret, William Mark Finlay, Peter S Cardillo
  • Patent number: 11676608
    Abstract: A method includes generating an audio signal encoding an utterance captured by a microphone of a user device and transmitting the audio signal encoding the utterance to a server. The server is configured to determine a speaker of the utterance from one of a plurality of different users of the user device based on a comparison between the audio signal encoding the utterance and corresponding speaker verification data, and process the audio signal encoding the utterance using a speech recognition module to identify a particular action. The method also includes executing the particular action identified by the server to cause a particular application to launch on the user device based on user permissions associated with the speaker determined by the server to access the particular data.
    Type: Grant
    Filed: April 2, 2021
    Date of Patent: June 13, 2023
    Assignee: Google LLC
    Inventors: Raziel Alvarez Guevara, Othar Hansson
  • Patent number: 11670302
    Abstract: An electronic device and method are disclosed herein. The electronic device includes a network interface and processor. The processor implements the method, including receiving a voice input through a network interface as transmitted from a first external device, including a request to execute a function using at least one application which is not indicated in the voice input, extracting a first text from the voice input by executing automatic speech recognition (ASR), when the at least one application is identified based on the first text, transmitting, through the network interface to the first external device, second data associated with the identified at least one application for display by the first external device, and when the at least one application is not identified based at least in part on the first text, reattempting identification of the at least one application by executing natural language understanding (NLU) on the first text.
    Type: Grant
    Filed: November 13, 2020
    Date of Patent: June 6, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Joo Hyuk Jeon, Woo Up Kwon, Jin Woo Park, Kyoung Gu Woo, Eun Taek Lim, Kyung Hak Hyun, Dong Ho Jang
  • Patent number: 11662797
    Abstract: This application relates to techniques that adjust the sleep states of a computing device based on proximity detection and predicted user activity. Proximity detection procedures can be used to determine a proximity between the computing device and a remote computing device coupled to the user. Based on these proximity detection procedures, the computing device can either correspondingly increase or decrease the amount of power supplied to the various components during either a low-power sleep state or a high-power sleep state. Additionally, historical user activity data gathered on the computing device can be used to predict when the user will likely use the computing device. Based on the gathered historical user activity, deep sleep signals and light sleep signals can be issued at a time when the computing device is placed within a sleep state which can cause it to enter either a low-power sleep state or a high-power sleep state.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: May 30, 2023
    Inventors: Varaprasad V. Lingutla, Kartik R. Venkatraman, Marc J. Krochmal
  • Patent number: 11657801
    Abstract: Methods, systems, and apparatuses for predicting an end of a command in a voice recognition input are described herein. The system may receive data comprising a voice input. The system may receive a signal comprising a voice input. The system may detect, in the voice input, data that is associated with a first portion of a command. The system may predict, based on the first portion and while the voice input is being received, a second portion of the command. The prediction may be generated by a machine learning algorithm that is trained based at least in part on historical data comprising user input data. The system may cause execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input.
    Type: Grant
    Filed: February 27, 2019
    Date of Patent: May 23, 2023
    Assignee: Comcast Cable Communications, LLC
    Inventors: Rui Min, Hongcheng Wang
  • Patent number: 11651772
    Abstract: A system and method for improving the performance of a hands-free voice user interface system while minimizing the computational complexity without sacrificing performance, specifically when estimating the location of the talker for the purpose of steering a directional beam in the direction of the active talker. A hands-free voice user interface system requires a clean signal to be streamed to the cloud for recognition. One way to improve the speech signal is to estimate where the talker is and steer a beam in the direction of the active talker. To locate the talker to a localized position, a direction of arrival (DoA) estimator algorithm is used. DoA generally requires a noise- and echo-free signal for optimal estimation, but it is computationally expensive to run audio pre-processing such as acoustic echo cancellation for each microphone in a microphone array.
    Type: Grant
    Filed: January 28, 2022
    Date of Patent: May 16, 2023
    Assignee: DSP Concepts, Inc.
    Inventors: Ke Li, Paul Beckmann
  • Patent number: 11646012
    Abstract: An electronic apparatus which registers a device to a server by using a voice, and a method therefor are provided. The electronic apparatus includes a communication circuit, a microphone, a memory for storing computer executable instructions, and at least one processor configured to execute the computer executable instructions to acquire, from a voice received through the microphone, information on an external device which a user wishes to register, based on an external device corresponding to the acquired information being searched through the communication circuit, control the communication circuit to transmit information on an access point to the external device to enable the external device to communicate with a server, and control the communication circuit to transmit a registration request with respect to the external device to the server.
    Type: Grant
    Filed: August 19, 2022
    Date of Patent: May 9, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Taejun Kwon, Seongil Hahm, Seungsoo Kang
  • Patent number: 11646037
    Abstract: Systems, methods, and non-transitory computer-readable media can provide audio waveform data that corresponds to a voice sample to a temporal convolutional network for evaluation. The temporal convolutional network can pre-process the audio waveform data and can output an identity embedding associated with the audio waveform data. The identity embedding associated with the voice sample can be obtained from the temporal convolutional network. Information describing a speaker associated with the voice sample can be determined based at least in part on the identity embedding.
    Type: Grant
    Filed: December 8, 2020
    Date of Patent: May 9, 2023
    Assignee: OTO Systems Inc.
    Inventors: Valentin Alain Jean Perret, Nicolas Lucien Perony, Nándor Kedves
  • Patent number: 11646020
    Abstract: A method for managing electronic communication notifications includes, responsive to receiving a communication from a first user, identifying one or more keywords in the communication based on a plurality of keywords associated with a plurality of queries previously presented by a second user. The method further includes determining whether the communication includes a reply to a first open query, wherein the first open query represents a question previously presented by the second user directed to the first user. Responsive to determining that the communication from the first user includes the reply to the first open query, the method includes notifying the second user utilizing a first alert type for the communication from the first user that includes the reply for the first open query, wherein the first alert type is different from a second alert type for notifying the second user regarding a communication that does not include the reply for the first open query.
    Type: Grant
    Filed: January 24, 2020
    Date of Patent: May 9, 2023
    Assignee: International Business Machines Corporation
    Inventors: Priyansh Jaiswal, Peeyush Jaiswal
  • Patent number: 11631402
    Abstract: A method of detecting a replay attack comprises: receiving an audio signal representing speech; identifying speech content present in at least a portion of the audio signal; obtaining information about a frequency spectrum of each portion of the audio signal for which speech content is identified; and, for each portion of the audio signal for which speech content is identified: retrieving information about an expected frequency spectrum of the audio signal; comparing the frequency spectrum of portions of the audio signal for which speech content is identified with the respective expected frequency spectrum; and determining that the audio signal may result from a replay attack if a measure of a difference between the frequency spectrum of the portions of the audio signal for which speech content is identified and the respective expected frequency spectrum exceeds a threshold level.
    Type: Grant
    Filed: May 7, 2020
    Date of Patent: April 18, 2023
    Assignee: Cirrus Logic, Inc.
    Inventors: John Paul Lesso, César Alonso
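    Illustrative sketch (not taken from the patent): the comparison step in the abstract above can be pictured as measuring how far each speech portion's spectrum departs from its expected spectrum and flagging the signal when that measure exceeds a threshold. The band-energy representation, the mean absolute dB difference, the 6 dB default, and the function names are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def spectral_distance_db(measured, expected):
    """Mean absolute difference, in dB, between two lists of band energies.

    The distance measure is an assumption; the abstract only speaks of
    "a measure of a difference" between the two spectra.
    """
    if len(measured) != len(expected) or not measured:
        raise ValueError("spectra must be non-empty and of equal length")
    eps = 1e-12  # keeps the logarithm defined for silent bands
    diffs = [
        abs(10 * math.log10((m + eps) / (e + eps)))
        for m, e in zip(measured, expected)
    ]
    return sum(diffs) / len(diffs)


def may_be_replay(portions, threshold_db=6.0):
    """Return True if any speech portion deviates beyond the threshold.

    `portions` is an iterable of (measured_spectrum, expected_spectrum)
    pairs, one per portion in which speech content was identified.
    The 6 dB default is a hypothetical value.
    """
    return any(
        spectral_distance_db(measured, expected) > threshold_db
        for measured, expected in portions
    )
```

    For example, a portion whose low band arrives 20 dB below the expected level (as might happen through a small loudspeaker) would be flagged, while a portion matching its expected spectrum would not.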
  • Patent number: 11622271
    Abstract: Aspects of the present disclosure include methods, apparatuses, and computer readable media for controlling access including generating a random string or pseudorandom string, acoustically broadcasting a beacon message comprising the random string or pseudorandom string, acoustically receiving, in response to acoustically broadcasting the beacon message, an authentication message comprising a user identification and an authentication string, obtaining a password associated with the user identification, computing a verification string using the password and the random string or pseudorandom string, verifying the authentication string in the authentication message using the verification string, and transmitting, in response to successfully verifying the authentication string in the authentication message, an unlocking message to the access controlled point to unlock the access controlled point.
    Type: Grant
    Filed: February 11, 2020
    Date of Patent: April 4, 2023
    Assignee: Johnson Controls Tyco IP Holdings LLP
    Inventor: Rolando Herrero
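    Illustrative sketch (not taken from the patent): the exchange in the abstract above follows a challenge-response pattern, in which the verifier recomputes a string from the stored password and the broadcast random string and compares it with the received authentication string. The use of HMAC-SHA256, the in-memory password table, and every function name below are assumptions for illustration; the acoustic transport and the unlocking message are omitted.

```python
import hashlib
import hmac
import secrets


def make_challenge(nbytes=16):
    """Generate the random string that would be broadcast in the beacon."""
    return secrets.token_hex(nbytes)


def compute_response(password, challenge):
    """Derive a string from the password and the challenge.

    HMAC-SHA256 is an assumed choice; the abstract does not name a function.
    """
    return hmac.new(
        password.encode("utf-8"), challenge.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def verify(auth_message, challenge, password_db):
    """Check an authentication message of the form (user_id, auth_string).

    `password_db` is a hypothetical mapping from user id to password.
    """
    user_id, auth_string = auth_message
    password = password_db.get(user_id)
    if password is None:
        return False
    verification_string = compute_response(password, challenge)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(verification_string, auth_string)
```

    Because the challenge changes with every beacon, a recorded authentication message would not verify against a later challenge.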
  • Patent number: 11609947
    Abstract: A device may be configured to determine whether an audio file is a first type of audio file that is capable of being processed to recognize the voice query based on a characteristic of the audio file itself or a second type of audio file that may require speech recognition processing in order to recognize the voice query associated with the audio file. In determining whether the audio file is a first type of audio file or a second type of audio file, a query filter associated with the device may be configured to access one or more guidance queries. Using the one or more guidance queries, the device may classify the audio file as a first type of audio file or a second type of audio file based on receiving only a portion of the audio file, thereby improving the speed at which the audio file can be processed.
    Type: Grant
    Filed: October 21, 2019
    Date of Patent: March 21, 2023
    Assignee: Comcast Cable Communications, LLC
    Inventors: Rui Min, Hongcheng Wang
  • Patent number: 11605371
    Abstract: Embodiments of the present systems and methods may provide techniques for synthesizing speech in any voice in any language in any accent. For example, in an embodiment, a text-to-speech conversion system may comprise a text converter adapted to convert input text to at least one phoneme selected from a plurality of phonemes stored in memory, a machine-learning model storing voice patterns for a plurality of individuals and adapted to receive the at least one phoneme and an identity of a speaker and to generate acoustic features for each phoneme, and a decoder adapted to receive the generated acoustic features and to generate a speech signal simulating a voice of the identified speaker in a language.
    Type: Grant
    Filed: June 14, 2019
    Date of Patent: March 14, 2023
    Assignee: Georgetown University
    Inventors: Joe Garman, Ophir Frieder
  • Patent number: 11605388
    Abstract: This specification describes a computer-implemented method of generating speech audio for use in a video game, wherein the speech audio is generated using a voice convertor that has been trained to convert audio data for a source speaker into audio data for a target speaker. The method comprises receiving: (i) source speech audio, and (ii) a target speaker identifier. The source speech audio comprises speech content in the voice of a source speaker. Source acoustic features are determined for the source speech audio. A target speaker embedding associated with the target speaker identifier is generated as output of a speaker encoder of the voice convertor. The target speaker embedding and the source acoustic features are inputted into an acoustic feature encoder of the voice convertor. One or more acoustic feature encodings are generated as output of the acoustic feature encoder. The one or more acoustic feature encodings are derived from the target speaker embedding and the source acoustic features.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: March 14, 2023
    Assignee: Electronic Arts Inc.
    Inventors: Kilol Gupta, Dhaval Shah, Zahra Shakeri, Jervis Pinto, Mohsen Sardari, Harold Chaput, Navid Aghdaie, Kazi Zaman
  • Patent number: 11580501
    Abstract: A method and device for automatic meeting detection and analysis. A mobile electronic device includes multiple sensors configured to selectively capture sensor data. A classifier is configured to analyze the sensor data to detect a meeting zone for a meeting with multiple participants. A processor device is configured to control the multiple sensors and the classifier to trigger sensor data capture.
    Type: Grant
    Filed: December 8, 2015
    Date of Patent: February 14, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Kiran K. Rachuri, Jun Yang, Enamul Hoque, Evan Welbourne
  • Patent number: 11580599
    Abstract: In some examples, a computing device receives, from a point of sale (POS) device of a merchant, an authorization request for authorizing a payment instrument for a transaction for an item. The computing device sends an authorization approval to the POS device, and determines that a user is eligible for a loan for an amount and the loan is to be repaid at a particular frequency during a period of time. The computing device determines that an amount of the transaction is less than or equal to the amount, and causes an offer for the loan to be presented via a user interface of a payment application that is executable by a user device of the user. Based on acceptance of the loan, the computing device applies the loan funds to the transaction for the item, and the loan is repaid at the particular frequency.
    Type: Grant
    Filed: March 25, 2022
    Date of Patent: February 14, 2023
    Assignee: BLOCK, INC.
    Inventors: Varun Kerof, Elliot Block, Kelvin Chou, Theodore Kosev
  • Patent number: 11574640
    Abstract: Implementations set forth herein relate to an automated assistant that can be customized by a user to provide custom assistant responses to certain assistant queries, which may originate from other users. The user can establish certain custom assistant responses by providing an assistant response request to the automated assistant and/or responding to a request from the automated assistant to establish a particular custom assistant response. In some instances, a user can elect to establish a custom assistant response when the user determines or acknowledges that certain common queries are being submitted to the automated assistant—but the automated assistant is unable to resolve the common query. Establishing such custom assistant responses can therefore condense interactions between other users and the automated assistant.
    Type: Grant
    Filed: July 13, 2020
    Date of Patent: February 7, 2023
    Assignee: Google LLC
    Inventors: Victor Carbune, Matthew Sharifi
  • Patent number: 11574637
    Abstract: Techniques for using a federated learning framework to update machine learning models for spoken language understanding (SLU) system are described. The system determines which labeled data is needed to update the models based on the models generating an undesired response to an input. The system identifies users to solicit labeled data from, and sends a request to a user device to speak an input. The device generates labeled data using the spoken input, and updates the on-device models using the spoken input and the labeled data. The updated model data is provided to the system to enable the system to update the system-level (global) models.
    Type: Grant
    Filed: September 8, 2020
    Date of Patent: February 7, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Anoop Kumar, Anil K Ramakrishna, Sriram Venkatapathy, Rahul Gupta, Sankaranarayanan Ananthakrishnan, Premkumar Natarajan
  • Patent number: 11568879
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for verifying an identity of a user. The methods, systems, and apparatus include actions of receiving a request for a verification phrase for verifying an identity of a user. Additional actions include, in response to receiving the request for the verification phrase for verifying the identity of the user, identifying subwords to be included in the verification phrase and in response to identifying the subwords to be included in the verification phrase, obtaining a candidate phrase that includes at least some of the identified subwords as the verification phrase. Further actions include providing the verification phrase as a response to the request for the verification phrase for verifying the identity of the user.
    Type: Grant
    Filed: June 10, 2021
    Date of Patent: January 31, 2023
    Assignee: Google LLC
    Inventors: Dominik Roblek, Matthew Sharifi
  • Patent number: 11562748
    Abstract: Techniques are described herein for detecting and suppressing commands in media that may trigger another automated assistant. A method includes: determining, for each of a plurality of automated assistant devices in an environment that are each executing at least one automated assistant, an active capability of the automated assistant device; initiating playback of digital media by an automated assistant; in response to initiating playback, processing the digital media to identify an audio segment in the digital media that, upon playback, is expected to trigger activation of at least one automated assistant executing on at least one of the plurality of automated assistant devices in the environment, based on the active capability of the at least one of the plurality of automated assistant devices; and in response to identifying the audio segment in the digital media, modifying the digital media to suppress the activation of the at least one automated assistant.
    Type: Grant
    Filed: December 1, 2020
    Date of Patent: January 24, 2023
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11550992
    Abstract: A non-transitory computer-readable storage medium may include instructions stored thereon for propagating changes to copied text. When executed by at least one processor, the instructions may be configured to cause a computing system to at least present copied text within a user interface of the computing system, monitor the user interface for changes to the copied text, receive a change to the copied text, the change including replacing a first instance of a first word, within the copied text, with a first instance of a second word, and in response to receiving the change to the copied text, present a prompt to replace, within the copied text, a second instance of the first word with a second instance of the second word.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: January 10, 2023
    Assignee: GOOGLE LLC
    Inventors: Harold H. W. Kim, Alessandro Suraci, Nakul Kumar, Pritam Pebam, Tali Rosen Shoham, Arkady Zaifman
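    Illustrative sketch (not taken from the patent): the behavior in the abstract above can be pictured as detecting a single-word replacement in the copied text and then offering to apply the same replacement to the remaining instances. The whitespace tokenization, the prompt wording, and the function names are assumptions for illustration; punctuation attached to words is not handled.

```python
import re


def detect_single_word_change(before, after):
    """Return (old_word, new_word) if exactly one word was replaced, else None."""
    before_words, after_words = before.split(), after.split()
    if len(before_words) != len(after_words):
        return None
    changed = [(b, a) for b, a in zip(before_words, after_words) if b != a]
    return changed[0] if len(changed) == 1 else None


def remaining_instances(text, word):
    """Count whole-word occurrences of `word` still present in `text`."""
    return len(re.findall(r"\b" + re.escape(word) + r"\b", text))


def propose_propagation(before, after):
    """Build a prompt offering to replace the other instances, or None."""
    change = detect_single_word_change(before, after)
    if change is None:
        return None
    old_word, new_word = change
    count = remaining_instances(after, old_word)
    if count == 0:
        return None
    return f"Replace {count} more instance(s) of '{old_word}' with '{new_word}'?"
```

    With the copied text "the cat saw the cat" edited to "the dog saw the cat", the sketch would propose replacing the one remaining "cat" with "dog".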
  • Patent number: 11551700
    Abstract: Systems and methods for audio processing include capturing sound data via at least one microphone of a network microphone device (NMD) and determining whether the captured sound includes voice activity. While in a first stage, the NMD forgoes spatial processing of the captured sound data. If the NMD determines that the detected sound includes voice activity, the NMD transitions to a second stage. In this second stage, the NMD spatially processes the detected sound to produce filtered sound data and detects a wake word. After detecting the wake word, the NMD may determine an action to be performed based on the captured sound data.
    Type: Grant
    Filed: January 25, 2021
    Date of Patent: January 10, 2023
    Assignee: Sonos, Inc.
    Inventors: Aaron Jones, Saeed Bagheri Sereshki, Daniele Giacobello
  • Patent number: 11550542
    Abstract: An electronic device can implement a zero-latency digital assistant by capturing audio input from a microphone and using a first processor to write audio data representing the captured audio input to a memory buffer. In response to detecting a user input while capturing the audio input, the device can determine whether the user input meets predetermined criteria. If the user input meets the criteria, the device can use a second processor to identify and execute a task based on at least a portion of the contents of the memory buffer.
    Type: Grant
    Filed: August 16, 2021
    Date of Patent: January 10, 2023
    Assignee: Apple Inc.
    Inventors: William F. Stasior, David A. Carson, Rohit Dasari, Yoon Kim
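    Illustrative sketch (not taken from the patent): the buffering idea in the abstract above resembles a fixed-size ring buffer that is written continuously, with its contents handed off only when a qualifying user input arrives. The class name, the press-duration criterion, and the 0.5 second default are assumptions for illustration; the two processors are represented here by ordinary function calls.

```python
from collections import deque


class AudioRingBuffer:
    """Fixed-capacity buffer that keeps only the most recent audio frames."""

    def __init__(self, capacity_frames):
        # A deque with maxlen silently discards the oldest frame when full.
        self._frames = deque(maxlen=capacity_frames)

    def write(self, frame):
        """Called continuously by the capturing side (the 'first processor')."""
        self._frames.append(frame)

    def snapshot(self):
        """Copy of the buffered frames, oldest first."""
        return list(self._frames)


def handle_user_input(buffer, press_duration_s, min_press_s=0.5):
    """Hand buffered audio to the task side only if the input qualifies.

    The press-duration test stands in for the unspecified predetermined
    criteria. Returns the buffered frames, or None if the input is ignored.
    """
    if press_duration_s < min_press_s:
        return None
    return buffer.snapshot()
```

    Because audio is already in the buffer when the input is detected, speech that began slightly before the input could still be available to the task side.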
  • Patent number: 11544366
    Abstract: An information processing apparatus includes a processor configured to acquire a voice of a user, authenticate the user by using the voice, and recognize the voice, and display operation screens that are different depending on an authentication result of the user and a recognition result of the voice and are used for an operation of executing processing on a display unit.
    Type: Grant
    Filed: June 23, 2020
    Date of Patent: January 3, 2023
    Assignee: FUJIFILM Business Innovation Corp.
    Inventors: Ayaka Shinkawa, Hideki Sato
  • Patent number: 11518398
    Abstract: According to an embodiment, an agent system includes: a plurality of agent functions mounted on a plurality of different objects and configured to each provide a service which includes a service for causing an output to output a response by a voice in response to a speech of a user; and an information provider configured to include attribute information associated with the same kind of agent function in response content by the same kind of agent function and provide the attribute information to a portable mobile terminal of the user when the same kind of agent function is in the plurality of objects among the plurality of agent functions.
    Type: Grant
    Filed: March 17, 2020
    Date of Patent: December 6, 2022
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Hiroshi Honda, Toshikatsu Kuramochi, Yusuke Oi, Mototsugu Kubota
  • Patent number: 11516264
    Abstract: A question-and-answer application with an “ask-to-answer” feature is described. The ask-to-answer feature enables any user to solicit an answer to a question from another user. Upon soliciting another user for an answer to a particular question, a message with a call to action is directed to the solicited user. The message may include a copy of the text of the question and may provide a mechanism (e.g., a selectable user interface element) enabling the solicited user to pass on answering the question. Subsequent to the solicitation, the question page for the question will include a notification with information about the solicitation, including in some instances information identifying the user who has been asked to answer the question and the number of times the user has been asked to provide an answer to the question.
    Type: Grant
    Filed: September 28, 2021
    Date of Patent: November 29, 2022
    Assignee: QUORA, INC.
    Inventors: Adam Edward D'Angelo, Charles Duplain Cheever, Kevin G. Der, Rebekah Marie Cox