Voice Recognition Patents (Class 704/246)

Preliminary matching (Class 704/247)

Endpoint detection (Class 704/248)

Subportions (Class 704/249)

Specialized models (Class 704/250)

AI control device, server device connected to AI control device, and AI control method

Patent number: 12266369

Abstract: An AI control device, which identifies individual users from a plurality of users to receive input data, and is connectable to a server device that generates a trained model based on input data for each user, includes a control unit, and a communication unit connected to the server device. The control unit acquires input data, associates acquired input data and identifying information used to identify the user of the AI control device, and sends the data and information to the server device via the communication unit. The control unit uses the sent acquired input data to execute a trained model that is generated separately from trained models of other users by the server device, and that learns characteristics of acquired input data and detects input data having the same characteristics from unknown input data.

Type: Grant

Filed: March 19, 2020

Date of Patent: April 1, 2025

Assignee: TOA Corporation

Inventor: Yuma Kawai

Wireless microphone system and methods for synchronizing a wireless transmitter and a wireless receiver

Patent number: 12204811

Abstract: The present disclosure describes a wireless microphone system that allows one or more microphones to wirelessly communicate with a receiver. Additionally, the wireless microphone system may allow for a plurality of microphones to be used interchangeably with the receiver. To ensure communication between the receiver and the one or more microphones, the receiver may occasionally transmit a synchronization signal to the one or more microphones. In response to receiving the synchronization signal, at least one of the one or more microphones may determine that a clock of the at least one microphone is drifting from the master audio clock of the receiver. The at least one microphone may then adjust the microphone's audio clock to re-synchronize the audio clock of the microphone with the master audio clock of the receiver.

Type: Grant

Filed: February 16, 2023

Date of Patent: January 21, 2025

Assignee: Shure Acquisition Holdings, Inc.

Inventors: Simon John Beavis, Scott Kuhn, Mike Handler, William Kevin Doss, Jack Wong, Christopher Frantisak

Method and apparatus for generating general voice commands and augmented reality display

Patent number: 12190882

Abstract: The present disclosure relates to a method and an apparatus for generating general voice commands, and the method includes: obtaining View tree content of a display interface of an application; traversing information nodes in the View tree content, and configuring different voice commands for different information nodes based on attributes of the information nodes; and aggregating all voice commands in the display interface, and mixing and filtering the commands to form a final voice command set.

Type: Grant

Filed: November 30, 2021

Date of Patent: January 7, 2025

Assignee: HANGZHOU LINGBAN TECHNOLOGY CO. LTD.

Inventor: Weiming Liu

Proximity based identity modulation for an identity verification system

Patent number: 12192197

Abstract: An identity verification system receives a candidate signal transmitted from a computing device located near a secured asset in response to a target user requesting access to the secure asset. The identity verification system compares the candidate signal to a cluster of signals measured during past successful authentications by the computing device to determine a modulation factor and determines a match probability for the target user based on the determined modulation factor. The identity verification system grants the requesting target user access to the secured asset in response to determining that the match probability is greater than the operational security threshold.

Type: Grant

Filed: December 3, 2021

Date of Patent: January 7, 2025

Assignee: TruU, Inc.

Inventors: Lucas Allen Budman, Amitabh Agrawal, Oleksandr Rodak, Andrew Weber Spott

Speaker recognition in the call center

Patent number: 12175983

Abstract: Utterances of at least two speakers in a speech signal may be distinguished and the associated speaker identified by use of diarization together with automatic speech recognition of identifying words and phrases commonly in the speech signal. The diarization process clusters turns of the conversation while recognized special form phrases and entity names identify the speakers. A trained probabilistic model deduces which entity name(s) correspond to the clusters.

Type: Grant

Filed: February 8, 2024

Date of Patent: December 24, 2024

Assignee: Pindrop Security, Inc.

Inventors: Ellie Khoury, Matthew Garland

User authentication, for assistant action, using data from other device(s) in a shared environment

Patent number: 12154576

Abstract: Implementations set forth herein relate to an automated assistant that can solicit other devices for data that can assist with user authentication. User authentication can be streamlined for certain requests by removing a requirement that all authentication be performed at a single device and/or by a single application. For instance, the automated assistant can rely on data from other devices, which can indicate a degree to which a user is predicted to be present at a location of an assistant-enabled device. The automated assistant can process this data to make a determination regarding whether the user should be authenticated in response to an assistant input and/or pre-emptively before the user provides an assistant input. In some implementations, the automated assistant can perform one or more factors of authentication and utilize the data to verify the user in lieu of performing one or more other factors of authentication.

Type: Grant

Filed: January 11, 2022

Date of Patent: November 26, 2024

Assignee: GOOGLE LLC

Inventors: Matthew Sharifi, Victor Carbune

Cross-device voiceprint recognition

Patent number: 12142271

Abstract: According to an embodiment, an electronic device is provided. The electronic device includes: at least one processor; and a memory comprising instructions, which when executed, control the at least one processor to: receive a voice instruction of a user at the electronic device; transmit information regarding the voice instruction to a control device for identifying the user by mapping to a first voiceprint which is registered by another electronic device, a second voiceprint of the voice instruction of the user based on a voiceprint mapping model; and perform an operation corresponding to the voice instruction upon the identification of the user.

Type: Grant

Filed: December 30, 2019

Date of Patent: November 12, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Yongchao Wu, Jie Chen

Voice recognition device and voice recognition method

Patent number: 12131737

Abstract: A voice recognition device receives requests to control devices installed in a moving body based on instructions voiced by a user. The voice recognition device includes a speech acquisition unit, a speech data conversion unit, a control target device identification unit, a detection mode setting unit and a control request identification unit. The speech acquisition unit acquires speech. The speech data conversion unit converts the acquired speech into speech data. The control target device identification unit analyzes the speech data to identify the control target device. The detection mode setting unit sets a detection mode for identifying the control request corresponding to the speech data in accordance with the control target device. The control request identification unit analyzes the speech data to identify the control request with respect to the control target device, based on the set detection mode.

Type: Grant

Filed: March 19, 2020

Date of Patent: October 29, 2024

Assignee: Nissan Motor Co., Ltd.

Inventor: Mika Sugimoto

Determining device groups

Patent number: 12125483

Abstract: This disclosure describes, in part, techniques for determining device groupings, or clusters, for multiple voice-enabled devices. The device clusters may be determined based on metadata for audio signals (or audio data) generated by each of the multiple voice-enabled devices. For example, a remote system may analyze timestamp data for the audio signals received from the devices, and determine that the devices detected the same voice command of a user based on the timestamp data indicating that the audio signals were received within a threshold period of time from each other. Additionally, the remote system may analyze other metadata of the audio data, such as signal-to-noise (SNR) values, and determine that the SNR values are within a threshold value. The remote system may determine device clusters for the voice-enabled devices of a user based on these, and potentially other, types of metadata of the audio signals.

Type: Grant

Filed: October 1, 2021

Date of Patent: October 22, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Venkata Snehith Cherukuri, Joseph White, Vinodth Kumar Mohanam, Rami Habal, Menghan Li
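
The grouping idea in the abstract above (audio signals received within a threshold period of each other, with SNR values within a threshold value) can be illustrated with a minimal sketch. This is not the patented method: the `AudioEvent` record, the `group_devices` function, the greedy comparison against the first event of each cluster, and the default thresholds (0.5 s, 10 dB) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioEvent:
    """Hypothetical metadata record for one audio signal from one device."""
    device_id: str
    received_at: float  # arrival time in seconds
    snr_db: float       # signal-to-noise value in dB


def group_devices(events, max_time_gap=0.5, max_snr_gap=10.0):
    """Greedily cluster devices whose events look like the same utterance.

    An event joins an existing cluster when both its arrival time and its
    SNR value are within the thresholds of that cluster's first event.
    Threshold defaults are illustrative assumptions, not values from the source.
    """
    clusters = []
    for event in sorted(events, key=lambda e: e.received_at):
        for cluster in clusters:
            anchor = cluster[0]
            close_in_time = abs(event.received_at - anchor.received_at) <= max_time_gap
            close_in_snr = abs(event.snr_db - anchor.snr_db) <= max_snr_gap
            if close_in_time and close_in_snr:
                cluster.append(event)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([event])
    return [[e.device_id for e in cluster] for cluster in clusters]


events = [
    AudioEvent("kitchen", 10.00, 20.0),
    AudioEvent("living-room", 10.20, 15.0),
    AudioEvent("garage", 42.00, 18.0),
]
groups = group_devices(events)
```

Here the kitchen and living-room devices fall into one group because their events arrive 0.2 s apart with SNR values 5 dB apart, while the garage device forms its own group.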
Voice interaction wakeup electronic device, method and medium based on mouth-covering action recognition

Patent number: 12112756

Abstract: An interaction method triggered by a mouth-covering gesture and an intelligent electronic device are provided. The interaction method is applied to an intelligent electronic device arranged with a sensor. The intelligent electronic device includes a sensor system for capturing a signal of a user putting one hand on a mouth to make a mouth-covering gesture. The interaction method includes: processing the signal to determine whether the user puts the hand on the mouth to make the mouth-covering gesture; and in a case that the user puts the hand on the mouth to make the mouth-covering gesture, determining a mouth-covering gesture input mode as an input mode for controlling an interaction to trigger a control command or trigger another input mode, by executing a program on the intelligent electronic device.

Type: Grant

Filed: May 26, 2020

Date of Patent: October 8, 2024

Assignee: TSINGHUA UNIVERSITY

Inventors: Chun Yu, Yuanchun Shi

Machine learning multiple features of depicted item

Patent number: 12093305

Abstract: Machine learning multiple features of an item depicted in images. Upon accessing multiple images that depict the item, a neural network is used to machine train on the plurality of images to generate embedding vectors for each of multiple features of the item. For each of multiple features of the item depicted in the images, in each iteration of the machine learning, the embedding vector is converted into a probability vector that represents probabilities that the feature has respective values. That probability vector is then compared with a value vector representing the actual value of that feature in the depicted item, and an error between the two vectors is determined. That error is used to adjust parameters of the neural network used to generate the embedding vector, allowing for the next iteration in the generation of the embedding vectors. These iterative changes continue thereby training the neural network.

Type: Grant

Filed: June 19, 2023

Date of Patent: September 17, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Oren Barkan, Noam Razin, Noam Koenigstein, Roy Hirsch, Nir Nice

Vehicle control system and vehicle control method

Patent number: 12087295

Abstract: A vehicle control system includes a portable terminal and a vehicle on-board device that carries out wireless communication with the portable terminal. The vehicle on-board device includes a first voice recognition unit that recognizes a voice and a control unit that executes control of a vehicle according to the voice recognized at the first voice recognition unit. The portable terminal includes a second voice recognition unit that recognizes a voice, a specification unit that specifies control of the vehicle according to the voice recognized at the second voice recognition unit, an instruction unit that instructs the control unit to execute the specified control of the vehicle, and a notification unit that, in a case in which the control of the vehicle according to the voice recognized at the second voice recognition unit cannot be specified at the specification unit, performs notification according to at least an operation state of the first voice recognition unit.

Type: Grant

Filed: February 3, 2022

Date of Patent: September 10, 2024

Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA

Inventor: Ryuta Atsumi

Systems and methods for contactless authentication using voice recognition

Patent number: 12014740

Abstract: Systems and methods for contactless authorization using voice recognition are disclosed. The system may include one or more memory units storing instructions and one or more processors configured to execute the instructions to perform operations. The operations may include receiving user data comprising a user identifier, an audio data having a first data format, and a client device identifier. The operations may include generating a processed audio data based on the received audio data. The processed audio data may have a second data format. The operations may include transmitting, to a speech module, the processed audio data. The operations may include receiving from the speech module, a voice match result. In some embodiments, the operations include authenticating a user based on the voice match result and transmitting, to a client device associated with the client device identifier, a client notification comprising a result of the authentication.

Type: Grant

Filed: March 16, 2020

Date of Patent: June 18, 2024

Assignee: FIDELITY INFORMATION SERVICES, LLC

Inventors: Raghavendra Pratap Singh, Vijayendra Virendra Mishra

Language model adaptation

Patent number: 12014726

Abstract: Exemplary embodiments relate to adapting a generic language model during runtime using domain-specific language model data. The system performs an audio frame-level analysis, to determine if the utterance corresponds to a particular domain and whether the ASR hypothesis needs to be rescored. The system processes, using a trained classifier, the ASR hypothesis (a partial hypothesis) generated for the audio data processed so far. The system determines whether to rescore the hypothesis after every few audio frames (representing a word in the utterance) are processed by the speech recognition system.

Type: Grant

Filed: March 28, 2022

Date of Patent: June 18, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Ankur Gandhe, Ariya Rastrow, Roland Maximilian Rolf Maas, Bjorn Hoffmeister

Voice command system and voice command method

Patent number: 12002467

Abstract: A voice command system according to a first disclosure comprises a gateway apparatus having an interface configured to receive a voice command, and a controller configured to perform a registration process of registering a speaker permitted to receive the voice command. The controller is configured to perform an authentication process of rejecting a reception of the voice command when a speaker of the voice command is not registered, and permitting a reception of the voice command when a speaker of the voice command is registered. The controller is configured to perform the authentication process for each voice command.

Type: Grant

Filed: December 5, 2022

Date of Patent: June 4, 2024

Assignee: KYOCERA CORPORATION

Inventor: Yumiko Yamamoto
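
The register-then-authenticate flow in the abstract above can be sketched in a few lines. This is an illustrative assumption, not the patented design: the `VoiceCommandGateway` class and its method names are hypothetical, and speaker identification (mapping audio to a `speaker_id`) is assumed to happen upstream.

```python
class VoiceCommandGateway:
    """Hypothetical gateway that accepts commands only from registered speakers."""

    def __init__(self):
        self._registered = set()

    def register(self, speaker_id):
        """Registration process: permit this speaker to issue voice commands."""
        self._registered.add(speaker_id)

    def receive(self, speaker_id, command):
        """Authentication process, run for every command.

        Returns the command if the speaker is registered, otherwise None
        to signal that reception was rejected.
        """
        if speaker_id not in self._registered:
            return None
        return command


gateway = VoiceCommandGateway()
gateway.register("alice")
accepted = gateway.receive("alice", "turn on the lights")
rejected = gateway.receive("mallory", "unlock the door")
```

Because the check runs inside `receive`, it is repeated for each voice command rather than once per session, which mirrors the per-command authentication the abstract describes.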
Input/output privacy tool

Patent number: 12001536

Abstract: Various examples described herein are directed to systems and methods for managing an interface between a user and a user computing device. The user computing device may determine that an audio sensor in communication with the user computing device indicates a first command in a user voice of the user, where the first command instructs the user computing device to perform a first task. The user computing device may determine that the audio sensor also indicates a first ambient voice different than the user voice and match the first ambient voice to a first known voice. The user computing device may determine that a second computing device associated with the first known voice is within a threshold distance of the user computing device and select a first privacy level for the first task based at least in part on the first known voice.

Type: Grant

Filed: June 10, 2022

Date of Patent: June 4, 2024

Assignee: Wells Fargo Bank, N.A.

Inventors: Tambra Nichols, Teresa Lynn Rench, Jonathan Austin Hartsell, John C. Brenner, Christopher James Williams

Detecting and suppressing commands in media that may trigger another automated assistant

Patent number: 11972766

Abstract: Techniques are described herein for detecting and suppressing commands in media that may trigger another automated assistant. A method includes: determining, for each of a plurality of automated assistant devices in an environment that are each executing at least one automated assistant, an active capability of the automated assistant device; initiating playback of digital media by an automated assistant; in response to initiating playback, processing the digital media to identify an audio segment in the digital media that, upon playback, is expected to trigger activation of at least one automated assistant executing on at least one of the plurality of automated assistant devices in the environment, based on the active capability of the at least one of the plurality of automated assistant devices; and in response to identifying the audio segment in the digital media, modifying the digital media to suppress the activation of the at least one automated assistant.

Type: Grant

Filed: January 23, 2023

Date of Patent: April 30, 2024

Assignee: GOOGLE LLC

Inventors: Matthew Sharifi, Victor Carbune

Fraud detection and analysis

Patent number: 11961095

Abstract: Systems and methods generate a risk score for an account event. The systems and methods automatically generate a causal model corresponding to a user, wherein the model estimates components of the causal model using event parameters of a previous event undertaken by the user in an account of the user. The systems and methods predict expected behavior of the user during a next event in the account using the causal model. Predicting the expected behavior of the user includes generating expected event parameters of the next event. The systems and methods use a predictive fraud model to generate fraud event parameters. Generation of the fraud event parameters assumes a fraudster is conducting the next event, wherein the fraudster is any person other than the user. The systems and methods generate a risk score of the next event to indicate the relative likelihood the future event is performed by the user.

Type: Grant

Filed: June 2, 2021

Date of Patent: April 16, 2024

Assignee: GUARDIAN ANALYTICS, INC.

Inventor: Tom Miltonberger

Speaker verification using co-location information

Patent number: 11942095

Abstract: A computer-implemented method that includes receiving audio data corresponding to an utterance of a voice command captured by a user device. The user device has a plurality of different users. The method includes determining a particular user among the plurality of different users of the user device as a speaker of the utterance based on a comparison between the audio data and corresponding speaker verification data stored on memory hardware for each user of the plurality of different users of the user device. The method further includes, based on determining the particular user among the plurality of different users of the user device as the speaker of the utterance, providing, for output from the user device, a message comprising a speaker identifier associated with the particular user.

Type: Grant

Filed: May 1, 2023

Date of Patent: March 26, 2024

Assignee: Google LLC

Inventors: Raziel Alvarez Guevara, Othar Hansson

Method and system for user voice identification using ensembled deep learning algorithms

Patent number: 11929078

Abstract: Certain embodiments of the present disclosure provide techniques for training a user detection model to identify a user of a software application based on voice recognition. The method generally includes receiving a data set including a plurality of voice interactions with users of a software application. For each respective recording in the data set, a spectrogram representation is generated based on the respective recording. A plurality of voice recognition models are trained. Each of the plurality of voice recognition models is trained based on the spectrogram representation for each of the plurality of voice recordings in the data set. The plurality of voice recognition models are deployed to an interactive voice response system.

Type: Grant

Filed: February 23, 2021

Date of Patent: March 12, 2024

Assignee: Intuit, Inc.

Inventors: Shanshan Tuo, Divya Beeram, Meng Chen, Neo Yuchen, Wan Yu Zhang, Nivethitha Kumar, Kavita Sundar, Tomer Tal

Digital assistant hardware abstraction

Patent number: 11924254

Abstract: This relates to intelligent automated assistants and, more specifically, to intelligent context sharing and task performance among a collection of devices with intelligent automated assistant capabilities. An example method includes, at a first electronic device participating in a context-sharing group associated with a first location: receiving a user voice input; receiving, from a context collector, an aggregate context of the context-sharing group; providing at least a portion of the aggregate context and data corresponding to the user voice input to a remote device; receiving, from the remote device, a command to perform one or more tasks and a device identifier corresponding to a second electronic device; and transmitting the command to the second electronic device based on the device identifier, wherein the command causes the second electronic device to perform the one or more tasks.

Type: Grant

Filed: May 3, 2021

Date of Patent: March 5, 2024

Assignee: Apple Inc.

Inventors: Bryan Hansen, Nikrouz Ghotbi, Yifeng Gui, Xinyuan Huang, Benjamin S. Phipps, Eugene Ray, Mahesh Ramaray Shanbhag, Jaireh Tecarro, Sumit Wattal

User-assigned custom assistant responses to queries being submitted by another user

Patent number: 11922952

Abstract: Implementations set forth herein relate to an automated assistant that can be customized by a user to provide custom assistant responses to certain assistant queries, which may originate from other users. The user can establish certain custom assistant responses by providing an assistant response request to the automated assistant and/or responding to a request from the automated assistant to establish a particular custom assistant response. In some instances, a user can elect to establish a custom assistant response when the user determines or acknowledges that certain common queries are being submitted to the automated assistant—but the automated assistant is unable to resolve the common query. Establishing such custom assistant responses can therefore condense interactions between other users and the automated assistant.

Type: Grant

Filed: February 6, 2023

Date of Patent: March 5, 2024

Assignee: GOOGLE LLC

Inventors: Victor Carbune, Matthew Sharifi

Systems and methods for audio signal generation

Patent number: 11902759

Abstract: The present disclosure provides systems and methods for audio signal generation. The method may include obtaining first audio data collected by a bone conduction sensor; and obtaining second audio data collected by an air conduction sensor, the first audio data and the second audio data representing a speech of a user, with differing frequency component. The method may also include generating, based on the first audio data and the second audio data, third audio data, wherein frequency components of the third audio data higher than a frequency point increase with respect to frequency components of the first audio data higher than the first frequency point. In some embodiments, the method may further include determining, based on the third audio data, target audio data representing the speech of the user with better fidelity than the first audio data and the second audio data.

Type: Grant

Filed: January 29, 2022

Date of Patent: February 13, 2024

Assignee: SHENZHEN SHOKZ CO., LTD.

Inventors: Meilin Zhou, Fengyun Liao, Xin Qi

End-to-end multi-speaker audio-visual automatic speech recognition

Patent number: 11900919

Abstract: An audio-visual automated speech recognition model for transcribing speech from audio-visual data includes an encoder frontend and a decoder. The encoder includes an attention mechanism configured to receive an audio track of the audio-visual data and a video portion of the audio-visual data. The video portion of the audio-visual data includes a plurality of video face tracks each associated with a face of a respective person. For each video face track of the plurality of video face tracks, the attention mechanism is configured to determine a confidence score indicating a likelihood that the face of the respective person associated with the video face track includes a speaking face of the audio track. The decoder is configured to process the audio track and the video face track of the plurality of video face tracks associated with the highest confidence score to determine a speech recognition result of the audio track.

Type: Grant

Filed: March 21, 2023

Date of Patent: February 13, 2024

Assignee: Google LLC

Inventor: Otavio Braga

Call audio mixing processing

Patent number: 11887618

Abstract: A call audio mixing processing method is provided. In the method, call audio streams from terminals of call members participating in a call are obtained. Voice analysis is performed on the call audio streams to determine voice activity corresponding to each of the terminals. The voice activity of the terminals indicates activity levels of the call members participating in the call. According to the voice activity of the terminals, respective voice adjustment parameters corresponding to the terminals are determined. According to the respective voice adjustment parameters corresponding to the terminals, the call audio streams of the terminals are adjusted. Further, mixing processing is performed on the adjusted call audio streams to obtain a mixed audio stream.

Type: Grant

Filed: April 18, 2022

Date of Patent: January 30, 2024

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventor: Junbin Liang
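
The pipeline in the abstract above (measure voice activity per terminal, derive an adjustment parameter, adjust each stream, then mix) can be sketched as follows. The specifics are assumptions for illustration only: mean absolute amplitude stands in for voice analysis, the adjustment parameter is a gain proportional to relative activity, and the names `voice_activity` and `mix_call_audio` are hypothetical.

```python
def voice_activity(samples):
    """Crude activity level: mean absolute amplitude (an assumed stand-in)."""
    return sum(abs(s) for s in samples) / len(samples) if samples else 0.0


def mix_call_audio(streams):
    """Mix per-terminal streams after weighting each by its voice activity.

    `streams` maps a terminal id to a list of float samples. Returns the
    per-terminal gains and the mixed stream. The most active terminal gets
    gain 1.0 and the others are scaled down proportionally.
    """
    activity = {tid: voice_activity(s) for tid, s in streams.items()}
    peak = max(activity.values(), default=0.0)
    gains = {tid: (a / peak if peak > 0 else 0.0) for tid, a in activity.items()}

    length = max((len(s) for s in streams.values()), default=0)
    mixed = [0.0] * length
    for tid, samples in streams.items():
        for i, sample in enumerate(samples):
            mixed[i] += gains[tid] * sample  # adjust, then accumulate
    return gains, mixed


gains, mixed = mix_call_audio({
    "speaker": [0.5, -0.5],
    "silent": [0.0, 0.0],
})
```

A real mixer would also smooth gains over time and guard against clipping; those steps are omitted here to keep the sketch short.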
Apparatus and method for processing an audio input recording to obtain a processed audio recording to address privacy issues

Patent number: 11887587

Abstract: An apparatus for processing an audio input recording to obtain a processed audio recording according to an embodiment is provided. The apparatus comprises an input interface (110) for receiving a plurality of audio input portions of the audio input recording. Moreover, the apparatus comprises a processor (120) for processing a plurality of audio input portions of the audio input recording to obtain a processed audio recording. The processor (120) is configured to determine, whether or not an audio input portion of the plurality of audio input portions comprises speech. If the processor (120) has detected that the audio input portion comprises speech, the processor (120) is configured to generate the processed audio recording by modifying the audio input portion to obtain a modified audio portion, and by generating the processed audio recording such that the processed audio recording comprises the modified audio portion instead of the audio input portion.

Type: Grant

Filed: April 14, 2021

Date of Patent: January 30, 2024

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Jan Rennies-Hochmuth, Danilo Hollosi, Christian Rollwage, Jens-Ekkehart Appell

System and method for binding verifiable claims

Patent number: 11831409

Abstract: A system, apparatus, method, and machine readable medium are described for binding verifiable claims. For example, one embodiment of a system comprises: a client device; an authenticator of the client device to securely store authentication data including one or more verifiable claims received from one or more claim providers, each verifiable claim having attributes associated therewith; and claim/attribute processing logic to generate a first verifiable claim binding for a first verifiable claim issued by the claim provider; wherein the authenticator is to transmit a first signature assertion to a first relying party to authenticate with the first relying party, the first signature assertion including an attribute extension containing data associated with the first verifiable claim binding.

Type: Grant

Filed: January 10, 2019

Date of Patent: November 28, 2023

Assignee: NOK NOK LABS, INC.

Inventor: Rolf Lindemann

Secure transaction utilizing bone conductive characteristic

Patent number: 11823193

Abstract: Methods, computer program products, and systems are provided for securely performing electronic transactions over a network. An acoustic wave signal is received. The acoustic wave signal is encoded with at least an identifier of a payer of a transaction and transaction information of the transaction. The identifier of the payer and the transaction information is then obtained from the acoustic wave signal. A bone conduction characteristic of the payer is retrieved based on the identifier of the payer and based on the received bone conduction characteristic of the payer, the payer is verified. Completion of the transaction is allowed based on the transaction information in response to the payer being verified.

Type: Grant

Filed: July 9, 2018

Date of Patent: November 21, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ting Yin, Dong Chen, Ting Ting BJ Zhan, Xiang Juan Meng, Yin Xia

Guest access for voice control of playback devices

Patent number: 11790920

Abstract: Playback devices comprising a network interface, an optional speaker(s), and one or more processors are disclosed herein. In some embodiments, the playback device is configured to communicate with a computing system that stores configuration data corresponding to each of a plurality of users. The playback device detects one or more users near the playback device and retrieves user configuration data corresponding to each of the one or more detected users, and thereafter, uses the user configuration data of the one or more detected users to process voice commands, play media content, and/or perform other voice and/or media related functions.

Type: Grant

Filed: July 18, 2022

Date of Patent: October 17, 2023

Assignee: Sonos, Inc.

Inventor: Paul Andrew Bates

Cross-assistant command processing

Patent number: 11783824

Abstract: A speech-processing system may provide access to one or more virtual assistants via an audio-controlled device. A user may leverage a first virtual assistant to translate a natural language command from a first language into a second language, which the device can send to a second virtual assistant for processing. The device may receive a command from a user and send input data representing the command to a first speech-processing system representing the first virtual assistant. The device may receive a response in the form of a first natural language output from the first speech-processing system along with an indication that the first natural language output should be directed to a second speech-processing system representing the second virtual assistant. For example, the command may be in the first language, and the first natural language output may be in the second language, which is understandable by the second speech-processing system.

Type: Grant

Filed: February 5, 2021

Date of Patent: October 10, 2023

Assignee: Amazon Technologies, Inc.

Inventor: Robert John Mars
Systems and methods for power-efficient keyword detection

Patent number: 11769511

Abstract: Systems and methods for audio processing include capturing first sound data via at least one microphone of a network microphone device (NMD) and determining, via a voice activity detection process, that the first sound data does not include voice activity. The first sound data is stored in a buffer, and the NMD forgoes spatial processing of the first sound data. The NMD can capture second sound data and determine, via the voice activity process, that the second sound data includes voice activity. The NMD spatially processes the sound data to produce filtered sound data. The NMD detects a wake word based on data in the buffer. After detecting the wake word, the NMD may determine an action to be performed based on the data in the buffer.

Type: Grant

Filed: November 30, 2022

Date of Patent: September 26, 2023

Assignee: Sonos, Inc.

Inventors: Aaron Jones, Saeed Bagheri Sereshki, Daniele Giacobello
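
The gating idea in the abstract above (patent 11769511) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the energy-based `has_voice` check, the pass-through `spatially_process` placeholder, the `GatedFrontEnd` class, and the choice to store the filtered frame in the same buffer are all assumptions made for the example.

```python
from collections import deque
from typing import Deque, List, Sequence


def has_voice(frame: Sequence[float], energy_threshold: float = 0.01) -> bool:
    """Hypothetical voice activity check: mean squared amplitude above a threshold."""
    if not frame:
        return False
    return sum(s * s for s in frame) / len(frame) > energy_threshold


def spatially_process(frame: Sequence[float]) -> List[float]:
    """Placeholder for the expensive spatial step (e.g. beamforming); returns a copy."""
    return list(frame)


class GatedFrontEnd:
    """Buffers every frame, but runs spatial processing only when voice is detected."""

    def __init__(self, buffer_frames: int = 50) -> None:
        self.buffer: Deque[List[float]] = deque(maxlen=buffer_frames)
        self.spatial_calls = 0  # counts how often the expensive step ran

    def push(self, frame: Sequence[float]) -> None:
        if has_voice(frame):
            self.spatial_calls += 1
            self.buffer.append(spatially_process(frame))
        else:
            # No voice activity: store the raw frame and forgo spatial processing.
            self.buffer.append(list(frame))
```

A wake-word detector would then read from `buffer`; that stage is left out here because the abstract does not describe it.
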
Voice analysis system

Patent number: 11756536

Abstract: [Object] To provide a highly accurate voice analysis system. [Solution] A voice analysis system 1 includes a first voice analysis terminal 3 and a second voice analysis terminal 5. The first voice analysis terminal 3 includes a first term analysis unit 7 that obtains first conversation information, a first conversation storage unit 9 that stores the first conversation information, a first analysis unit 11 that analyzes the first conversation information, a presentation storage unit 13, a related term storage unit 15, a display unit 17, a topic word storage unit 19, and a conversation information reception unit 25 that receives second conversation information from the second voice analysis terminal 5. The second voice analysis terminal 5 includes a second term analysis unit 21 that obtains the second conversation information and a second conversation storage unit 23.

Type: Grant

Filed: December 15, 2020

Date of Patent: September 12, 2023

Assignee: Interactive Solutions Corp.

Inventor: Kiyoshi Sekine
Attribute identification method, and program

Patent number: 11756554

Abstract: An attribute identification technology that can reject an attribute identification result if the reliability thereof is low is provided. An attribute identification device includes: a posteriori probability calculation unit 110 that calculates, from input speech, a posteriori probability sequence {q(c, i)} which is a sequence of the posteriori probabilities q(c, i) that a frame i of the input speech is a class c; a reliability calculation unit 120 that calculates, from the posteriori probability sequence {q(c, i)}, reliability r(c) indicating the extent to which the class c is a correct attribute identification result; and an attribute identification result generating unit 130 that generates an attribute identification result L of the input speech from the posteriori probability sequence {q(c, i)} and the reliability r(c).

Type: Grant

Filed: August 23, 2021

Date of Patent: September 12, 2023

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Hosana Kamiyama, Satoshi Kobashikawa, Atsushi Ando
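
The reject-on-low-reliability flow in the abstract above (patent 11756554) might look like the sketch below. The abstract does not say how r(c) is computed, so the mean posterior over frames is used here purely as an assumed stand-in; the function name `identify_attribute` and the threshold value are likewise hypothetical.

```python
from typing import Dict, List, Optional


def identify_attribute(
    posteriors: List[Dict[str, float]],
    threshold: float = 0.6,
) -> Optional[str]:
    """Return the most reliable class, or None to reject the result.

    posteriors[i][c] plays the role of q(c, i): the posterior that frame i
    belongs to class c.
    """
    if not posteriors:
        return None
    classes = list(posteriors[0].keys())
    # Assumed reliability r(c): the mean posterior of class c over all frames.
    reliability = {
        c: sum(frame[c] for frame in posteriors) / len(posteriors)
        for c in classes
    }
    best = max(reliability, key=reliability.get)
    # Reject the identification result when its reliability is too low.
    return best if reliability[best] >= threshold else None
```

Returning `None` stands in for the "rejected" outcome; a real system could instead return a dedicated label.
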
Machine learning techniques for enhancing video conferencing applications

Patent number: 11748845

Abstract: Systems, processes, and techniques to automatically detect and enlarge a speaking one of a plurality of participants on one side of a video conference. In at least one embodiment, the speaking participant is identified using one or more heuristics and/or one or more neural networks.

Type: Grant

Filed: January 27, 2021

Date of Patent: September 5, 2023

Assignee: NVIDIA Corporation

Inventors: Akarsh Umesh Zingade, Jianyuan Min, Shuye Han, Rochelle Pereira
System and method for detecting fraud rings

Patent number: 11735188

Abstract: A system and method may identify a fraud ring based on call or interaction data by: analyzing, by a computer processor, interaction data including audio recordings to identify clusters of interactions which are suspected of involving fraud, each cluster including the same speaker; analyzing, by the computer processor, the clusters, in combination with metadata associated with the interaction data, to identify fraud rings, each fraud ring describing a plurality of different speakers, each fraud ring defined by a set of speakers and a set of metadata corresponding to interactions including that speaker; and, for each fraud ring, creating a relevance value defining the relative relevance of the fraud ring.

Type: Grant

Filed: September 12, 2022

Date of Patent: August 22, 2023

Assignee: Nice Ltd.

Inventors: Matan Keret, Anat Malin, Natan Katz, Shunit Metz, Sigal Lev, Jeremy Hoyland
Method and apparatus for multi-channel secure communication and data transfer

Patent number: 11727128

Abstract: A method, apparatus and computer program product are disclosed to provide for the selective establishment and use of secure communication channels to facilitate the exchange of data objects containing potentially sensitive information in a network environment. In some example implementations, upon detection that the processing of a network entity request implicates the exchange of non-public information amongst one or more other network entities, one or more secure communication channels are established between a secure transfer system and the relevant network entities such that non-public information neither passes to nor resides on system components associated with non-secure network entities.

Type: Grant

Filed: August 20, 2021

Date of Patent: August 15, 2023

Assignee: PAYMENTUS CORPORATION

Inventor: Dushyant Sharma
Multi-modal interaction with intelligent assistants in voice command devices

Patent number: 11721342

Abstract: A method comprising detecting an activation of an intelligent assistant on an electronic device, waking up the intelligent assistant from a sleep mode in response to the activation, and determining an amount of vocabulary the intelligent assistant acts upon during a listening mode based on a type of the activation.

Type: Grant

Filed: September 22, 2022

Date of Patent: August 8, 2023

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Jeffrey C. Olson, Henry N. Holtzman, Jean-David Hsu, Jeffrey A. Morgan
Multi-user authentication on a device

Patent number: 11721326

Abstract: In some implementations, processor(s) can receive an utterance from a speaker, and determine whether the speaker is a known user of a user device or not a known user of the user device. The user device can be shared by a plurality of known users. Further, the processor(s) can determine whether the utterance corresponds to a personal request or non-personal request. Moreover, in response to determining that the speaker is not a known user of the user device and in response to determining that the utterance corresponds to a non-personal request, the processor(s) can cause a response to the utterance to be provided for presentation to the speaker at the user device responsive to the utterance, or can cause an action to be performed by the user device responsive to the utterance.

Type: Grant

Filed: January 26, 2022

Date of Patent: August 8, 2023

Assignee: GOOGLE LLC

Inventors: Meltem Oktem, Taral Pradeep Joglekar, Fnu Heryandi, Pu-sen Chao, Ignacio Lopez Moreno, Salil Rajadhyaksha, Alexander H. Gruenstein, Diego Melendo Casado
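
The decision described in the abstract above (patent 11721326) reduces to two boolean checks. In the sketch below, only the branch for an unknown speaker making a non-personal request comes from the abstract; how the other combinations are handled, the function name `handle_utterance`, and the `"respond"` / `"decline"` labels are assumptions added for illustration.

```python
def handle_utterance(speaker_is_known: bool, request_is_personal: bool) -> str:
    """Decide whether a shared device should act on an utterance."""
    if not request_is_personal:
        # From the abstract: a non-personal request is served even when the
        # speaker is not a known user of the device.
        return "respond"
    # Assumption: personal requests are served only for known users.
    return "respond" if speaker_is_known else "decline"
```
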
Machine learning multiple features of depicted item

Patent number: 11720622

Abstract: Machine learning multiple features of an item depicted in images. Upon accessing multiple images that depict the item, a neural network is used to machine train on the plurality of images to generate embedding vectors for each of multiple features of the item. For each of multiple features of the item depicted in the images, in each iteration of the machine learning, the embedding vector is converted into a probability vector that represents probabilities that the feature has respective values. That probability vector is then compared with a value vector representing the actual value of that feature in the depicted item, and an error between the two vectors is determined. That error is used to adjust parameters of the neural network used to generate the embedding vector, allowing for the next iteration in the generation of the embedding vectors. These iterative changes continue thereby training the neural network.

Type: Grant

Filed: June 9, 2022

Date of Patent: August 8, 2023

Inventors: Oren Barkan, Noam Razin, Noam Koenigstein, Roy Hirsch, Nir Nice
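
One training-loop step from the abstract above (patent 11720622), turning an embedding into a probability vector and measuring its error against a value vector, can be illustrated as follows. The abstract does not name the conversion or the error measure, so softmax and cross-entropy are assumed here as common choices, and the embedding is assumed to have one entry per possible feature value.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities: List[float], value_vector: List[float]) -> float:
    """Error between a probability vector and a one-hot value vector."""
    return -sum(
        t * math.log(p)
        for p, t in zip(probabilities, value_vector)
        if t > 0
    )
```

In a full training loop this error would be backpropagated to adjust the network parameters; that part depends on the framework and is omitted.
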
Contextual content for voice user interfaces

Patent number: 11699441

Abstract: The present disclosure describes techniques for dynamically determining when information is to be output to a user, as well as what information is to be output to a user. A natural language processing system may receive, from a first device, first data representing information to be output at a first point during a skill session. The natural language processing system may also receive, from a second device, second data representing a natural language input. The natural language processing system may determine a skill component is to execute with respect to the natural language input. The natural language processing system may send, to the skill component, second data representing the natural language input. The natural language processing system may receive, from the skill component, an indication that an ongoing first skill session with the second device has reached the first point.

Type: Grant

Filed: January 11, 2022

Date of Patent: July 11, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Mark Conrad Kockerbeck, Muhammad Yahia, Jordan Michael Hughes, Kevin Boehm, Rohit Sauhta
System and method for performing voice biometrics analysis

Patent number: 11675885

Abstract: In a system and method for audio analysis in a cloud-based computerized system, an authentication (RTA) manager micro-service may send an audio packet to a voice processor micro-service. The voice processor may extract features of the audio. The RTA manager may obtain the extracted features from the voice processor; calculate, based on the extracted features, a quality grade of the audio packet; and send the extracted features to at least one voice biometrics engine if the quality grade is above a threshold. Each of the at least one voice biometrics engines may be configured to generate a voiceprint of the audio packet, based on the extracted features of the audio packet, and to perform at least one of: authenticate a speaker, detect fraudsters, and enrich a previously stored voiceprint of the speaker with the voiceprint of the audio packet.

Type: Grant

Filed: September 27, 2021

Date of Patent: June 13, 2023

Assignee: Nice Ltd.

Inventors: Matan Keret, William Mark Finlay, Peter S Cardillo
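
The quality gate in the abstract above (patent 11675885) can be sketched as a simple routing function. The grade formula (share of voiced frames), the feature keys `voiced_frames` and `total_frames`, and the names `quality_grade` and `route_features` are hypothetical; the abstract only states that features are forwarded when the grade is above a threshold.

```python
from typing import Callable, Dict, List


def quality_grade(features: Dict[str, int]) -> float:
    """Hypothetical grade: the share of frames that contain speech."""
    total = features["total_frames"]
    return features["voiced_frames"] / total if total else 0.0


def route_features(
    features: Dict[str, int],
    engines: List[Callable[[Dict[str, int]], str]],
    threshold: float = 0.5,
) -> List[str]:
    """Forward features to every biometrics engine only if quality is high enough."""
    if quality_grade(features) <= threshold:
        return []  # low-quality packet: no engine is invoked
    return [engine(features) for engine in engines]
```
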
Speaker verification using co-location information

Patent number: 11676608

Abstract: A method includes generating an audio signal encoding an utterance captured by a microphone of a user device and transmitting the audio signal encoding the utterance to a server. The server is configured to determine a speaker of the utterance from one of a plurality of different users of the user device based on a comparison between the audio signal encoding the utterance and corresponding speaker verification data, and process the audio signal encoding the utterance using a speech recognition module to identify a particular action. The method also includes executing the particular action identified by the server to cause a particular application to launch on the user device based on user permissions associated with the speaker determined by the server to access the particular data.

Type: Grant

Filed: April 2, 2021

Date of Patent: June 13, 2023

Assignee: Google LLC

Inventors: Raziel Alvarez Guevara, Othar Hansson
Voice processing method and electronic device supporting the same

Patent number: 11670302

Abstract: An electronic device and method are disclosed herein. The electronic device includes a network interface and processor. The processor implements the method, including receiving a voice input through a network interface as transmitted from a first external device, including a request to execute a function using at least one application which is not indicated in the voice input, extracting a first text from the voice input by executing automatic speech recognition (ASR), when the at least one application is identified based on the first text, transmitting, through the network interface to the first external device, second data associated with the identified at least one application for display by the first external device, and when the at least one application is not identified based at least in part on the first text, reattempting identification of the at least one application by executing natural language understanding (NLU) on the first text.

Type: Grant

Filed: November 13, 2020

Date of Patent: June 6, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Joo Hyuk Jeon, Woo Up Kwon, Jin Woo Park, Kyoung Gu Woo, Eun Taek Lim, Kyung Hak Hyun, Dong Ho Jang
Techniques for adjusting computing device sleep states

Patent number: 11662797

Abstract: This application relates to techniques that adjust the sleep states of a computing device based on proximity detection and predicted user activity. Proximity detection procedures can be used to determine a proximity between the computing device and a remote computing device coupled to the user. Based on these proximity detection procedures, the computing device can either correspondingly increase or decrease the amount of power supplied to the various components during either a low-power sleep state or a high-power sleep state. Additionally, historical user activity data gathered on the computing device can be used to predict when the user will likely use the computing device. Based on the gathered historical user activity, deep sleep signals and light sleep signals can be issued at a time when the computing device is placed within a sleep state which can cause it to enter either a low-power sleep state or a high-power sleep state.

Type: Grant

Filed: February 17, 2022

Date of Patent: May 30, 2023

Inventors: Varaprasad V. Lingutla, Kartik R. Venkatraman, Marc J. Krochmal
Voice command detection and prediction

Patent number: 11657801

Abstract: Methods, systems, and apparatuses for predicting an end of a command in a voice recognition input are described herein. The system may receive data comprising a voice input. The system may receive a signal comprising a voice input. The system may detect, in the voice input, data that is associated with a first portion of a command. The system may predict, based on the first portion and while the voice input is being received, a second portion of the command. The prediction may be generated by a machine learning algorithm that is trained based at least in part on historical data comprising user input data. The system may cause execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input.

Type: Grant

Filed: February 27, 2019

Date of Patent: May 23, 2023

Assignee: Comcast Cable Communications, LLC

Inventors: Rui Min, Hongcheng Wang
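
The prediction step in the abstract above (patent 11657801) is described as a machine learning algorithm trained on historical user input. As a deliberately simpler stand-in, the sketch below completes a partial command with the most frequent matching command from history; the function name `predict_remainder` and the plain frequency count are assumptions, not the patented method.

```python
from collections import Counter
from typing import List, Optional


def predict_remainder(first_portion: str, history: List[str]) -> Optional[str]:
    """Guess the second portion of a command from past commands."""
    counts = Counter(
        cmd
        for cmd in history
        if cmd.startswith(first_portion) and cmd != first_portion
    )
    if not counts:
        return None  # nothing in history extends this prefix
    best, _ = counts.most_common(1)[0]
    # Return only the part the user has not yet spoken.
    return best[len(first_portion):].strip()
```

With a prediction in hand, a system could start executing the completed command before the voice input has ended, which is the effect the abstract describes.
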
Narrowband direction of arrival for full band beamformer

Patent number: 11651772

Abstract: A system and method for improving the performance of a hands-free voice user interface system while minimizing computational complexity, specifically when estimating the location of the talker for the purpose of steering a directional beam in the direction of the active talker. A hands-free voice user interface system requires a clean signal to be streamed to the cloud for recognition. One way to improve the speech signal is to estimate where the talker is and steer a beam in the direction of the active talker. To locate the talker, a direction of arrival (DoA) estimator algorithm is used. DoA estimation generally requires a noise- and echo-free signal for optimal results, but it is computationally expensive to run audio pre-processing such as acoustic echo cancellation for each microphone in a microphone array.

Type: Grant

Filed: January 28, 2022

Date of Patent: May 16, 2023

Assignee: DSP Concepts, Inc.

Inventors: Ke Li, Paul Beckmann
Electronic apparatus, controlling method of electronic apparatus and server

Patent number: 11646012

Abstract: An electronic apparatus which registers a device to a server by using a voice, and a method therefor are provided. The electronic apparatus includes a communication circuit, a microphone, a memory for storing computer executable instructions, and at least one processor configured to execute the computer executable instructions to acquire, from a voice received through the microphone, information on an external device which a user wishes to register, based on an external device corresponding to the acquired information being searched through the communication circuit, control the communication circuit to transmit information on an access point to the external device to enable the external device to communicate with a server, and control the communication circuit to transmit a registration request with respect to the external device to the server.

Type: Grant

Filed: August 19, 2022

Date of Patent: May 9, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Taejun Kwon, Seongil Hahm, Seungsoo Kang
Sample-efficient representation learning for real-time latent speaker state characterization

Patent number: 11646037

Abstract: Systems, methods, and non-transitory computer-readable media can provide audio waveform data that corresponds to a voice sample to a temporal convolutional network for evaluation. The temporal convolutional network can pre-process the audio waveform data and can output an identity embedding associated with the audio waveform data. The identity embedding associated with the voice sample can be obtained from the temporal convolutional network. Information describing a speaker associated with the voice sample can be determined based at least in part on the identity embedding.

Type: Grant

Filed: December 8, 2020

Date of Patent: May 9, 2023

Assignee: OTO Systems Inc.

Inventors: Valentin Alain Jean Perret, Nicolas Lucien Perony, Nándor Kedves
Communication notification management

Patent number: 11646020

Abstract: A method for managing electronic communication notifications includes, responsive to receiving a communication from a first user, identifying one or more keywords in the communication based on a plurality of keywords associated with a plurality of queries previously presented by a second user. The method further includes determining whether the communication includes a reply to a first open query, wherein the first open query represents a question previously presented by the second user directed to the first user. The method further includes, responsive to determining the communication from the first user includes the reply to the first open query, notifying the second user utilizing a first alert type for the communication from the first user that includes the reply for the first open query, wherein the first alert type is different from a second alert type for notifying the second user regarding a communication that does not include the reply for the first open query.

Type: Grant

Filed: January 24, 2020

Date of Patent: May 9, 2023

Assignee: International Business Machines Corporation

Inventors: Priyansh Jaiswal, Peeyush Jaiswal
Detection of replay attack

Patent number: 11631402

Abstract: A method of detecting a replay attack comprises: receiving an audio signal representing speech; identifying speech content present in at least a portion of the audio signal; obtaining information about a frequency spectrum of each portion of the audio signal for which speech content is identified; and, for each portion of the audio signal for which speech content is identified: retrieving information about an expected frequency spectrum of the audio signal; comparing the frequency spectrum of portions of the audio signal for which speech content is identified with the respective expected frequency spectrum; and determining that the audio signal may result from a replay attack if a measure of a difference between the frequency spectrum of the portions of the audio signal for which speech content is identified and the respective expected frequency spectrum exceeds a threshold level.

Type: Grant

Filed: May 7, 2020

Date of Patent: April 18, 2023

Assignee: Cirrus Logic, Inc.

Inventors: John Paul Lesso, César Alonso
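
The comparison in the abstract above (patent 11631402) can be sketched as a distance between observed and expected spectra, averaged over the portions where speech content was identified. The mean absolute difference per band, the averaging across portions, the threshold value, and the names `spectrum_distance` and `may_be_replay` are all assumptions; the abstract only requires "a measure of a difference" that is compared with a threshold.

```python
from typing import Dict, List, Sequence, Tuple


def spectrum_distance(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Mean absolute difference between two spectra, band by band."""
    return sum(abs(o - e) for o, e in zip(observed, expected)) / len(expected)


def may_be_replay(
    portions: List[Tuple[str, Sequence[float]]],
    expected_spectra: Dict[str, Sequence[float]],
    threshold: float = 0.05,
) -> bool:
    """Flag a possible replay attack.

    Each portion is (speech_content_label, observed_spectrum); the label is
    used to look up the expected spectrum for that speech content.
    """
    distances = [
        spectrum_distance(spectrum, expected_spectra[label])
        for label, spectrum in portions
        if label in expected_spectra
    ]
    if not distances:
        return False  # nothing to compare, so no evidence of replay
    return sum(distances) / len(distances) > threshold
```
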
