Speech Signal Processing Patents (Class 704/200)
  • Patent number: 12260866
    Abstract: A method, computer program product, and computing system for processing audio information associated with a speech processing system and encoding a watermark in a non-disruptive portion of the audio information.
    Type: Grant
    Filed: August 30, 2022
    Date of Patent: March 25, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Patrick Aubrey Naylor, Dushyant Sharma, William Francis Ganong, III, Uwe Helmut Jost, Ljubomir Milanovic
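The abstract describes finding "non-disruptive" portions of audio and hiding a watermark there. A rough illustrative sketch of that idea (not the patented method): treat low-energy frames as non-disruptive and encode one bit per quiet frame as a tiny DC offset. The frame size, RMS threshold, and offset magnitude are all assumptions for illustration.

```python
import numpy as np

FRAME = 160        # 10 ms frames at 16 kHz (assumed)
RMS_THRESH = 0.01  # frames quieter than this count as "non-disruptive" (assumed)

def quiet_frames(audio: np.ndarray) -> list[int]:
    """Indices of frames whose RMS falls below the threshold."""
    n = len(audio) // FRAME
    return [i for i in range(n)
            if np.sqrt(np.mean(audio[i * FRAME:(i + 1) * FRAME] ** 2)) < RMS_THRESH]

def embed(audio: np.ndarray, bits: list[int], eps: float = 1e-4) -> np.ndarray:
    """Encode one bit per quiet frame as a tiny positive/negative offset."""
    out = audio.copy()
    for idx, bit in zip(quiet_frames(audio), bits):
        out[idx * FRAME:(idx + 1) * FRAME] += eps if bit else -eps
    return out

def extract(audio: np.ndarray, n_bits: int) -> list[int]:
    """Recover bits from the sign of each quiet frame's mean."""
    return [int(np.mean(audio[i * FRAME:(i + 1) * FRAME]) > 0)
            for i in quiet_frames(audio)][:n_bits]
```

Because the offset is far below the audibility of the speech content, the watermark survives in the signal without disrupting it, which is the gist of the claim.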
  • Patent number: 12260865
    Abstract: Provided is a method performed by an automatic interpretation server based on a zero user interface (UI), which communicates with a plurality of terminal devices having a microphone function, a speaker function, a communication function, and a wearable function. The method includes connecting terminal devices disposed within a designated automatic interpretation zone, receiving a voice signal of a first user from a first terminal device among the terminal devices within the automatic interpretation zone, matching a plurality of users placed within a speech-receivable distance of the first terminal device, and performing automatic interpretation on the voice signal and transmitting results of the automatic interpretation to a second terminal device of at least one second user corresponding to a result of the matching.
    Type: Grant
    Filed: July 19, 2022
    Date of Patent: March 25, 2025
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Seung Yun, Sang Hun Kim, Min Kyu Lee, Joon Gyu Maeng
  • Patent number: 12255936
    Abstract: Methods, apparatus, and processor-readable storage media for augmenting identifying metadata related to group communication session participants using artificial intelligence techniques are provided herein.
    Type: Grant
    Filed: February 8, 2023
    Date of Patent: March 18, 2025
    Assignee: Dell Products L.P.
    Inventors: Dhilip S. Kumar, Hung T. Dinh, Bijan Kumar Mohanty
  • Patent number: 12248727
    Abstract: An audio device comprising memory, an interface, and one or more processors, wherein the one or more processors are configured to obtain audio data; process the audio data for provision of an audio output; process the audio data for provision of one or more audio parameters indicative of one or more characteristics of the audio data; map the one or more audio parameters to a first latent space of a first neural network for provision of a mapping parameter indicative of whether the one or more audio parameters belong to a training manifold of the first latent space; determine, based on the mapping parameter, an uncertainty parameter indicative of an uncertainty of processing quality; and control the processing of the audio data for provision of the audio output based on the uncertainty parameter.
    Type: Grant
    Filed: March 14, 2024
    Date of Patent: March 11, 2025
    Inventors: Clément Laroche, Diego Caviedes Nozal
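The core of this claim is mapping audio parameters into a network's latent space and asking whether they lie on the training manifold; distance from that manifold becomes an uncertainty signal that throttles processing. A toy sketch of that control loop, with a linear "encoder" and k-nearest-neighbor manifold distance standing in for the patent's neural network (all assumptions):

```python
import numpy as np

def encode(params: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Toy stand-in for the first network: a linear map into latent space."""
    return params @ W

def uncertainty(params, W, train_latents, k=3):
    """Mean distance from the mapped point to its k nearest training latents.

    Large distance -> the audio parameters fall off the training manifold ->
    high processing-quality uncertainty (a rough proxy, our assumption)."""
    z = encode(params, W)
    d = np.sort(np.linalg.norm(train_latents - z, axis=1))
    return float(d[:k].mean())

def control_gain(u, u_max=1.0):
    """Back off processing strength as uncertainty grows (assumed policy)."""
    return max(0.0, 1.0 - u / u_max)
```

In-distribution parameters land near the stored training latents and keep full processing gain; out-of-distribution ones push the gain toward zero.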
  • Patent number: 12244752
    Abstract: A communication system and method usable to facilitate communication between a hearing user and an assisted user. In particular, the system employs a wireless portable tablet or other portable electronic computing device linked to a captioning enabled phone as a remote interface for that phone, thereby providing an assisted user with more options, more freedom, and improved usability of the system.
    Type: Grant
    Filed: September 27, 2023
    Date of Patent: March 4, 2025
    Assignee: ULTRATEC, INC.
    Inventors: Christopher R. Engelke, Kevin R. Colwell, Troy Vitek
  • Patent number: 12211488
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing video data using an adaptive visual speech recognition model. One of the methods includes receiving a video that includes a plurality of video frames that depict a first speaker; obtaining a first embedding characterizing the first speaker; and processing a first input comprising (i) the video and (ii) the first embedding using a visual speech recognition neural network having a plurality of parameters, wherein the visual speech recognition neural network is configured to process the video and the first embedding in accordance with trained values of the parameters to generate a speech recognition output that defines a sequence of one or more words being spoken by the first speaker in the video.
    Type: Grant
    Filed: June 15, 2022
    Date of Patent: January 28, 2025
    Assignee: DeepMind Technologies Limited
    Inventors: Ioannis Alexandros Assael, Brendan Shillingford, Joao Ferdinando Gomes de Freitas
  • Patent number: 12200450
    Abstract: A device to process speech includes a speech processing network that includes an input configured to receive audio data. The speech processing network also includes one or more network layers configured to process the audio data to generate a network output. The speech processing network includes an output configured to be coupled to multiple speech application modules to enable the network output to be provided as a common input to each of the multiple speech application modules.
    Type: Grant
    Filed: May 26, 2023
    Date of Patent: January 14, 2025
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Sunkuk Moon, Erik Visser, Prajakt Kulkarni
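The claim here is architectural: one shared network computes a common output once, and multiple speech application modules consume that same output instead of each re-processing the raw audio. A minimal sketch, with a single random linear layer standing in for the shared network and two assumed downstream heads (keyword spotting and emotion):

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared "speech processing network" (here a single linear layer + tanh).
W_shared = rng.standard_normal((40, 16))

def network_output(features: np.ndarray) -> np.ndarray:
    """Common representation, computed once per audio input."""
    return np.tanh(features @ W_shared)

# Several speech application modules consume the SAME network output.
W_kws = rng.standard_normal((16, 2))   # keyword-spotting head (assumed)
W_emo = rng.standard_normal((16, 4))   # emotion head (assumed)

def keyword_scores(common: np.ndarray) -> np.ndarray:
    return common @ W_kws

def emotion_scores(common: np.ndarray) -> np.ndarray:
    return common @ W_emo
```

The saving is that `network_output` runs once, however many application modules are attached.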
  • Patent number: 12198340
    Abstract: Provided are a system and a method for cardiovascular risk prediction, where artificial intelligence is utilized to perform segmentation on non-contrast or contrast medical images to identify precise regions of the heart, pericardium, and aorta of a subject, such that the adipose tissue volume and calcium score can be derived from the medical images to assist in cardiovascular risk prediction. Also provided is a computer readable medium for storing a computer executable code to implement the method.
    Type: Grant
    Filed: May 31, 2022
    Date of Patent: January 14, 2025
    Assignee: NATIONAL TAIWAN UNIVERSITY
    Inventors: Tzung-Dau Wang, Wen-Jeng Lee, Yu-Cheng Huang, Chiu-Wang Tseng, Cheng-Kuang Lee, Wei-Chung Wang, Cheng-Ying Chou
  • Patent number: 12189712
    Abstract: An exemplary method for detecting fake audios comprises: converting audio data into an image representation of the audio data; providing the image representation of the audio data to a trained machine-learning model, the machine learning model: generating, using a trained self-attention branch, one or more representation embeddings corresponding to the image representation of the audio data; and receiving, using a trained classifier component, the one or more representation embeddings and outputting a classification result. The machine-learning model is trained by: in a first stage, training one or more self- and cross-attention components via contrastive learning, each self- and cross-attention component comprises a first self-attention branch, a second self-attention branch, and a cross-attention branch; and in a second stage, training the classifier component; and providing the classification result.
    Type: Grant
    Filed: January 29, 2024
    Date of Patent: January 7, 2025
    Assignee: Reality Defender, Inc.
    Inventors: Gaurav Bharaj, Chirag Goel, Surya Koppisetti, Ben Colman, Ali Shahriyari
  • Patent number: 12183344
    Abstract: Systems, apparatuses, methods, and computer program products are disclosed for predicting an entity and intent based on captured speech. An example method includes capturing speech and converting the speech to text. The example method further includes causing generation of one or more entities and one or more intents based on the speech and the text. The example method further includes determining a next action based on each of the one or more entities and each of the one or more intents.
    Type: Grant
    Filed: November 24, 2021
    Date of Patent: December 31, 2024
    Assignee: Wells Fargo Bank, N.A.
    Inventors: Vinothkumar Venkataraman, Rahul Ignatius, Naveen Gururaja Yeri, Paul Davis
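The final step of this method, determining a next action from extracted entities and intents, can be pictured as a routing table keyed on (entity, intent) pairs. This is a hypothetical illustration; the table contents and the fallback action are invented for the example:

```python
# Hypothetical routing table: (entity type, intent) -> next action.
NEXT_ACTION = {
    ("account", "check_balance"): "fetch_balance",
    ("account", "transfer"):      "start_transfer_flow",
    ("card",    "report_lost"):   "freeze_card",
}

def next_action(entities: list[str], intents: list[str],
                default: str = "route_to_agent") -> str:
    """Return the first action whose (entity, intent) pair is known."""
    for e in entities:
        for i in intents:
            if (e, i) in NEXT_ACTION:
                return NEXT_ACTION[(e, i)]
    return default
```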
  • Patent number: 12175967
    Abstract: Touch-free, voice-assistant, and/or tracking systems and methods are described for automating inventory generation and tracking of childcare products. In various aspects, the touch-free, voice-assistant, and/or tracking systems and methods comprise receiving, by one or more processors, one or more name values corresponding to one or more children of a childcare program associated with at least one physical childcare location. A childcare product inventory, comprising a childcare product of at least one product type, is generated based on a count of the one or more name values. Child event data is received comprising information related to use of the childcare product. The child event data is based on audible input of a user as received via a voice command interface of a voice-assistant application (app) as implemented on a voice assistance device. The childcare product inventory may be updated based on the child event data.
    Type: Grant
    Filed: October 29, 2021
    Date of Patent: December 24, 2024
    Assignee: The Procter & Gamble Company
    Inventor: Brad S. Hoekzema
  • Patent number: 12142291
    Abstract: An action estimation device includes: an obtainer that obtains sound information pertaining to an inaudible sound, the inaudible sound being a sound in an ultrasonic band collected by a sound collector; and an estimator that estimates an output result, obtained by inputting the sound information obtained by the obtainer into a trained model indicating a relationship between the sound information and action information pertaining to an action of a person, as the action information of the person.
    Type: Grant
    Filed: June 21, 2022
    Date of Patent: November 12, 2024
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Taketoshi Nakao, Toshiyuki Matsumura, Tatsumi Nagashima, Tetsuji Fuchikami
  • Patent number: 12136421
    Abstract: A speech recognition method includes receiving audible data and user data. The audible data includes information about an utterance by the user. The user data includes information about movements by the user. The method further includes fusing the audible data and the user data to obtain fused data and determining at least one spoken word of the utterance based on the fused data.
    Type: Grant
    Filed: March 3, 2022
    Date of Patent: November 5, 2024
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Jacob Alan Bond, Hannah Elizabeth Wagner, Joseph F. Szczerba, Alan D. Hejl
  • Patent number: 12125496
    Abstract: The disclosed technology relates to methods, voice enhancement systems, and non-transitory computer readable media for real-time voice enhancement. In some examples, input audio data including foreground speech content, non-content elements, and speech characteristics is fragmented into input speech frames. The input speech frames are converted to low-dimensional representations of the input speech frames. One or more of the fragmentation or the conversion is based on an application of a first trained neural network to the input audio data. The low-dimensional representations of the input speech frames omit one or more of the non-content elements. A second trained neural network is applied to the low-dimensional representations of the input speech frames to generate target speech frames. The target speech frames are combined to generate output audio data. The output audio data further includes one or more portions of the foreground speech content and one or more of the speech characteristics.
    Type: Grant
    Filed: April 24, 2024
    Date of Patent: October 22, 2024
    Assignee: SANAS.AI INC.
    Inventors: Shawn Zhang, Lukas Pfeifenberger, Jason Wu, Piotr Dura, David Braude, Bajibabu Bollepalli, Alvaro Escudero, Gokce Keskin, Ankita Jha, Maxim Serebryakov
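The pipeline above starts by fragmenting input audio into speech frames and ends by combining processed frames back into output audio. A standard framing/overlap-add sketch of those two bookend steps (frame and hop sizes are assumptions; the neural conversion in between is omitted):

```python
import numpy as np

def fragment(audio: np.ndarray, frame_len: int = 320, hop: int = 160) -> np.ndarray:
    """Split audio into overlapping frames (sizes assumed: 20 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(audio) - frame_len) // hop)
    return np.stack([audio[i * hop: i * hop + frame_len] for i in range(n)])

def overlap_add(frames: np.ndarray, hop: int = 160) -> np.ndarray:
    """Recombine (possibly processed) frames into output audio by windowed
    overlap-add, normalizing by the summed window so unmodified frames
    reconstruct the input."""
    n, frame_len = frames.shape
    out = np.zeros((n - 1) * hop + frame_len)
    norm = np.zeros_like(out)
    win = np.hanning(frame_len)
    for i, f in enumerate(frames):
        out[i * hop: i * hop + frame_len] += f * win
        norm[i * hop: i * hop + frame_len] += win
    return out / np.maximum(norm, 1e-8)
```

Passing each frame through the two neural networks between `fragment` and `overlap_add` is where the claimed enhancement would happen.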
  • Patent number: 12106768
    Abstract: This application provides a speech signal processing method performed by a computer device. Through an iterative training process, a teacher speech separation model smooths the training of a student speech separation model, based on (i) the accuracy of the student model's separation results when outputting a target speech signal from a mixed speech signal and (ii) the consistency between the teacher model's and the student model's separation results on the same task. This maintains separation stability while improving the separation accuracy of the student model, which serves as the trained speech separation model, greatly improving its separation capability.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: October 1, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Jun Wang, Wingyip Lam
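The training signal described above combines two terms: how accurately the student separates the target speech, and how consistent the student's output is with the teacher's. A mean-teacher-style sketch of that objective (the loss form, weighting, and EMA teacher update are assumptions, not the patent's exact formulation):

```python
import numpy as np

def student_loss(student_out: np.ndarray, teacher_out: np.ndarray,
                 target: np.ndarray, alpha: float = 0.5) -> float:
    """Accuracy term (student vs. clean target) plus consistency term
    (student vs. teacher), weighted by an assumed hyperparameter alpha."""
    accuracy = np.mean((student_out - target) ** 2)
    consistency = np.mean((student_out - teacher_out) ** 2)
    return float(accuracy + alpha * consistency)

def update_teacher(teacher_w: np.ndarray, student_w: np.ndarray,
                   momentum: float = 0.99) -> np.ndarray:
    """Exponential-moving-average teacher update (mean-teacher style, assumed),
    which is what lets the teacher 'smooth' the student's training."""
    return momentum * teacher_w + (1 - momentum) * student_w
```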
  • Patent number: 12106762
    Abstract: The application relates to HFR (High Frequency Reconstruction/Regeneration) of audio signals. In particular, the application relates to a method and system for performing HFR of audio signals having large variations in energy level across the low frequency range which is used to reconstruct the high frequencies of the audio signal. A system configured to generate a plurality of high frequency subband signals covering a high frequency interval from a plurality of low frequency subband signals is described.
    Type: Grant
    Filed: May 2, 2024
    Date of Patent: October 1, 2024
    Assignee: DOLBY INTERNATIONAL AB
    Inventor: Kristofer Kjoerling
  • Patent number: 12080319
    Abstract: The present disclosure provides a weakly-supervised sound event detection method and system based on adaptive hierarchical pooling. The system includes an acoustic model and an adaptive hierarchical pooling algorithm module (AHPA-module), where the acoustic model takes a pre-processed and feature-extracted audio signal as input and predicts a frame-level prediction probability, which the AHPA-module aggregates to obtain a sentence-level prediction probability. The acoustic model and a relaxation parameter are jointly optimized to obtain an optimal model weight and an optimal relaxation parameter for each category of sound event. A pre-processed and feature-extracted unknown audio signal is input to obtain frame-level prediction probabilities of all target sound events (TSEs), and sentence-level prediction probabilities of all categories of TSEs are obtained based on an optimal pooling strategy of each category of TSE.
    Type: Grant
    Filed: June 27, 2022
    Date of Patent: September 3, 2024
    Assignee: Jiangsu University
    Inventors: Qirong Mao, Lijian Gao, Yaxin Shen, Qinghua Ren, Yongzhao Zhan, Keyang Cheng
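The key mechanism is a pooling operator with a learnable relaxation parameter that interpolates between mean pooling and max pooling when aggregating frame-level probabilities into a clip-level one. A common softmax-weighted form that behaves this way (an assumed form, consistent with but not necessarily identical to the patented algorithm):

```python
import numpy as np

def adaptive_pool(frame_probs: np.ndarray, r: float) -> float:
    """Aggregate frame-level probabilities into one clip-level probability.

    r is the relaxation parameter: r -> 0 recovers mean pooling, large r
    approaches max pooling, so each event category can learn its own
    operating point between the two."""
    w = np.exp(r * frame_probs)
    return float(np.sum(frame_probs * w) / np.sum(w))
```

Jointly optimizing r per event category, alongside the acoustic model weights, is what makes the pooling "adaptive".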
  • Patent number: 12062374
    Abstract: Systems, devices, and methods transcribe words recorded in audio data. A computer-generated transcript is provided. The transcript comprises records for each word in the computer-generated transcript. At least one confirmation input is received for each record. The at least one confirmation input modifies a selected record and automatically identifies a next record for receiving a next confirmation input. A sequence of confirmation inputs may rapidly modify and validate each record in a sequence of records in the computer-generated transcript. A validated transcript is generated from the modified records and is provided from an evidence management system.
    Type: Grant
    Filed: February 7, 2023
    Date of Patent: August 13, 2024
    Assignee: Axon Enterprise, Inc.
    Inventors: Noah Spitzer-Williams, Choongyeun Cho, Thomas Crosley, Zachary Goist, Daniel Bellia, Vinh Nguyen, Chelsea Alexander-Taylor
  • Patent number: 12056517
    Abstract: Disclosed are an electronic device and a method for controlling the same. According to an embodiment, a method for controlling an electronic apparatus includes: obtaining a voice command while a first application is executed in the foreground; obtaining a text by recognizing the voice command; identifying at least one second application to perform the voice command based on the text; identifying, based on information of the first application and the at least one second application, whether to execute each of the first application and the at least one second application in the foreground or background of the electronic apparatus; and providing the first application and the at least one second application based on the identification.
    Type: Grant
    Filed: February 17, 2022
    Date of Patent: August 6, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yeonho Lee, Sangwook Park, Youngbin Shin, Kookjin Yeo
  • Patent number: 12046241
    Abstract: The various implementations described herein include methods and systems for determining device leadership among voice interface devices. In one aspect, a method is performed at a first electronic device of a plurality of electronic devices, each having microphones, a speaker, processors, and memory storing programs for execution by the processors. The first device detects a voice input. It determines a device state and a relevance of the voice input. It identifies a subset of electronic devices from the plurality to which the voice input is relevant. In accordance with a determination that the subset includes the first device, the first device determines a first score of a criterion associated with the voice input and receives second scores of the criterion from other devices in the subset. In accordance with a determination that the first score is higher than the second scores, the first device responds to the detected input.
    Type: Grant
    Filed: May 4, 2023
    Date of Patent: July 23, 2024
    Assignee: Google LLC
    Inventors: Kenneth Mixter, Diego Melendo Casado, Alexander H. Gruenstein, Terry Tai, Christopher Thaddeus Hughes, Matthew Nirvan Sharifi
  • Patent number: 12033659
    Abstract: Disclosed is a method for determining important parts among a speech-to-text (STT) result and reference data, which is performed by a computing device. The method may include acquiring STT data generated by performing STT with respect to a speech signal; acquiring reference data; determining first important information in one data among the STT data and the reference data; and determining second important information linked with the first important information in other data different from data in which the first important information is determined among the STT data and the reference data.
    Type: Grant
    Filed: December 27, 2023
    Date of Patent: July 9, 2024
    Assignee: ActionPower Corp.
    Inventors: Hyungwoo Kim, Hwanbok Mun, Kangwook Kim
  • Patent number: 12033611
    Abstract: A system for use in video game development to generate expressive speech audio comprises a user interface configured to receive user-input text data and a user selection of a speech style. The system includes a machine-learned synthesizer comprising a text encoder, a speech style encoder and a decoder. The machine-learned synthesizer is configured to generate one or more text encodings derived from the user-input text data, using the text encoder of the machine-learned synthesizer; generate a speech style encoding by processing a set of speech style features associated with the selected speech style using the speech style encoder of the machine-learned synthesizer; combine the one or more text encodings and the speech style encoding to generate one or more combined encodings; and decode the one or more combined encodings with the decoder of the machine-learned synthesizer to generate predicted acoustic features.
    Type: Grant
    Filed: February 28, 2022
    Date of Patent: July 9, 2024
    Assignee: ELECTRONIC ARTS INC.
    Inventors: Siddharth Gururani, Kilol Gupta, Dhaval Shah, Zahra Shakeri, Jervis Pinto, Mohsen Sardari, Navid Aghdaie, Kazi Zaman
  • Patent number: 12027154
    Abstract: A method includes receiving a training example that includes audio data representing a spoken utterance and a ground truth transcription. For each word in the spoken utterance, the method also includes inserting a placeholder symbol before the respective word identifying a respective ground truth alignment for a beginning and an end of the respective word, determining a beginning word piece and an ending word piece, and generating a first constrained alignment for the beginning word piece and a second constrained alignment for the ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word and the second constrained alignment is aligned with the ground truth alignment for the ending of the respective word. The method also includes constraining an attention head of a second pass decoder by applying the first and second constrained alignments.
    Type: Grant
    Filed: February 9, 2023
    Date of Patent: July 2, 2024
    Assignee: Google LLC
    Inventors: Tara N. Sainath, Basilio Garcia Castillo, David Rybach, Trevor Strohman, Ruoming Pang
  • Patent number: 12027151
    Abstract: A linguistic content and speaking style disentanglement model includes a content encoder, a style encoder, and a decoder. The content encoder is configured to receive input speech as input and generate a latent representation of linguistic content for the input speech output. The content encoder is trained to disentangle speaking style information from the latent representation of linguistic content. The style encoder is configured to receive the input speech as input and generate a latent representation of speaking style for the input speech as output. The style encoder is trained to disentangle linguistic content information from the latent representation of speaking style. The decoder is configured to generate output speech based on the latent representation of linguistic content for the input speech and the latent representation of speaking style for the same or different input speech.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: July 2, 2024
    Assignee: Google LLC
    Inventors: Ruoming Pang, Andros Tjandra, Yu Zhang, Shigeki Karita
  • Patent number: 12027177
    Abstract: A computer-implemented method to determine whether to introduce latency into an audio stream from a particular speaker includes receiving an audio stream from a sender device. The method further includes providing, as input to a trained machine-learning model, the audio stream and a speech analysis score, information about one or more voice emotion parameters, and one or more voice emotion scores for a first user associated with the sender device, wherein the trained machine-learning model is iteratively applied to the audio stream and wherein each iteration corresponds to a respective portion of the audio stream. The method further includes generating as output, with the trained machine-learning model, a level of toxicity in the audio stream. The method further includes transmitting the audio stream to a recipient device, wherein the transmitting is performed to introduce a time delay in the audio stream based on the level of toxicity.
    Type: Grant
    Filed: September 8, 2022
    Date of Patent: July 2, 2024
    Assignee: Roblox Corporation
    Inventors: Mahesh Kumar Nandwana, Philippe Clavel, Morgan McGuire
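The final transmission step maps the model's toxicity level to a time delay. The patent only says the delay is based on the toxicity output; a simple clamped linear mapping with an assumed 2-second ceiling illustrates the idea:

```python
def delay_ms(toxicity: float, max_delay_ms: int = 2000) -> int:
    """Map a toxicity score in [0, 1] to a transmission delay in milliseconds.

    The linear mapping and the 2 s ceiling are assumptions for illustration."""
    t = min(max(toxicity, 0.0), 1.0)
    return int(t * max_delay_ms)
```

The delay buys time for moderation (human or automated) before toxic audio reaches the recipient, while benign audio passes through with no added latency.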
  • Patent number: 12020708
    Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
    Type: Grant
    Filed: October 11, 2021
    Date of Patent: June 25, 2024
    Assignee: SoundHound AI IP, LLC.
    Inventors: Kiersten L. Bradley, Ethan Coeytaux, Ziming Yin
  • Patent number: 12021822
    Abstract: A computer-implemented method includes receiving a communication between first and second users via a communication channel associated with a communication space, and identifying the first user having a first role and the second user having a second role, where a formality of the communication is determined based on the second role. The method includes identifying a transformer model for the communication space and monitoring the communication for an agreement clause via the transformer model by deriving an agreement clause based on the communication and classifying the derived agreement clause.
    Type: Grant
    Filed: October 4, 2022
    Date of Patent: June 25, 2024
    Assignee: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Jeremy R. Fox, Raghuveer Prasad Nagar, Dinesh Kumar Bhudavaram
  • Patent number: 12020405
    Abstract: A computer-implemented method for training a convolutional neural network includes receiving a captured image. A denoised image is generated by applying the convolutional neural network to the captured image. The convolutional neural network is trained based on a high frequency loss function, as well as the captured image and the denoised image.
    Type: Grant
    Filed: November 3, 2021
    Date of Patent: June 25, 2024
    Assignee: LEICA MICROSYSTEMS CMS GMBH
    Inventor: Jose Miguel Serra Lleti
  • Patent number: 12013884
    Abstract: A modular two-stage neural architecture is used in translating a natural language question into a logic form such as a SPARQL Protocol and RDF Query Language (SPARQL) query. In a first stage, a neural machine translation (NMT)-based sequence-to-sequence (Seq2Seq) model translates a question into a sketch of the desired SPARQL query called a SPARQL silhouette. In a second stage a neural graph search module predicts the correct relations in the underlying knowledge graph.
    Type: Grant
    Filed: June 30, 2022
    Date of Patent: June 18, 2024
    Assignee: International Business Machines Corporation
    Inventors: Saswati Dana, Dinesh Garg, Dinesh Khandelwal, G P Shrivatsa Bhargav, Sukannya Purkayastha
  • Patent number: 11993817
    Abstract: The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described.
    Type: Grant
    Filed: January 19, 2023
    Date of Patent: May 28, 2024
    Assignee: Dolby International AB
    Inventors: Lars Villemoes, Per Ekstrand
  • Patent number: 11961525
    Abstract: This document generally describes systems, methods, devices, and other techniques related to speaker verification, including (i) training a neural network for a speaker verification model, (ii) enrolling users at a client device, and (iii) verifying identities of users based on characteristics of the users' voices. Some implementations include a computer-implemented method. The method can include receiving, at a computing device, data that characterizes an utterance of a user of the computing device. A speaker representation can be generated, at the computing device, for the utterance using a neural network on the computing device. The neural network can be trained based on a plurality of training samples that each: (i) include data that characterizes a first utterance and data that characterizes one or more second utterances, and (ii) are labeled as a matching speakers sample or a non-matching speakers sample.
    Type: Grant
    Filed: August 3, 2021
    Date of Patent: April 16, 2024
    Assignee: Google LLC
    Inventors: Georg Heigold, Samuel Bengio, Ignacio Lopez Moreno
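The enrollment/verification flow above boils down to computing a speaker representation for an utterance and comparing it against the enrolled representation. A toy sketch with a linear layer standing in for the on-device neural network and cosine similarity as the comparison (the threshold and network are assumptions):

```python
import numpy as np

def speaker_embedding(utterance: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Toy stand-in for the on-device network: one linear layer + L2 norm,
    so comparisons reduce to a dot product (cosine similarity)."""
    z = utterance @ W
    return z / np.linalg.norm(z)

def verify(enrolled: np.ndarray, test: np.ndarray, threshold: float = 0.8) -> bool:
    """Accept when the cosine similarity of the two embeddings clears the
    threshold (0.8 is an assumed operating point)."""
    return float(enrolled @ test) >= threshold
```

Training on labeled matching/non-matching utterance pairs, as the abstract describes, is what pushes same-speaker embeddings above the threshold and different-speaker ones below it.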
  • Patent number: 11947593
    Abstract: A system, method, and computer program product for hierarchical categorization of sound comprising one or more neural networks implemented on one or more processors. The one or more neural networks are configured to categorize a sound into a two or more tiered hierarchical coarse categorization and a finest level categorization in the hierarchy. The sound categorization may be used to search a database for similar or contextually related sounds.
    Type: Grant
    Filed: September 28, 2018
    Date of Patent: April 2, 2024
    Inventors: Arindam Jati, Naveen Kumar, Ruxin Chen
  • Patent number: 11908483
    Abstract: This application relates to a method of extracting an inter channel feature from a multi-channel multi-sound source mixed audio signal performed at a computing device.
    Type: Grant
    Filed: August 12, 2021
    Date of Patent: February 20, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Rongzhi Gu, Shixiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu
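One classic inter-channel feature for multi-channel mixtures is the inter-channel phase difference (IPD) between microphone pairs. The patent's method covers richer features than this, but a single-frame IPD sketch shows the kind of quantity being extracted (frame and FFT sizes are assumptions):

```python
import numpy as np

def ipd(ch1: np.ndarray, ch2: np.ndarray, n_fft: int = 512) -> np.ndarray:
    """Inter-channel phase difference per frequency bin for one frame.

    A time delay between channels (i.e. a source direction) shows up as a
    phase ramp across bins, which separates spatially distinct sources."""
    s1 = np.fft.rfft(ch1[:n_fft])
    s2 = np.fft.rfft(ch2[:n_fft])
    return np.angle(s1 * np.conj(s2))
```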
  • Patent number: 11903067
    Abstract: An audio forwarding method, an audio forwarding device, and a storage medium are described. The audio forwarding method comprises: establishing a first communication link with a sound source device based on a first wireless communication protocol; establishing a second communication link with an audio playback device based on a second wireless communication protocol; receiving first audio data from the sound source device through the first communication link; processing the first audio data to generate second audio data, and storing the second audio data into a second buffer; and transmitting the second audio data to the audio playback device through the second communication link.
    Type: Grant
    Filed: June 24, 2022
    Date of Patent: February 13, 2024
    Assignee: Nanjing Zgmicro Company Limited
    Inventor: Bin Xu
  • Patent number: 11894015
    Abstract: An embedded sensor can include an audio detector, a digital signal processor, a library, and a rules engine. The digital signal processor can be configured to receive signals from the audio detector and to identify the environment in which the embedded sensor is located. The library can store statistical models associated with specific environments, and the digital signal processor can be configured to identify specific events based on detected sounds within the particular environment by utilizing the statistical model associated with the particular environment. The DSP can associate a probability of accuracy with the identified audible event. A rules engine can be configured to receive the probability and transmit a report of the detected audible event.
    Type: Grant
    Filed: October 31, 2022
    Date of Patent: February 6, 2024
    Assignee: CELLULAR SOUTH, INC.
    Inventors: Brett Rogers, Tommy Naugle, Stephen Bye, Craig Sparks, Arman Kirakosyan
  • Patent number: 11854561
    Abstract: The invention provides an audio encoder including a combination of a linear predictive coding filter having a plurality of linear predictive coding coefficients and a time-frequency converter, wherein the combination is configured to filter and to convert a frame of the audio signal into a frequency domain in order to output a spectrum based on the frame and on the linear predictive coding coefficients; a low frequency emphasizer configured to calculate a processed spectrum based on the spectrum, wherein spectral lines of the processed spectrum representing a lower frequency than a reference spectral line are emphasized; and a control device configured to control the calculation of the processed spectrum by the low frequency emphasizer depending on the linear predictive coding coefficients of the linear predictive coding filter.
    Type: Grant
    Filed: November 22, 2022
    Date of Patent: December 26, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Stefan Doehla, Bernhard Grill, Christian Helmrich, Nikolaus Rettelbach
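The low-frequency emphasis step can be illustrated with a minimal sketch. In the patent the control device derives the emphasis from the linear predictive coding coefficients; this sketch collapses that control path into a single hypothetical gain parameter:

```python
def emphasize_low_frequencies(spectrum, reference_line, gain=2.0):
    # Emphasize spectral lines that represent a lower frequency than the
    # reference spectral line; lines at or above the reference pass through
    # unchanged. The gain stands in for the LPC-derived control signal.
    return [line * gain if k < reference_line else line
            for k, line in enumerate(spectrum)]
```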
  • Patent number: 11823670
    Abstract: Utterance-based user interfaces can include activation trigger processing techniques for detecting activation triggers and causing execution of certain commands associated with particular command pattern activation triggers without waiting for output from a separate speech processing engine. The activation trigger processing techniques can also detect speech analysis patterns and selectively activate a speech processing engine.
    Type: Grant
    Filed: April 17, 2020
    Date of Patent: November 21, 2023
    Assignee: Spotify AB
    Inventor: Richard Mitic
  • Patent number: 11810593
    Abstract: A system configured to perform low power mode wakeword detection is provided. A device reduces power consumption without compromising functionality by placing a primary processor into a low power mode and using a secondary processor to monitor for sound detection. The secondary processor stores input audio data in a buffer component while performing sound detection on the input audio data. If the secondary processor detects a sound, the secondary processor sends an interrupt signal to the primary processor, causing the primary processor to enter an active mode. While in the active mode, the primary processor performs wakeword detection using the buffered audio data. To reduce a latency, the primary processor processes the buffered audio data at an accelerated rate. In some examples, the device may further reduce power consumption by including a second buffer component and only processing the input audio data after detecting a sound.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: November 7, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Dibyendu Nandy, Om Prakash Gangwal
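The hand-off between the two processors can be sketched as follows. All names, the energy-based sound detector, and the buffer size are illustrative assumptions; the patent does not specify them:

```python
from collections import deque

class LowPowerWakewordPipeline:
    """Sketch of the buffered secondary-to-primary hand-off."""

    def __init__(self, buffer_frames=50, energy_threshold=0.1):
        # Secondary processor's buffer: holds recent audio while the
        # primary processor sleeps in its low power mode.
        self.buffer = deque(maxlen=buffer_frames)
        self.energy_threshold = energy_threshold
        self.primary_active = False
        self.wakeword_detected = False

    def secondary_process(self, frame):
        # Secondary processor: always buffering, cheap sound detection only.
        self.buffer.append(frame)
        energy = sum(s * s for s in frame) / len(frame)
        if energy > self.energy_threshold and not self.primary_active:
            self.interrupt_primary()

    def interrupt_primary(self):
        # Interrupt signal wakes the primary processor into active mode.
        self.primary_active = True
        self.primary_catch_up()

    def primary_catch_up(self):
        # Primary drains the buffered audio (faster than real time on real
        # hardware) and runs wakeword detection on it.
        buffered = list(self.buffer)
        self.wakeword_detected = self.detect_wakeword(buffered)

    def detect_wakeword(self, frames):
        # Placeholder for a real wakeword model.
        return len(frames) > 0
```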
  • Patent number: 11810572
    Abstract: A system, method, and computer-program product includes distributing a plurality of audio data files of a speech data corpus to a plurality of computing nodes that each implement a plurality of audio processing threads, executing the plurality of audio processing threads associated with each of the plurality of computing nodes to detect a plurality of tentative speakers participating in each of the plurality of audio data files, generating, via a clustering algorithm, a plurality of clusters of embedding signatures based on a plurality of embedding signatures associated with the plurality of tentative speakers in each of the plurality of audio data files, and detecting a plurality of global speakers associated with the speech data corpus based on the plurality of clusters of embedding signatures.
    Type: Grant
    Filed: June 8, 2023
    Date of Patent: November 7, 2023
    Assignee: SAS INSTITUTE INC.
    Inventors: Xiaozhuo Cheng, Xiaolong Li, Xu Yang
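The final clustering step, in which per-file speaker embeddings are merged into global speakers, can be illustrated with a greedy cosine-similarity agglomeration. The abstract only says "a clustering algorithm"; the specific method and threshold here are assumptions:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def cluster_global_speakers(embeddings, threshold=0.8):
    # Greedy agglomeration: each resulting cluster of embedding signatures
    # stands in for one global speaker across the corpus.
    clusters = []  # each cluster is a list of embeddings
    for emb in embeddings:
        best, best_sim = None, threshold
        for cluster in clusters:
            centroid = [sum(c) / len(cluster) for c in zip(*cluster)]
            sim = cosine(emb, centroid)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([emb])  # no cluster is similar enough: new speaker
        else:
            best.append(emb)
    return clusters
```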
  • Patent number: 11804238
Abstract: An optimization method for an implementation of mel-frequency cepstral coefficients is provided. The optimization method includes the following steps: performing a framing step, including using a 400×16 static random access memory to temporarily store a plurality of sampling points of a sound signal with overlap, and decomposing the sound signal into a plurality of frames. Each of the plurality of frames is 400 of the sampling points, there is an overlapping region between adjacent two of the plurality of frames, and the overlapping region includes 240 of the sampling points. The optimization method further includes performing a windowing step, which includes multiplying each of the plurality of frames by a window function in a bit-level design, and the optimization method includes performing a fast Fourier transform (FFT) step, which includes applying a 512 point FFT on a frame signal to obtain a corresponding frequency spectrum.
    Type: Grant
    Filed: October 29, 2021
    Date of Patent: October 31, 2023
    Assignee: REALTEK SEMICONDUCTOR CORP.
    Inventors: Li-Li Tan, Zhi-Lin Wang, Xiao-Feng Cao, Xiao-Huan Li
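The abstract gives concrete parameters (400-sample frames, 240-sample overlap, 512-point FFT), so the front-end steps can be sketched directly. The Hamming window and the naive DFT are illustrative choices; the patent specifies neither the window function nor the FFT implementation:

```python
import cmath
import math

FRAME_LEN, OVERLAP, FFT_LEN = 400, 240, 512
HOP = FRAME_LEN - OVERLAP  # 160 samples between frame starts

def frames_with_overlap(samples):
    # Framing step: 400-sample frames, adjacent frames sharing 240 samples.
    return [samples[i:i + FRAME_LEN]
            for i in range(0, len(samples) - FRAME_LEN + 1, HOP)]

def hamming(frame):
    # Windowing step: multiply each frame by a window function
    # (a Hamming window here, as one common choice).
    n = len(frame)
    return [s * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)))
            for k, s in enumerate(frame)]

def spectrum(frame):
    # FFT step: zero-pad the 400-sample frame to 512 points and transform.
    # A naive O(N^2) DFT is used for clarity; real designs use a radix-2 FFT.
    padded = list(frame) + [0.0] * (FFT_LEN - len(frame))
    return [abs(sum(padded[n] * cmath.exp(-2j * math.pi * k * n / FFT_LEN)
                    for n in range(FFT_LEN)))
            for k in range(FFT_LEN // 2 + 1)]
```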
  • Patent number: 11798566
    Abstract: The present disclosure discloses a data transmission method performed by a computer device and a non-transitory computer-readable storage medium. According to the present disclosure, voice criticality analysis is performed on a to-be-transmitted audio to obtain a criticality level of each to-be-transmitted audio frame in the to-be-transmitted audio, and a corrected redundancy multiple of each to-be-transmitted audio frame is obtained according to a current redundancy multiple and a redundant transmission factor corresponding to the criticality level of each to-be-transmitted audio frame. Therefore, each to-be-transmitted audio frame is duplicated according to a corrected redundancy multiple of each to-be-transmitted audio frame, to obtain at least one redundancy data packet, and the at least one redundancy data packet is transmitted to a target terminal, which can improve the network anti-packet loss effect without causing network congestion.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: October 24, 2023
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang
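The corrected-redundancy computation can be sketched as below. The criticality levels and their redundant-transmission factors are invented for illustration; the abstract does not give the actual mapping:

```python
# Hypothetical redundant-transmission factors per criticality level.
REDUNDANCY_FACTORS = {"low": 0.5, "medium": 1.0, "high": 2.0}

def corrected_redundancy(current_multiple, criticality):
    # Corrected multiple = current redundancy multiple scaled by the factor
    # for this frame's criticality level (one plausible reading), with at
    # least one copy always transmitted.
    return max(1, round(current_multiple * REDUNDANCY_FACTORS[criticality]))

def build_redundancy_packets(frames, current_multiple):
    # frames: list of (payload, criticality) tuples. Each audio frame is
    # duplicated according to its corrected redundancy multiple, so critical
    # frames survive packet loss without flooding the network uniformly.
    packets = []
    for payload, criticality in frames:
        copies = corrected_redundancy(current_multiple, criticality)
        packets.extend([payload] * copies)
    return packets
```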
  • Patent number: 11790924
    Abstract: In a stereo encoding method, a channel combination encoding solution of a current frame is first obtained, and then a quantized channel combination ratio factor of the current frame and an encoding index of the quantized channel combination ratio factor are obtained based on the obtained channel combination encoding solution, so that an obtained primary channel signal and secondary channel signal of the current frame meet a characteristic of the current frame.
    Type: Grant
    Filed: November 9, 2022
    Date of Patent: October 17, 2023
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Bin Wang, Haiting Li, Lei Miao
  • Patent number: 11784712
    Abstract: A cell site test tool provides field technicians with resources to support multiple aspects of cell site testing. The cell site test tool includes multiple, integrated and removably connectable modules such as a base module, a user interface module, and a battery module. Additional modules include a CPRI module to provide Common Public Radio Interface testing, an OTDR module to provide dedicated Optical Time-Domain Reflectometer testing, a CAA module to provide Cable Antenna Analysis testing, a fiber inspection module to visually inspect optical fiber, and an SA/CPRI module to provide Radio Frequency over Common Public Radio Interface testing.
    Type: Grant
    Filed: March 29, 2022
    Date of Patent: October 10, 2023
    Assignee: VIAVI SOLUTIONS INC.
    Inventors: Reza Vaez-Ghaemi, Craig Stephen Boledovic, Andrew Thomas Rayno, Waleed Wardak, Michael Jon Bangert, Jr., Kanwaljit Singh Rekhi
  • Patent number: 11739641
    Abstract: A method for processing speech, comprising semantically parsing a received natural language speech input with respect to a plurality of predetermined command grammars in an automated speech processing system; determining if the parsed speech input unambiguously corresponds to a command and is sufficiently complete for reliable processing, then processing the command; if the speech input ambiguously corresponds to a single command or is not sufficiently complete for reliable processing, then prompting a user for further speech input to reduce ambiguity or increase completeness, in dependence on a relationship of previously received speech input and at least one command grammar of the plurality of predetermined command grammars, reparsing the further speech input in conjunction with previously parsed speech input, and iterating as necessary. The system also monitors abort, fail or cancel conditions in the speech input.
    Type: Grant
    Filed: April 13, 2021
    Date of Patent: August 29, 2023
    Assignee: Great Northern Research, LLC
    Inventors: Philippe Roy, Paul J. Lagassey
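The parse-prompt-reparse loop can be illustrated with a toy grammar. The grammars, slot model, and one-token parse below are all invented stand-ins for the patent's "predetermined command grammars":

```python
# Hypothetical command grammars: each maps a command name to required slots.
GRAMMARS = {"call": ["contact"], "play": ["track"]}

def parse(utterance):
    # Toy semantic parse: the first token selects a grammar, the remaining
    # tokens fill its slots.
    tokens = utterance.split()
    candidates = [g for g in GRAMMARS if tokens and g == tokens[0]]
    return candidates, tokens[1:]

def process_speech(utterances):
    # Iterate over user turns, reparsing each new input together with the
    # previously received input, until the command is unambiguous and
    # complete, the turns run out, or the user aborts.
    collected = []
    for utterance in utterances:
        if utterance.lower() in {"cancel", "abort"}:
            return ("aborted", None)
        collected.append(utterance)
        candidates, slots = parse(" ".join(collected))
        if len(candidates) == 1 and len(slots) >= len(GRAMMARS[candidates[0]]):
            return ("ok", (candidates[0], slots))
        # Otherwise a real system would prompt the user for more input here.
    return ("incomplete", None)
```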
  • Patent number: 11711589
    Abstract: The present disclosure relates to a method and system for presenting a set of control functions via an interface of a peripheral control device (PCD). A control function can include a command associated with one or more media contexts of a host media device. The method decodes a payload, from the host media device, with an encoded context identifier, where the context identifier indicates a primary media context active on the host media device. The method determines one or more control functions corresponding to the context identifier, and changes the set of control functions on the interface of the PCD to include the one or more control functions that can command the primary media context.
    Type: Grant
    Filed: December 15, 2021
    Date of Patent: July 25, 2023
    Assignee: NAGRAVISION S.A.
    Inventors: Amudha Kaliamoorthi, Prabhu Chawandi, Karthikeyan Srinivasan, Jihyun Park, Jun Seo Lee
  • Patent number: 11692907
    Abstract: Dishwashing appliances and methods, as provided herein, may include features or steps such as measuring a first pressure in a sump with a pressure sensor and storing the first pressure in a memory of the dishwashing appliance as a reference pressure. Dishwashing appliances and methods may further include features or steps for measuring a second pressure within the sump with the pressure sensor after measuring the first pressure, and determining that a check valve is failed when the second pressure exceeds the first pressure by at least a predetermined margin.
    Type: Grant
    Filed: June 25, 2020
    Date of Patent: July 4, 2023
    Assignee: Haier US Appliance Solutions, Inc.
    Inventor: Kyle Edward Durham
  • Patent number: 11689484
    Abstract: The disclosed exemplary embodiments include computer-implemented systems, apparatuses, and processes that dynamically configure and populate a digital interface based on sequential elements of message data exchanged during a chatbot session established programmatically between an apparatus and a device. For example, the apparatus may generate first messaging data that includes a candidate input value for an interface element of a digital interface, and transmit the first messaging data to the device during the programmatically established chatbot session. The apparatus may also receive, from the device during the programmatically established chatbot session, second messaging data that includes a confirmation of the candidate input value. Based on the second messaging data, the apparatus may generate populated interface data that associates the interface element with the confirmed candidate input value, and store the populated interface data within a memory.
    Type: Grant
    Filed: September 18, 2019
    Date of Patent: June 27, 2023
    Assignee: The Toronto-Dominion Bank
    Inventors: Tae Gyun Moon, Robert Alexander McCarter, Kheiver Kayode Roberts
  • Patent number: 11676580
Abstract: An electronic device is provided. The electronic device includes a microphone, and at least one processor operatively connected to the microphone, wherein the at least one processor may include a buffer memory configured to store a first feature vector for a first voice signal obtained from the microphone as an inverse value, and an operation circuit configured to perform a norm operation for the first feature vector and a second feature vector, based on the second feature vector, which is obtained from a second voice signal streamed from the microphone, and on the inverse value of the first feature vector stored in the buffer memory, or to calculate a similarity between the first feature vector and the second feature vector. In addition, various embodiments identified through the specification are possible.
    Type: Grant
    Filed: April 30, 2021
    Date of Patent: June 13, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyunbin Park, Jin Choi
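One way to read this abstract is that the enrolled vector's norm is stored as a precomputed inverse, so the division is not repeated for every streamed comparison. A sketch under that assumption (function names are invented):

```python
import math

def enroll(first_vector):
    # Store the first feature vector together with the inverse of its norm,
    # mirroring the buffer memory that holds the vector "as an inverse value".
    inv_norm = 1.0 / math.sqrt(sum(x * x for x in first_vector))
    return first_vector, inv_norm

def similarity(enrolled, second_vector):
    # Cosine similarity between the enrolled vector and a vector derived
    # from the streamed second voice signal, reusing the cached inverse norm.
    first_vector, inv_norm = enrolled
    dot = sum(x * y for x, y in zip(first_vector, second_vector))
    second_norm = math.sqrt(sum(y * y for y in second_vector))
    return dot * inv_norm / second_norm
```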
  • Patent number: 11670297
    Abstract: The various implementations described herein include methods and systems for determining device leadership among voice interface devices. In one aspect, a method is performed at a first electronic device of a plurality of electronic devices, each having microphones, a speaker, processors, and memory storing programs for execution by the processors. The first device detects a voice input. It determines a device state and a relevance of the voice input. It identifies a subset of electronic devices from the plurality to which the voice input is relevant. In accordance with a determination that the subset includes the first device, the first device determines a first score of a criterion associated with the voice input and receives second scores of the criterion from other devices in the subset. In accordance with a determination that the first score is higher than the second scores, the first device responds to the detected input.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: June 6, 2023
    Assignee: Google LLC
    Inventors: Kenneth Mixter, Diego Melendo Casado, Alexander Houston Gruenstein, Terry Tai, Christopher Thaddeus Hughes, Matthew Nirvan Sharifi
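The leadership decision in the final step reduces to comparing this device's criterion score against the scores received from the other devices in the relevant subset. A minimal sketch (the scoring criterion itself is left abstract, as in the abstract):

```python
def elect_leader(device_scores, device_id):
    # device_scores: device_id -> score of the criterion associated with the
    # voice input, for every device in the relevant subset.
    # Returns True if this device should respond, i.e. its own score is
    # strictly higher than every score received from the other devices.
    my_score = device_scores[device_id]
    others = [s for d, s in device_scores.items() if d != device_id]
    return all(my_score > s for s in others)
```

Because every device in the subset runs the same comparison over the same exchanged scores, exactly one device (the top scorer) answers the voice input.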
  • Patent number: 11663183
    Abstract: A method includes generating from a time-series dataset multiple corresponding time-slice datasets. Each time-slice dataset has a corresponding time-slice time index and includes field-value data strings and associated field-value-time-index data strings, or pointers indicating the corresponding strings in an earlier time-slice dataset, that are the latest in the time-series dataset that are also earlier than the corresponding time-slice time index. A query of the time-series dataset for latest data records earlier than a given query time index is performed by using the time-slice datasets to reduce or eliminate the need to directly access or interrogate the time-series dataset.
    Type: Grant
    Filed: August 23, 2021
    Date of Patent: May 30, 2023
    Assignee: MOONSHADOW MOBILE, INC.
    Inventors: Roy W. Ward, David S. Alavi
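The time-slice scheme can be sketched as below. The abstract describes slices storing data strings or pointers into earlier slices; this sketch uses plain per-slice snapshots instead, and the query is exact only when the query time coincides with a slice time (a full implementation would fall back to the time-series for any gap):

```python
import bisect

def build_time_slices(time_series, slice_times):
    # time_series: list of (time_index, field, value), sorted by time_index.
    # Returns one dict per slice time mapping field -> (value, time_index)
    # for the latest record strictly earlier than that slice time.
    slices = []
    latest = {}
    i = 0
    for t_slice in slice_times:
        while i < len(time_series) and time_series[i][0] < t_slice:
            t, field, value = time_series[i]
            latest[field] = (value, t)
            i += 1
        slices.append(dict(latest))  # snapshot at this slice time
    return slices

def query_latest(slices, slice_times, query_time):
    # Answer "latest records earlier than query_time" from the newest slice
    # at or before query_time, without rescanning the time-series itself.
    pos = bisect.bisect_right(slice_times, query_time) - 1
    return slices[pos] if pos >= 0 else {}
```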