Patents Examined by Shreyans A Patel

Automatic speaker identification using speech recognition features

Patent number: 11900948

Abstract: Features are disclosed for automatically identifying a speaker. Artifacts of automatic speech recognition (“ASR”) and/or other automatically determined information may be processed against individual user profiles or models. Scores may be determined reflecting the likelihood that individual users made an utterance. The scores can be based on, e.g., individual components of Gaussian mixture models (“GMMs”) that score best for frames of audio data of an utterance. A user associated with the highest likelihood score for a particular utterance can be identified as the speaker of the utterance. Information regarding the identified user can be provided to components of a spoken language processing system, separate applications, etc.

Type: Grant

Filed: January 7, 2022

Date of Patent: February 13, 2024

Assignee: Amazon Technologies, Inc.

Inventors: Hugh Evan Secker-Walker, Baiyang Liu, Frederick Victor Weber
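
Illustrative sketch (not taken from the patent): the abstract above describes scoring an utterance against per-user models using the best-scoring GMM component for each frame, then picking the user with the highest score. The code below is one minimal way to express that idea, assuming one-dimensional frame features and GMMs given as lists of (weight, mean, variance) tuples. The names `component_log_score`, `utterance_score`, and `identify_speaker` are hypothetical, and real systems would use multi-dimensional acoustic features.

```python
import math


def component_log_score(x, weight, mean, var):
    """Log of weight * N(x; mean, var) for one 1-D Gaussian component."""
    return math.log(weight) - 0.5 * (
        math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var
    )


def utterance_score(frames, gmm):
    """Sum, over frames, of the best-scoring component's log score.

    `gmm` is a list of (weight, mean, variance) tuples (assumed layout).
    """
    return sum(
        max(component_log_score(x, w, m, v) for (w, m, v) in gmm)
        for x in frames
    )


def identify_speaker(frames, profiles):
    """Return the user whose model gives the highest utterance score."""
    return max(profiles, key=lambda user: utterance_score(frames, profiles[user]))


# Hypothetical user profiles, each a two-component 1-D GMM.
profiles = {
    "alice": [(0.5, 0.0, 1.0), (0.5, 5.0, 1.0)],
    "bob": [(0.5, 10.0, 1.0), (0.5, 15.0, 1.0)],
}
```

Calling `identify_speaker(frames, profiles)` with frames near 0 or 5 selects "alice", and frames near 10 or 15 select "bob".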
Compressor target curve to avoid boosting noise

Patent number: 11894006

Abstract: The processing of audio signals during playback is provided, so that audio signals that fall below a specified threshold loudness level are processed to avoid making unwanted background noise audible. N-channel audio is received from a playback volume controller/leveler (101). The level of the audio is compared with a threshold level. If the level is greater than the threshold level, the audio is processed with a first amount of gain in accordance with a first dynamic range control (DRC) compression curve that is tuned for professionally produced audio. If the level is less than or equal to the threshold level, the audio is processed with a second amount of gain in accordance with a second DRC compression curve that is designed to avoid boosting unwanted background noise. After applying the gain to the audio, the audio is sent to a downstream device.

Type: Grant

Filed: July 18, 2019

Date of Patent: February 6, 2024

Assignee: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Zhongjin Wang, Andrew Peter Reilly, Michael William Mason
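
Illustrative sketch (not taken from the patent): the abstract above describes comparing the audio level with a threshold and choosing between two DRC compression curves. The code below shows that branching under stated assumptions. The curve shapes, the -50 dB threshold, and the names `professional_curve`, `noise_safe_curve`, `gain_for_level`, and `apply_gain` are all hypothetical placeholders, not values or APIs from the patent.

```python
def professional_curve(level_db):
    """Hypothetical curve: boost toward a -20 dB target, capped at 12 dB."""
    return max(0.0, min(12.0, -20.0 - level_db))


def noise_safe_curve(level_db):
    """Hypothetical curve: apply no boost, so background noise is not raised."""
    return 0.0


def gain_for_level(level_db, threshold_db=-50.0):
    """Pick the gain (in dB) from one of two curves based on the level."""
    if level_db > threshold_db:
        return professional_curve(level_db)
    # Level is less than or equal to the threshold.
    return noise_safe_curve(level_db)


def apply_gain(samples, gain_db):
    """Scale linear samples by a gain expressed in decibels."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in samples]
```

With these placeholder curves, a level of -30 dB receives a 10 dB boost, while a level at or below the -50 dB threshold receives none.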
Media playback system with concurrent voice assistance

Patent number: 11893308

Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, an NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD forgoes selection of the cloud-based VAS when the local voice assistant is selected.

Type: Grant

Filed: March 28, 2022

Date of Patent: February 6, 2024

Assignee: Sonos, Inc.

Inventors: Dayn Wilberding, John Tolomei
Biasing voice correction suggestions

Patent number: 11881207

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a voice input from a user device; generating a recognition output; receiving a user selection of one or more terms in the recognition output; receiving a user input of one or more letters replacing the user selected one or more terms; determining suggested correction candidates based in part on the user input and the voice input; and providing one or more suggested correction candidates to the user device as suggested corrected recognition outputs.

Type: Grant

Filed: March 23, 2022

Date of Patent: January 23, 2024

Assignee: Google LLC

Inventors: Evgeny A. Cherepanov, Jakob Nicolaus Foerster, Vikram Sridar, Ishai Rabinovitz, Omer Tabach
Input device activation noise suppression

Patent number: 11875811

Abstract: A method includes receiving sound input features representative of sound received during an electronic conference, the sound including voice and input device activation sound, receiving an input event feature indicative of the input device activation, and processing the received sound input features and input event feature via a trained model to identify a stored spectral file to be subtracted from the received sound to suppress the input device activation sound.

Type: Grant

Filed: December 9, 2021

Date of Patent: January 16, 2024

Assignee: Lenovo (United States) Inc.

Inventors: Scott Wentao Li, Robert J. Kapinos, Robert James Norton, Jr., Russell Speight Vanblon
Information processing method, estimation model construction method, information processing device, and estimation model constructing device

Patent number: 11875777

Abstract: An information processing device includes a memory storing instructions, and a processor configured to implement the stored instructions to execute a plurality of tasks. The tasks include: a first generating task that generates a series of fluctuations of a target sound based on first control data of the target sound to be synthesized, using a first model trained to have an ability to estimate a series of fluctuations of the target sound based on first control data of the target sound, and a second generating task that generates a series of features of the target sound based on second control data of the target sound and the generated series of fluctuations of the target sound, using a second model trained to estimate a series of features of the target sound based on second control data of the target sound and a series of fluctuations of the target sound.

Type: Grant

Filed: March 18, 2022

Date of Patent: January 16, 2024

Assignee: YAMAHA CORPORATION

Inventor: Ryunosuke Daido
Unsupervised alignment for text to speech synthesis using neural networks

Patent number: 11869483

Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.

Type: Grant

Filed: October 7, 2021

Date of Patent: January 9, 2024

Assignee: Nvidia Corporation

Inventors: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro
Fulfillment of actionable requests ahead of a user selecting a particular autocomplete suggestion for completing a current user input

Patent number: 11868719

Abstract: Implementations set forth herein relate to providing selectable autofill suggestions, which correspond to application actions that are at least partially fulfilled using server command data—prior to a user selecting a particular selectable autofill suggestion. Proactively fulfilling command data in this way mitigates latency between user selection of a suggestion and fulfillment of a particular action. Initially, a partial input can be processed to generate autofill suggestions, which can be communicated to a server device for further processing. The autofill suggestions can also be rendered for selection at a touch display interface, thereby allowing a user to select one of the autofill suggestions. As command fulfillment data is provided by the server, the command fulfillment data can be available to a corresponding application(s) in order that any corresponding actions can be at least partially fulfilled prior to user selection.

Type: Grant

Filed: January 13, 2023

Date of Patent: January 9, 2024

Assignee: GOOGLE LLC

Inventor: Keun Soo Yim
Processing service notes

Patent number: 11861309

Abstract: Example techniques for processing service notes are described. In an example, labeled service notes, associated with fuser units of a plurality of image rendering devices, are processed to generate a vector corresponding to each of the labeled service notes, a labeled service note comprising natural language text describing an error event and a corresponding service activity associated with a fuser unit, wherein the labeled service note is assigned a label based on a category of failure of the fuser unit. Based on the processing, a relationship between vectors and labels corresponding to the labeled service notes is generated.

Type: Grant

Filed: November 25, 2019

Date of Patent: January 2, 2024

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Anton Wiranata, Niranjan Damera Venkata, Prasad Hegde, Aravindakshan B
Always-on audio control for mobile device

Patent number: 11862173

Abstract: In an embodiment, an integrated circuit may include one or more CPUs, a memory controller, and a circuit configured to remain powered on when the rest of the SOC is powered down. The circuit may be configured to receive audio samples from a microphone, and match those audio samples against a predetermined pattern to detect a possible command from a user of the device that includes the SOC. In response to detecting the predetermined pattern, the circuit may cause the memory controller to power up so that audio samples may be stored in the memory to which the memory controller is coupled. The circuit may also cause the CPUs to be powered on and initialized, and the operating system (OS) may boot. During the time that the CPUs are initializing and the OS is booting, the circuit and the memory may be capturing the audio samples.

Type: Grant

Filed: May 27, 2021

Date of Patent: January 2, 2024

Assignee: Apple Inc.

Inventors: Timothy J. Millet, Manu Gulati, Michael F. Culbert
Electronic device and method for controlling thereof

Patent number: 11848004

Abstract: A method for controlling an electronic device includes obtaining a text, obtaining, by inputting the text into a first neural network model, acoustic feature information corresponding to the text and alignment information in which each frame of the acoustic feature information is matched with each phoneme included in the text, identifying an utterance speed of the acoustic feature information based on the alignment information, identifying a reference utterance speed for each phoneme included in the acoustic feature information based on the text and the acoustic feature information, obtaining utterance speed adjustment information based on the utterance speed of the acoustic feature information and the reference utterance speed for each phoneme, and obtaining, based on the utterance speed adjustment information, speech data corresponding to the text by inputting the acoustic feature information into a second neural network model.

Type: Grant

Filed: June 27, 2022

Date of Patent: December 19, 2023

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Sangjun Park, Kihyun Choo
Expandable dialogue system

Patent number: 11842724

Abstract: A method for training a dialogue learning model includes presenting, via a user interface of a computing device, an utterance and a list of actions based on the utterance. A selection of an action from the list of actions is received via the user interface. A designated span of the utterance is received via the user interface. The selected action and the designated span of the utterance are provided to a computing system for training the dialogue learning model.

Type: Grant

Filed: December 6, 2021

Date of Patent: December 12, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Percy Shuo Liang, David Leo Wright Hall, Joshua James Clausman
Predicting spectral representations for training speech synthesis neural networks

Patent number: 11830475

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network to perform speech synthesis. One of the methods includes obtaining a training data set for training a first neural network to process a spectral representation of an audio sample and to generate a prediction of the audio sample, wherein, after training, the first neural network obtains spectral representations of audio samples from a second neural network; for a plurality of audio samples in the training data set: generating a ground-truth spectral representation of the audio sample; and processing the ground-truth spectral representation using a third neural network to generate an updated spectral representation of the audio sample; and training the first neural network using the updated spectral representations, wherein the third neural network is configured to generate updated spectral representations that resemble spectral representations generated by the second neural network.

Type: Grant

Filed: June 1, 2022

Date of Patent: November 28, 2023

Assignee: DeepMind Technologies Limited

Inventor: Norman Casagrande
Learned condition text-to-speech synthesis

Patent number: 11830476

Abstract: Devices and techniques are generally described for learned condition text-to-speech synthesis. In some examples, first data representing a selection of a type of prosodic expressivity may be received. In some further examples, a selection of content comprising text data may be received. First audio data may be determined that includes an audio representation of the text data. The first audio data may be generated based at least in part on sampling from a first latent distribution generated using a conditional primary variational autoencoder (VAE). The sampling from the first latent distribution may be conditioned on a first learned distribution associated with the type of prosodic expressivity. In various examples, the first audio data may be sent to a first computing device.

Type: Grant

Filed: June 8, 2021

Date of Patent: November 28, 2023

Assignee: AMAZON TECHNOLOGIES, INC.

Inventors: Panagiota Karanasou, Sri Vishnu Kumar Karlapati, Alexis Pierre Moinet, Arnaud Vincent Pierre Yves Joly, Syed Ammar Abbas, Thomas Renaud Drugman, Jaime Lorenzo Trueba
Interaction with a portion of a content item through a virtual assistant

Patent number: 11823677

Abstract: Techniques for interacting with a portion of a content item through a virtual assistant are described herein. The techniques may include identifying a portion of a content item that is relevant to user input and causing an action to be performed related to the portion of the content item. The action may include, for example, displaying the portion of the content item on a smart device in a displayable format that is adapted to a display characteristic of the smart device, performing a task for a user that satisfies the user input, and so on.

Type: Grant

Filed: December 13, 2021

Date of Patent: November 21, 2023

Assignee: VERINT AMERICAS INC.

Inventors: Fred A. Brown, Tanya M. Miller
Method and an electronic device for processing a waveform

Patent number: 11817116

Abstract: A method and electronic device for processing a waveform are disclosed. The waveform is representative of bodily sounds. The method includes acquiring the waveform from the sound recording component and having a low-frequency component and a high-frequency component, selecting a target moving averaging filter amongst a first moving averaging filter and a second moving averaging filter for filtering the waveform. The first moving averaging filter is to be used for preserving the low-frequency component of the waveform, and the second moving averaging filter is to be used for preserving the high-frequency component of the waveform. The method includes applying the target moving averaging filter on the waveform for reducing noise in the waveform, thereby generating a second waveform.

Type: Grant

Filed: December 21, 2022

Date of Patent: November 14, 2023

Assignee: SPARROW ACOUSTICS INC.

Inventors: Yaroslav Shpak, Maksim Davydov
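
Illustrative sketch (not taken from the patent): the abstract above describes selecting one of two moving averaging filters and applying it to reduce noise. The code below assumes, for illustration only, that the two filters differ in window length: a longer window smooths more and keeps mostly the low-frequency content, while a shorter window retains more high-frequency detail. The names `select_window` and `moving_average` and the window lengths 9 and 3 are hypothetical.

```python
def select_window(preserve, long_window=9, short_window=3):
    """Choose a window length depending on which component to preserve."""
    if preserve == "low":
        return long_window
    if preserve == "high":
        return short_window
    raise ValueError("preserve must be 'low' or 'high'")


def moving_average(samples, window):
    """Causal moving average; early outputs average the samples seen so far."""
    out = []
    running = 0.0
    for i, s in enumerate(samples):
        running += s
        if i >= window:
            # Drop the sample that just left the window.
            running -= samples[i - window]
        out.append(running / min(i + 1, window))
    return out


def denoise(samples, preserve):
    """Apply the selected moving averaging filter to produce a second waveform."""
    return moving_average(samples, select_window(preserve))
```

The output has the same length as the input, so the filtered waveform stays aligned with the original recording.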
Virtual assistant host platform configured for interactive voice response simulation

Patent number: 11810565

Abstract: Aspects of the disclosure relate to using machine learning to simulate an interactive voice response system. A computing platform may receive user interaction information corresponding to interactions between a user and enterprise computing devices. Based on the user interaction information, the computing platform may identify predicted intents for the user, and may generate hotkey information based on the predicted intents. The computing platform may send the hotkey information and commands directing the mobile device to output the hotkey information. The computing platform may receive hotkey input information from the mobile device. Based on the hotkey input information, the computing platform may generate a hotkey response message. The computing platform may send, to the mobile device, the hotkey response message and commands directing the mobile device to convert the hotkey response message to an audio output and to output the audio output.

Type: Grant

Filed: July 27, 2022

Date of Patent: November 7, 2023

Assignee: Bank of America Corporation

Inventors: Srinivas Dundigalla, Pavan Chayanam, Saurabh Mehta
Agent device, agent system, and recording medium

Patent number: 11809885

Abstract: An agent device receives input information that is input by the user, in a case in which the input information is a question from the user, executes inference processing on the input information to infer an intent of the question in order to acquire a response to the question based on the intent, in a case in which a plurality of the responses are acquired, provides the notification device with option information that includes the plurality of responses as options, in a case in which new input information is received, determines whether the new input information is information requiring the inference processing or is selection information relating to a selection result from selection of the options, and in a case in which the new input information is the selection information, provides the notification device with response information regarding the response associated with the selection result without executing the inference processing.

Type: Grant

Filed: January 26, 2021

Date of Patent: November 7, 2023

Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA

Inventors: Eiichi Maeda, Chikage Kubo, Keiko Nakano, Hiroyuki Nishizawa
Allowing spelling of arbitrary words

Patent number: 11797763

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.

Type: Grant

Filed: July 24, 2021

Date of Patent: October 24, 2023

Assignee: Google LLC

Inventors: Evgeny A. Cherepanov, Gleb Skobeltsyn, Jakob Nicolaus Foerster, Petar Aleksic, Assaf Avner Hurwitz Michaely
Systems and methods for synthesizing speech

Patent number: 11798527

Abstract: The present disclosure discloses a method for synthesizing a speech. The method includes generating the speech based on a text with a speech synthesis model, wherein the speech synthesis model includes an embedding layer, a speech synthesis layer, and a position layer; and training the speech synthesis model when an evaluation index meets a preset condition, wherein the evaluation index includes one or more quality indexes determined based on at least a part of the text and at least a part of the speech.

Type: Grant

Filed: August 18, 2021

Date of Patent: October 24, 2023

Assignee: ZHEJIANG TONGHU ASHUN INTELLIGENT TECHNOLOGY CO., LTD.

Inventors: Peng Zhang, Xinhui Hu, Xinkang Xu, Jian Lu
