Patents Examined by Shreyans A Patel
  • Patent number: 11900948
    Abstract: Features are disclosed for automatically identifying a speaker. Artifacts of automatic speech recognition (“ASR”) and/or other automatically determined information may be processed against individual user profiles or models. Scores may be determined reflecting the likelihood that individual users made an utterance. The scores can be based on, e.g., individual components of Gaussian mixture models (“GMMs”) that score best for frames of audio data of an utterance. A user associated with the highest likelihood score for a particular utterance can be identified as the speaker of the utterance. Information regarding the identified user can be provided to components of a spoken language processing system, separate applications, etc.
    Type: Grant
    Filed: January 7, 2022
    Date of Patent: February 13, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Hugh Evan Secker-Walker, Baiyang Liu, Frederick Victor Weber
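The scoring idea in the abstract for patent 11900948 can be sketched roughly as follows. This is a minimal illustration, not the patented method: it assumes diagonal-covariance GMMs stored as hypothetical `(weight, mean, variance)` tuples, scores each frame by its best-scoring component, and sums those scores per user. The function names and the toy profiles are invented for the example.

```python
import math


def component_log_score(frame, weight, mean, var):
    """Log of weight times a diagonal-covariance Gaussian density for one frame."""
    log_p = math.log(weight)
    for x, m, v in zip(frame, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return log_p


def utterance_score(frames, gmm):
    """Sum over frames of the best-scoring component's log score."""
    return sum(
        max(component_log_score(f, w, m, v) for w, m, v in gmm)
        for f in frames
    )


def identify_speaker(frames, profiles):
    """Return the user whose GMM yields the highest utterance score."""
    return max(profiles, key=lambda user: utterance_score(frames, profiles[user]))


# Hypothetical two-component GMMs per user: (weight, mean, variance).
profiles = {
    "alice": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "bob": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
frames = [[5.2, 4.9], [5.8, 6.1], [5.0, 5.3]]  # toy feature frames
speaker = identify_speaker(frames, profiles)
```

A real system would derive the frames from acoustic features and would likely normalize scores across utterance lengths; neither detail is stated in the abstract.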
  • Patent number: 11894006
    Abstract: The processing of audio signals during playback is provided, so that audio signals that fall below a specified threshold loudness level are processed to avoid making unwanted background noise audible. N-channel audio is received from a playback volume controller/leveler (101). The level of the audio is compared with a threshold level. If the level is greater than the threshold level, the audio is processed with a first amount of gain in accordance with a first dynamic range control (DRC) compression curve that is tuned for professionally produced audio. If the level is less than or equal to the threshold level, the audio is processed with a second amount of gain in accordance with a second DRC compression curve that is designed to avoid boosting unwanted background noise. After applying the gain to the audio, the audio is sent to a downstream device.
    Type: Grant
    Filed: July 18, 2019
    Date of Patent: February 6, 2024
    Assignee: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Zhongjin Wang, Andrew Peter Reilly, Michael William Mason
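The threshold comparison in the abstract for patent 11894006 might look something like the sketch below. The curve shape, the threshold of -50 dB, the -24 dB target, and the boost caps are all assumptions chosen for illustration; the abstract only says that audio above the threshold uses one DRC curve and audio at or below it uses a second curve that avoids boosting background noise.

```python
def curve_gain_db(level_db, target_db, max_boost_db):
    """Hypothetical DRC curve: move the level toward a target, capping any boost."""
    return min(target_db - level_db, max_boost_db)


def select_gain_db(level_db, threshold_db=-50.0):
    """Pick a gain from one of two curves depending on the measured level."""
    if level_db > threshold_db:
        # First curve: assumed tuning for professionally produced audio.
        return curve_gain_db(level_db, target_db=-24.0, max_boost_db=12.0)
    # Second curve: never boosts, so quiet background noise stays quiet.
    return curve_gain_db(level_db, target_db=-24.0, max_boost_db=0.0)


def apply_gain(samples, gain_db):
    """Scale linear samples by a gain expressed in decibels."""
    scale = 10 ** (gain_db / 20)
    return [s * scale for s in samples]


quiet_gain = select_gain_db(-60.0)   # at or below threshold: no boost
normal_gain = select_gain_db(-30.0)  # above threshold: boosted toward target
processed = apply_gain([0.5, -0.25], quiet_gain)
```

Capping the second curve's boost at 0 dB is one simple way to keep low-level noise from becoming audible; an actual curve would presumably be smoother.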
  • Patent number: 11893308
    Abstract: Example techniques involve invoking voice assistance for a media playback system. In some embodiments, an NMD stores in memory a set of command information comprising a listing of playback commands and associated command criteria. The NMD captures a voice input and detects inclusion, within the voice input, of one or more particular playback commands from among the playback commands in the listing. In response, the NMD selects a local voice assistant that supports (a) one or more additional playback commands relative to a cloud-based VAS and (b) fewer non-playback commands relative to the cloud-based VAS, determines, via the local voice assistant, an intent in the captured voice input, and performs a response to the determined intent. The NMD forgoes selection of the cloud-based VAS when the local voice assistant is selected.
    Type: Grant
    Filed: March 28, 2022
    Date of Patent: February 6, 2024
    Assignee: Sonos, Inc.
    Inventors: Dayn Wilberding, John Tolomei
  • Patent number: 11881207
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a voice input from a user device; generating a recognition output; receiving a user selection of one or more terms in the recognition output; receiving a user input of one or more letters replacing the user selected one or more terms; determining suggested correction candidates based in part on the user input and the voice input; and providing one or more suggested correction candidates to the user device as suggested corrected recognition outputs.
    Type: Grant
    Filed: March 23, 2022
    Date of Patent: January 23, 2024
    Assignee: Google LLC
    Inventors: Evgeny A. Cherepanov, Jakob Nicolaus Foerster, Vikram Sridar, Ishai Rabinovitz, Omer Tabach
  • Patent number: 11875811
    Abstract: A method includes receiving sound input features representative of sound received during an electronic conference, the sound including voice and input device activation sound, receiving an input event feature indicative of the input device activation, and processing the received sound input features and input event feature via a trained model to identify a stored spectral file to be subtracted from the received sound to suppress the input device activation sound.
    Type: Grant
    Filed: December 9, 2021
    Date of Patent: January 16, 2024
    Assignee: Lenovo (United States) Inc.
    Inventors: Scott Wentao Li, Robert J. Kapinos, Robert James Norton, Jr., Russell Speight Vanblon
  • Patent number: 11875777
    Abstract: An information processing device includes a memory storing instructions, and a processor configured to implement the stored instructions to execute a plurality of tasks. The tasks include: a first generating task that generates a series of fluctuations of a target sound based on first control data of the target sound to be synthesized, using a first model trained to have an ability to estimate a series of fluctuations of the target sound based on first control data of the target sound, and a second generating task that generates a series of features of the target sound based on second control data of the target sound and the generated series of fluctuations of the target sound, using a second model trained to estimate a series of features of the target sound based on second control data of the target sound and a series of fluctuations of the target sound.
    Type: Grant
    Filed: March 18, 2022
    Date of Patent: January 16, 2024
    Assignee: YAMAHA CORPORATION
    Inventor: Ryunosuke Daido
  • Patent number: 11869483
    Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.
    Type: Grant
    Filed: October 7, 2021
    Date of Patent: January 9, 2024
    Assignee: Nvidia Corporation
    Inventors: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro
  • Patent number: 11868719
    Abstract: Implementations set forth herein relate to providing selectable autofill suggestions, which correspond to application actions that are at least partially fulfilled using server command data—prior to a user selecting a particular selectable autofill suggestion. Proactively fulfilling command data in this way mitigates latency between user selection of a suggestion and fulfillment of a particular action. Initially, a partial input can be processed to generate autofill suggestions, which can be communicated to a server device for further processing. The autofill suggestions can also be rendered for selection at a touch display interface, thereby allowing a user to select one of the autofill suggestions. As command fulfillment data is provided by the server, the command fulfillment data can be available to a corresponding application(s) in order that any corresponding actions can be at least partially fulfilled prior to user selection.
    Type: Grant
    Filed: January 13, 2023
    Date of Patent: January 9, 2024
    Assignee: GOOGLE LLC
    Inventor: Keun Soo Yim
  • Patent number: 11861309
    Abstract: Example techniques for processing service notes are described. In an example, labeled service notes, associated with fuser units of a plurality of image rendering devices, are processed to generate a vector corresponding to each of the labeled service notes, a labeled service note comprising natural language text describing an error event and a corresponding service activity associated with a fuser unit, wherein the labeled service note is assigned a label based on a category of failure of the fuser unit. Based on the processing, a relationship between vectors and labels corresponding to the labeled service notes is generated.
    Type: Grant
    Filed: November 25, 2019
    Date of Patent: January 2, 2024
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Anton Wiranata, Niranjan Damera Venkata, Prasad Hegde, Aravindakshan B
  • Patent number: 11862173
    Abstract: In an embodiment, an integrated circuit may include one or more CPUs, a memory controller, and a circuit configured to remain powered on when the rest of the SOC is powered down. The circuit may be configured to receive audio samples from a microphone, and match those audio samples against a predetermined pattern to detect a possible command from a user of the device that includes the SOC. In response to detecting the predetermined pattern, the circuit may cause the memory controller to power up so that audio samples may be stored in the memory to which the memory controller is coupled. The circuit may also cause the CPUs to be powered on and initialized, and the operating system (OS) may boot. During the time that the CPUs are initializing and the OS is booting, the circuit and the memory may be capturing the audio samples.
    Type: Grant
    Filed: May 27, 2021
    Date of Patent: January 2, 2024
    Assignee: Apple Inc.
    Inventors: Timothy J. Millet, Manu Gulati, Michael F. Culbert
  • Patent number: 11848004
    Abstract: A method for controlling an electronic device includes obtaining a text, obtaining, by inputting the text into a first neural network model, acoustic feature information corresponding to the text and alignment information in which each frame of the acoustic feature information is matched with each phoneme included in the text, identifying an utterance speed of the acoustic feature information based on the alignment information, identifying a reference utterance speed for each phoneme included in the acoustic feature information based on the text and the acoustic feature information, obtaining utterance speed adjustment information based on the utterance speed of the acoustic feature information and the reference utterance speed for each phoneme, and obtaining, based on the utterance speed adjustment information, speech data corresponding to the text by inputting the acoustic feature information into a second neural network model.
    Type: Grant
    Filed: June 27, 2022
    Date of Patent: December 19, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Sangjun Park, Kihyun Choo
  • Patent number: 11842724
    Abstract: A method for training a dialogue learning model includes presenting, via a user interface of a computing device, an utterance and a list of actions based on the utterance. A selection of an action from the list of actions is received via the user interface. A designated span of the utterance is received via the user interface. The selected action and the designated span of the utterance are provided to a computing system for training the dialogue learning model.
    Type: Grant
    Filed: December 6, 2021
    Date of Patent: December 12, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Percy Shuo Liang, David Leo Wright Hall, Joshua James Clausman
  • Patent number: 11830475
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network to perform speech synthesis. One of the methods includes obtaining a training data set for training a first neural network to process a spectral representation of an audio sample and to generate a prediction of the audio sample, wherein, after training, the first neural network obtains spectral representations of audio samples from a second neural network; for a plurality of audio samples in the training data set: generating a ground-truth spectral representation of the audio sample; and processing the ground-truth spectral representation using a third neural network to generate an updated spectral representation of the audio sample; and training the first neural network using the updated spectral representations, wherein the third neural network is configured to generate updated spectral representations that resemble spectral representations generated by the second neural network.
    Type: Grant
    Filed: June 1, 2022
    Date of Patent: November 28, 2023
    Assignee: DeepMind Technologies Limited
    Inventor: Norman Casagrande
  • Patent number: 11830476
    Abstract: Devices and techniques are generally described for learned condition text-to-speech synthesis. In some examples, first data representing a selection of a type of prosodic expressivity may be received. In some further examples, a selection of content comprising text data may be received. First audio data may be determined that includes an audio representation of the text data. The first audio data may be generated based at least in part on sampling from a first latent distribution generated using a conditional primary variational autoencoder (VAE). The sampling from the first latent distribution may be conditioned on a first learned distribution associated with the type of prosodic expressivity. In various examples, the first audio data may be sent to a first computing device.
    Type: Grant
    Filed: June 8, 2021
    Date of Patent: November 28, 2023
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Panagiota Karanasou, Sri Vishnu Kumar Karlapati, Alexis Pierre Moinet, Arnaud Vincent Pierre Yves Joly, Syed Ammar Abbas, Thomas Renaud Drugman, Jaime Lorenzo Trueba
  • Patent number: 11823677
    Abstract: Techniques for interacting with a portion of a content item through a virtual assistant are described herein. The techniques may include identifying a portion of a content item that is relevant to user input and causing an action to be performed related to the portion of the content item. The action may include, for example, displaying the portion of the content item on a smart device in a displayable format that is adapted to a display characteristic of the smart device, performing a task for a user that satisfies the user input, and so on.
    Type: Grant
    Filed: December 13, 2021
    Date of Patent: November 21, 2023
    Assignee: VERINT AMERICAS INC.
    Inventors: Fred A. Brown, Tanya M. Miller
  • Patent number: 11817116
    Abstract: A method and electronic device for processing a waveform are disclosed. The waveform is representative of bodily sounds. The method includes acquiring the waveform from a sound recording component, the waveform having a low-frequency component and a high-frequency component, and selecting a target moving averaging filter amongst a first moving averaging filter and a second moving averaging filter for filtering the waveform. The first moving averaging filter is to be used for preserving the low-frequency component of the waveform, and the second moving averaging filter is to be used for preserving the high-frequency component of the waveform. The method includes applying the target moving averaging filter on the waveform for reducing noise in the waveform, thereby generating a second waveform.
    Type: Grant
    Filed: December 21, 2022
    Date of Patent: November 14, 2023
    Assignee: SPARROW ACOUSTICS INC.
    Inventors: Yaroslav Shpak, Maksim Davydov
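The filter selection in the abstract for patent 11817116 could be sketched as below. The abstract does not say how the two moving averaging filters differ or how the target is chosen, so this example assumes they differ only in window length (a longer window when the low-frequency component is to be preserved, a shorter one for the high-frequency component) and that the caller passes the choice in a hypothetical `preserve` argument.

```python
def moving_average(waveform, window):
    """Centered moving average; samples near the edges use a shorter window."""
    half = window // 2
    smoothed = []
    for i in range(len(waveform)):
        lo = max(0, i - half)
        hi = min(len(waveform), i + half + 1)
        smoothed.append(sum(waveform[lo:hi]) / (hi - lo))
    return smoothed


def denoise(waveform, preserve="low", long_window=9, short_window=3):
    """Select one of two moving-average filters and apply it to the waveform."""
    # Assumption: a long window keeps slow (low-frequency) content,
    # while a short window leaves more fast (high-frequency) content intact.
    window = long_window if preserve == "low" else short_window
    return moving_average(waveform, window)


recording = [0.0, 3.0, 0.0, 3.0, 0.0, 3.0, 0.0]  # toy samples
second_waveform = denoise(recording, preserve="high")
```

The output has the same length as the input, which keeps the filtered waveform aligned in time with the original recording.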
  • Patent number: 11810565
    Abstract: Aspects of the disclosure relate to using machine learning to simulate an interactive voice response system. A computing platform may receive user interaction information corresponding to interactions between a user and enterprise computing devices. Based on the user interaction information, the computing platform may identify predicted intents for the user, and may generate hotkey information based on the predicted intents. The computing platform may send the hotkey information and commands directing the mobile device to output the hotkey information. The computing platform may receive hotkey input information from the mobile device. Based on the hotkey input information, the computing platform may generate a hotkey response message. The computing platform may send, to the mobile device, the hotkey response message and commands directing the mobile device to convert the hotkey response message to an audio output and to output the audio output.
    Type: Grant
    Filed: July 27, 2022
    Date of Patent: November 7, 2023
    Assignee: Bank of America Corporation
    Inventors: Srinivas Dundigalla, Pavan Chayanam, Saurabh Mehta
  • Patent number: 11809885
    Abstract: An agent device receives input information that is input by the user, in a case in which the input information is a question from the user, executes inference processing on the input information to infer an intent of the question in order to acquire a response to the question based on the intent, in a case in which a plurality of the responses are acquired, provides the notification device with option information that includes the plurality of responses as options, in a case in which new input information is received, determines whether the new input information is information requiring the inference processing or is selection information relating to a selection result from selection of the options, and in a case in which the new input information is the selection information, provides the notification device with response information regarding the response associated with the selection result without executing the inference processing.
    Type: Grant
    Filed: January 26, 2021
    Date of Patent: November 7, 2023
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Eiichi Maeda, Chikage Kubo, Keiko Nakano, Hiroyuki Nishizawa
  • Patent number: 11797763
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.
    Type: Grant
    Filed: July 24, 2021
    Date of Patent: October 24, 2023
    Assignee: Google LLC
    Inventors: Evgeny A. Cherepanov, Gleb Skobeltsyn, Jakob Nicolaus Foerster, Petar Aleksic, Assaf Avner Hurwitz Michaely
  • Patent number: 11798527
    Abstract: The present disclosure discloses a method for synthesizing a speech. The method includes generating the speech based on a text with a speech synthesis model, wherein the speech synthesis model includes an embedding layer, a speech synthesis layer, and a position layer; and training the speech synthesis model when an evaluation index meets a preset condition, wherein the evaluation index includes one or more quality indexes determined based on at least a part of the text and at least a part of the speech.
    Type: Grant
    Filed: August 18, 2021
    Date of Patent: October 24, 2023
    Assignee: ZHEJIANG TONGHU ASHUN INTELLIGENT TECHNOLOGY CO., LTD.
    Inventors: Peng Zhang, Xinhui Hu, Xinkang Xu, Jian Lu