Patents Examined by Jakieda R Jackson
  • Patent number: 11900949
    Abstract: A neural network input unit 81 inputs a neural network in which a first network having a layer for inputting an anchor signal belonging to a predetermined class and a mixed signal including a target signal belonging to the class and a layer for outputting, as an estimation result, a reconstruction mask indicating a time-frequency domain in which the target signal is present in the mixed signal, and a second network having a layer for inputting the target signal extracted by applying the mixed signal to the reconstruction mask and a layer for outputting a result obtained by classifying the input target signal into a predetermined class are combined. A reconstruction mask estimation unit 82 applies the anchor signal and mixed signal to the first network to estimate the reconstruction mask of the class to which the anchor signal belongs.
    Type: Grant
    Filed: May 28, 2019
    Date of Patent: February 13, 2024
    Assignee: NEC CORPORATION
    Inventors: Takafumi Koshinaka, Hitoshi Yamamoto, Kaoru Koida, Takayuki Suzuki
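The core operation this abstract describes, extracting a target signal by applying a time-frequency reconstruction mask to a mixed signal, can be sketched in a few lines. This is a minimal NumPy illustration only: the random mask stands in for the output of the patent's first network, and the shapes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (frequency_bins, time_frames) complex spectrogram of the mixed signal.
mixed_spectrogram = rng.random((257, 100)) + 1j * rng.random((257, 100))

# A reconstruction mask marks the time-frequency cells where the target
# signal is present. Here it is random; in the patent it is estimated by
# the first network from the anchor signal and the mixed signal.
reconstruction_mask = rng.random((257, 100))  # values in [0, 1)

# Extract the target by element-wise application of the mask.
target_spectrogram = reconstruction_mask * mixed_spectrogram
```

The extracted target spectrogram would then be fed to the second network for classification into the anchor's class.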
  • Patent number: 11901904
    Abstract: A digitally controlled oscillator (100), a synthesizer module (200), a synthesizer (300), and a method for producing an electrical audio signal are presented. The oscillator (100) comprises a digital processing unit (10) configured to generate a first pulse wave at a first output (PulseUp) of the processing unit (10), wherein the first pulse wave is arranged to include pulses at at least two different frequencies. The oscillator (100) further comprises a summing circuit (30) and a linear wave shaper (20). The output (PulseUp) of the processing unit (10) is connected to the summing circuit (30) which is arranged to produce a resultant signal based on at least the first pulse wave. The resultant signal is arranged to be fed into the linear wave shaper (20) which is arranged to produce an output signal at the output (OUT) of the oscillator (100) based on modifying the resultant signal.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: February 13, 2024
    Assignee: SUPERCRITICAL OY
    Inventor: Timo Alho
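The signal chain described above (pulse wave at two frequencies, summing, linear wave shaping) can be sketched digitally, though the patent concerns an oscillator circuit. Everything below is an assumption for illustration: the pulse frequencies, the DC-removal step standing in for the summing circuit, and a leaky integrator standing in for the linear wave shaper.

```python
import numpy as np

sample_rate = 48000
n = 4800  # 0.1 s of samples

# First pulse wave: a train containing pulses at two different frequencies
# (100 Hz and 150 Hz here, chosen arbitrarily).
t = np.arange(n)
pulse_up = np.zeros(n)
pulse_up[t % (sample_rate // 100) == 0] += 1.0
pulse_up[t % (sample_rate // 150) == 0] += 1.0

# Summing stage: sketched as removing the DC component of the pulse wave
# so the integrator output below stays bounded.
resultant = pulse_up - pulse_up.mean()

# Linear wave shaper sketched as a leaky integrator (one simple linear filter).
out = np.zeros(n)
acc = 0.0
leak = 0.999
for i in range(n):
    acc = leak * acc + resultant[i]
    out[i] = acc
```

Integrating a pulse train in this way yields ramp-like segments, which is one classic route from pulses to a continuous audio waveform.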
  • Patent number: 11894005
    Abstract: An audio communication endpoint receives a bitstream containing spectral components representing spectral content of an audio signal, wherein the spectral components relate to a first range extending up to a first break frequency, above which any spectral components are unassigned. The endpoint adapts the received bitstream in accordance with a second range extending up to a second break frequency by removing spectral components or adding neutral-valued spectral components relating to a range between the first and second break frequencies. The endpoint then attenuates spectral content in a neighbourhood of the lesser of the first and second break frequencies, thereby achieving a gradual spectral decay. After this, the audio signal is reconstructed by an inverse transform operating on spectral components relating to said second range in the adapted and attenuated received bitstream. At small computational expense, the endpoint may adapt to different sample rates in received bitstreams.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: February 6, 2024
    Assignees: DOLBY LABORATORIES LICENSING CORPORATION, DOLBY INTERNATIONAL AB
    Inventors: Heiko Purnhagen, Leif Sehlstrom, Lars Villemoes, Glenn N. Dickins, Mark S Vinton
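The adaptation step this abstract describes (truncate or zero-pad the spectral coefficients to the second range, then attenuate near the lower break frequency) maps naturally to array operations. The sketch below is an illustration under stated assumptions: the function name, the raised-cosine fade, and the bin counts are all invented for the example, not taken from the patent.

```python
import numpy as np

def adapt_spectrum(coeffs, second_len, fade_bins=8):
    """Adapt received spectral components to a target length, then attenuate
    near the lesser of the two break frequencies for a gradual spectral decay."""
    first_len = len(coeffs)
    if second_len >= first_len:
        # Add neutral-valued (zero) components above the first break frequency.
        adapted = np.concatenate([coeffs, np.zeros(second_len - first_len)])
    else:
        # Remove components above the second break frequency.
        adapted = coeffs[:second_len].copy()
    # Fade out the bins just below the lower break frequency.
    break_bin = min(first_len, second_len)
    lo = max(0, break_bin - fade_bins)
    fade = 0.5 * (1 + np.cos(np.linspace(0, np.pi, break_bin - lo)))
    adapted[lo:break_bin] *= fade
    return adapted

spec = np.ones(64)
widened = adapt_spectrum(spec, 96)   # pad with neutral components
narrowed = adapt_spectrum(spec, 48)  # drop components
```

Either way, the inverse transform afterwards operates on an array sized for the second range, which is how the endpoint can serve different sample rates cheaply.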
  • Patent number: 11889566
    Abstract: Described herein are various embodiments for customized device pairing based on device features. An embodiment operates by receiving, at a first device, a pairing message from a second device requesting a pairing between the second device and the first device. User-accessible features of the second device that are accessible for the pairing are determined, and a sequence of actions for the pairing is generated based on the user-accessible features of the second device. The sequence of actions is provided for display on a display device communicatively coupled to the first device and independent of the second device. Indicia indicating which actions were performed with respect to the second device are received. The indicia are compared to the displayed sequence of actions. The first device is paired with the second device based on a determination that the indicia correspond to the displayed sequence of actions.
    Type: Grant
    Filed: November 16, 2021
    Date of Patent: January 30, 2024
    Assignee: Roku, Inc.
    Inventor: Carl Sassenrath
  • Patent number: 11887586
    Abstract: A method includes retrieving a plurality of transcripts from a database. Each transcript in the plurality of transcripts corresponds to audio from a media content item of a plurality of media content items that are provided by a media providing service. The method also includes applying each transcript of the plurality of transcripts to a trained computational model, and receiving a user request for information regarding a topic. The method further includes, in response to the user request, identifying a transcript from the database that is relevant to the topic, and a position within the transcript that is relevant to the topic. The method also includes providing, by the media providing service, at least a portion of a media content item corresponding to the identified transcript, beginning at a starting position that is based on the position within the identified transcript that is relevant to the topic.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: January 30, 2024
    Assignee: Spotify AB
    Inventors: Vidhya Murali, Aaron Paul Harmon
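The retrieval behavior described above, finding both a relevant transcript and a relevant position within it, then starting playback there, can be sketched with a toy scorer. Plain term overlap below stands in for the patent's trained computational model; the transcripts, query, and window size are invented for the example.

```python
# Toy topic retrieval over transcripts: score every position in every
# transcript against a query and return (transcript_id, word_position)
# as the playback starting point.
transcripts = {
    "ep1": "today we talk about jazz history and bebop pioneers",
    "ep2": "gardening tips then a long segment on espresso brewing methods",
}

def find_start(query, transcripts, window=4):
    q = set(query.lower().split())
    best = (0, None, 0)  # (score, transcript_id, word_position)
    for tid, text in transcripts.items():
        words = text.lower().split()
        for pos in range(len(words)):
            score = len(q & set(words[pos:pos + window]))
            if score > best[0]:
                best = (score, tid, pos)
    return best[1], best[2]

tid, pos = find_start("espresso brewing", transcripts)
```

A real system would score with learned embeddings rather than word overlap, but the shape of the answer, a content item plus a starting offset, is the same.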
  • Patent number: 11887581
    Abstract: An audio playback system that provides intuitive audio playback of textual content responsive to user input actions, such as scrolling portions of textual content on a display. Playback of audio (e.g., text-to-speech audio) that includes textual content can begin based on a portion of textual content being positioned by a user input at a certain position on a device display. As one example, a user can simply scroll through a webpage or other content item to cause a text-to-speech system to perform audio playback of textual content displayed in one or more playback section(s) of the device's viewport (e.g., rather than requiring the user to perform additional tapping or gesturing to specifically select a certain portion of textual content).
    Type: Grant
    Filed: November 14, 2019
    Date of Patent: January 30, 2024
    Assignee: GOOGLE LLC
    Inventors: Rachel Ilan Simpson, Benedict Davies, Guillaume Boniface-Chang
  • Patent number: 11887606
    Abstract: Provided are a method and device for recognizing a speaker by using a resonator. The method of recognizing the speaker includes receiving a plurality of electrical signals corresponding to a speech of the speaker from a plurality of resonators having different resonance bands; obtaining a difference of magnitudes of the plurality of electrical signals; and recognizing the speaker based on the difference of magnitudes of the plurality of electrical signals.
    Type: Grant
    Filed: May 10, 2022
    Date of Patent: January 30, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Cheheung Kim, Sungchan Kang, Sangha Park, Yongseop Yoon, Choongho Rhee
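The recognition rule this abstract states, using the difference of magnitudes across resonators with different resonance bands as the speaker feature, is simple enough to sketch directly. The enrolled templates, band magnitudes, and nearest-template matching below are illustrative assumptions, not the patent's method of comparison.

```python
import numpy as np

def band_difference_feature(band_magnitudes):
    """Feature = differences of magnitudes between adjacent resonator bands."""
    return np.diff(np.asarray(band_magnitudes, dtype=float))

# Hypothetical enrolled speakers, each represented by a difference feature.
enrolled = {
    "alice": band_difference_feature([0.9, 0.7, 0.3, 0.1]),
    "bob":   band_difference_feature([0.2, 0.5, 0.8, 0.6]),
}

def recognize(band_magnitudes, enrolled):
    feat = band_difference_feature(band_magnitudes)
    return min(enrolled, key=lambda s: np.linalg.norm(feat - enrolled[s]))

speaker = recognize([0.85, 0.65, 0.25, 0.05], enrolled)
```

Differencing adjacent bands discards the overall signal level, so the feature depends on the spectral shape of the voice rather than on loudness.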
  • Patent number: 11881223
    Abstract: Systems and methods for managing multiple voice assistants are disclosed. Audio input is received via one or more microphones of a playback device. A first activation word is detected in the audio input via the playback device. After detecting the first activation word, the playback device transmits a voice utterance of the audio input to a first voice assistant service (VAS). The playback device receives, from the first VAS, first content to be played back via the playback device. The playback device also receives, from a second VAS, second content to be played back via the playback device. The playback device plays back the first content while suppressing the second content. Such suppression can include delaying or canceling playback of the second content.
    Type: Grant
    Filed: December 5, 2022
    Date of Patent: January 23, 2024
    Assignee: Sonos, Inc.
    Inventors: Ryan Richard Myers, Luis R. Vega Zayas, Sangah Park
  • Patent number: 11875087
    Abstract: Aspects of the disclosure relate to generating outputs using a digital personal assistant computing control platform and machine learning. A computing platform may receive, from a digital personal assistant computing device, a first voice command input. The computing platform may then determine, via machine learning algorithms, an identifier output indicating a user associated with the first voice command input and a location output indicating a geographic location associated with the user. The computing platform may determine, via a stored calendar, an availability output indicating availability associated with the user. Based on the identifier output, the location output, and the availability output, a charitable opportunity output indicating a charitable opportunity may be determined by the computing platform and may be transmitted to a computing device associated with the charitable opportunity.
    Type: Grant
    Filed: February 20, 2023
    Date of Patent: January 16, 2024
    Assignee: Allstate Insurance Company
    Inventors: Elizabeth C. Schreier, Jamie E. Grahn
  • Patent number: 11869261
    Abstract: Audio distortion compensation methods that improve the accuracy and efficiency of audio content identification are described. The methods are also applicable to speech recognition. Methods to detect interference from speakers and sources, and distortion to audio from the environment and devices, are discussed. Additional methods to detect distortion to the content after performing search and correlation are illustrated. The causes of actual distortion at each client are measured, registered, and learned to generate rules for determining likely distortion and interference sources. The learned rules are applied at the client, and likely distortions that are detected are compensated for, or heavily distorted sections are ignored, at the audio level or at the signature and feature level, based on available compute resources. Further methods to subtract the likely distortions in the query, both at the audio level and after processing at the signature and feature level, are described.
    Type: Grant
    Filed: February 22, 2023
    Date of Patent: January 9, 2024
    Assignee: Roku, Inc.
    Inventors: Jose Pio Pereira, Sunil Suresh Kulkarni, Mihailo M. Stojancic, Shashank Merchant, Peter Wendt
  • Patent number: 11868738
    Abstract: The present disclosure describes methods, devices, and storage medium for generating a natural language description for a media object. The method includes respectively processing, by a device, a media object by using a plurality of natural language description models to obtain a plurality of first feature vectors corresponding to a plurality of feature types. The device includes a memory storing instructions and a processor in communication with the memory. The method also includes fusing, by the device, the plurality of first feature vectors to obtain a second feature vector; and generating, by the device, a natural language description for the media object according to the second feature vector, the natural language description being used for expressing the media object in natural language. The present disclosure resolves the technical problem that a natural language description generated for a media object can give only an insufficiently accurate description of the media object.
    Type: Grant
    Filed: February 23, 2021
    Date of Patent: January 9, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Bairui Wang, Lin Ma, Yang Feng, Wei Liu
  • Patent number: 11854533
    Abstract: Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.
    Type: Grant
    Filed: January 28, 2022
    Date of Patent: December 26, 2023
    Assignee: GOOGLE LLC
    Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Li Wan, Alexander Gruenstein, Hakan Erdogan
  • Patent number: 11854548
    Abstract: Systems and techniques for an adaptive conversation support bot are described herein. An audio stream may be obtained including a conversation of a first user. An event may be identified in the conversation using the audio stream. A first keyword phrase may be extracted from the audio stream in response to identification of the event. The audio stream may be searched for a second keyword phrase based on the first keyword phrase. An action may be performed based on the first keyword phrase and the second keyword phrase. Results of the action may be output via a context-appropriate output channel. The context-appropriate output channel may be determined based on a context of the conversation and a privacy setting of the first user.
    Type: Grant
    Filed: November 22, 2022
    Date of Patent: December 26, 2023
    Assignee: Wells Fargo Bank, N.A.
    Inventor: Vincent Le Chevalier
  • Patent number: 11853648
    Abstract: Systems and methods for smart sensors are provided. A smart sensor includes: a case; a power adapter configured to be plugged directly into an electrical outlet; a computer processor; a microphone; a speaker; a camera; at least one sensor; a control switch; a sync button; a USB port; and a memory storing: an operating system; a voice control module; a peer interaction module; a remote interaction module; and a cognitive module. In embodiments, the power adapter includes prongs that extend from a back side of the case, and the microphone, the speaker, the camera, and the at least one sensor are on a front side of the case opposite the back side of the case.
    Type: Grant
    Filed: April 16, 2021
    Date of Patent: December 26, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Stan K. Daley, Zhong-Hui Lin, Tao Liu, Dean Phillips, Kent R. VanOoyen
  • Patent number: 11854528
    Abstract: An apparatus for detecting unsupported utterances in natural language understanding, includes a memory storing instructions, and at least one processor configured to execute the instructions to classify a feature that is extracted from an input utterance of a user, as one of in-domain and out-of-domain (OOD) for a response to the input utterance, obtain an OOD score of the extracted feature, and identify whether the feature is classified as OOD. The at least one processor is further configured to executed the instructions to, based on the feature being identified to be classified as in-domain, identify whether the obtained OOD score is greater than a predefined threshold, and based on the OOD score being identified to be greater than the predefined threshold, re-classify the feature as OOD.
    Type: Grant
    Filed: August 13, 2021
    Date of Patent: December 26, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Yen-Chang Hsu, Yilin Shen, Avik Ray, Hongxia Jin
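The decision logic this abstract spells out, keeping an in-domain prediction only when the OOD score stays at or below a threshold, reduces to a short function. This is a sketch of the described rule only; the label strings, score range, and function name are assumptions.

```python
def final_label(classifier_label, ood_score, threshold):
    """Keep the in-domain label only when the out-of-domain (OOD) score is
    at or below the threshold; otherwise re-classify the utterance as OOD."""
    if classifier_label == "OOD":
        return "OOD"
    if ood_score > threshold:
        return "OOD"  # re-classify despite the in-domain prediction
    return classifier_label
```

The second check is the safety net: even an utterance the classifier confidently places in-domain is rejected when its OOD score is suspiciously high.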
  • Patent number: 11842725
    Abstract: A method of own voice detection is provided for a user of a device. A first signal, representing air-conducted speech, is detected using a first microphone of the device. A second signal, representing bone-conducted speech, is detected using a bone-conduction sensor of the device. The first signal is filtered to obtain a component of the first signal at a speech articulation rate, and the second signal is filtered to obtain a component of the second signal at the speech articulation rate. The two components at the speech articulation rate are compared, and it is determined that the speech has not been generated by the user of the device if the difference between them exceeds a threshold value.
    Type: Grant
    Filed: September 13, 2022
    Date of Patent: December 12, 2023
    Assignee: Cirrus Logic Inc.
    Inventor: John Paul Lesso
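The comparison this abstract describes can be sketched numerically: estimate each signal's envelope component at the articulation rate (a few hertz), then compare the two. The single-DFT-bin envelope estimate, the 4 Hz rate, and the threshold below are all assumptions standing in for the patent's filtering.

```python
import numpy as np

def articulation_component(signal, fs, rate_hz=4.0):
    """Crude single-bin estimate of the signal envelope's energy at the
    speech articulation rate (around 4 Hz)."""
    envelope = np.abs(signal)
    t = np.arange(len(signal)) / fs
    return np.abs(np.sum(envelope * np.exp(-2j * np.pi * rate_hz * t))) / len(signal)

def is_own_voice(air_sig, bone_sig, fs, threshold=0.1):
    a = articulation_component(air_sig, fs)
    b = articulation_component(bone_sig, fs)
    # Attribute the speech to the wearer only when the air-conducted and
    # bone-conducted articulation-rate components agree within the threshold.
    return abs(a - b) <= threshold

fs = 1000
t = np.arange(fs) / fs
voiced = 1.0 + np.sin(2 * np.pi * 4.0 * t)   # envelope modulated at 4 Hz
same = is_own_voice(voiced, voiced, fs)       # matching components
other = is_own_voice(voiced, np.ones(fs), fs) # bone signal lacks modulation
```

The idea is that the wearer's own speech modulates both sensors at the same syllabic rate, while external speech reaches the bone-conduction sensor only weakly.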
  • Patent number: 11842729
    Abstract: In one implementation, a method of generating CGR content to accompany an audio file including audio data and lyric data based on semantic analysis of the audio data and the lyric data is performed by a device including a processor, non-transitory memory, a speaker, and a display. The method includes obtaining an audio file including audio data and lyric data associated with the audio data. The method includes performing natural language analysis of at least a portion of the lyric data to determine a plurality of candidate meanings of the portion of the lyric data. The method includes performing semantic analysis of the portion of the lyric data to determine a meaning of the portion of the lyric data by selecting, based on a corresponding portion of the audio data, one of the plurality of candidate meanings as the meaning of the portion of the lyric data. The method includes generating CGR content associated with the portion of the lyric data based on the meaning of the portion of the lyric data.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: December 12, 2023
    Assignee: APPLE INC.
    Inventor: Ian M. Richter
  • Patent number: 11837216
    Abstract: A method for training a generative adversarial network (GAN)-based text-to-speech (TTS) model and a speech recognition model in unison includes obtaining a plurality of training text utterances. At each of a plurality of output steps for each training text utterance, the method also includes generating, for output by the GAN-based TTS model, a synthetic speech representation of the corresponding training text utterance, and determining, using an adversarial discriminator of the GAN, an adversarial loss term indicative of an amount of acoustic noise disparity in one of the non-synthetic speech representations selected from the set of spoken training utterances relative to the corresponding synthetic speech representation of the corresponding training text utterance. The method also includes updating parameters of the GAN-based TTS model based on the adversarial loss term determined at each of the plurality of output steps for each training text utterance of the plurality of training text utterances.
    Type: Grant
    Filed: February 14, 2023
    Date of Patent: December 5, 2023
    Assignee: Google LLC
    Inventors: Zhehuai Chen, Andrew M. Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno Mengibar
  • Patent number: 11830472
    Abstract: Developers can configure custom acoustic models by providing audio files with custom recordings. The custom acoustic model is trained by tuning a baseline model using the audio files. Audio files may contain custom noise to apply to clean speech for training. The custom acoustic model is provided as an alternative to a standard acoustic model. Device developers can select an acoustic model by a user interface. Speech recognition is performed on speech audio using one or more acoustic models. The result can be provided to developers through the user interface, and an error rate can be computed and also provided.
    Type: Grant
    Filed: January 11, 2022
    Date of Patent: November 28, 2023
    Assignee: SOUNDHOUND AI IP, LLC
    Inventors: Keyvan Mohajer, Mehul Patel
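One common way the "custom noise applied to clean speech" step could work is mixing noise into clean recordings at a target signal-to-noise ratio; the sketch below shows that standard technique, not necessarily SoundHound's implementation, and the function name and SNR scheme are assumptions.

```python
import numpy as np

def mix_noise(clean, noise, snr_db):
    """Mix custom noise into clean speech at a target SNR (in dB) to
    produce training audio for acoustic-model tuning."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2) + 1e-12
    # Scale the noise so that p_clean / p_scaled_noise == 10 ** (snr_db / 10).
    scale = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + scale * noise

rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * np.arange(8000) / 50)
noisy = mix_noise(clean, rng.standard_normal(8000), snr_db=10.0)
```

Sweeping the SNR across a range of values during tuning exposes the baseline model to the kinds of noisy conditions the developer's device will actually encounter.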
  • Patent number: 11816440
    Abstract: The disclosed embodiments describe methods, systems, and apparatuses for determining user intent. A method is disclosed comprising obtaining a session text of a user; calculating, by the processor, a feature vector based on the session text; determining probabilities that the session text belongs to a plurality of intent labels, the probabilities calculated using a multi-level hierarchical intent classification model, the intent labels assigned to levels in the multi-level hierarchical intent classification model; and assigning a user intent to the session text based on the probabilities.
    Type: Grant
    Filed: October 21, 2022
    Date of Patent: November 14, 2023
    Assignee: ALIBABA GROUP HOLDING LIMITED
    Inventors: Ling Li, Zhiwei Shi, Yanjie Liang
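The multi-level assignment this abstract describes can be sketched with a two-level label tree: the score of a leaf intent is its parent-level probability times the conditional probability at the child level, and the highest-scoring pair wins. The label tree, probabilities, and scoring rule below are invented for illustration, not taken from the patent.

```python
# Hypothetical two-level intent hierarchy with per-level probabilities
# (e.g., produced by classifiers at each level of the model).
level1 = {"shopping": 0.7, "support": 0.3}
level2 = {
    "shopping": {"track_order": 0.6, "refund": 0.4},
    "support": {"reset_password": 0.9, "other": 0.1},
}

def assign_intent(level1, level2):
    # Leaf score = P(parent) * P(child | parent); pick the best leaf.
    scores = {
        (parent, child): p1 * p2
        for parent, p1 in level1.items()
        for child, p2 in level2[parent].items()
    }
    return max(scores, key=scores.get)

intent = assign_intent(level1, level2)
```

Note that the winning leaf need not sit under the most probable parent on its own: a strong child probability can outweigh a weaker parent, which is one reason to score full paths rather than greedily descending the hierarchy.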