Patents Examined by Jakieda R Jackson
  • Patent number: 11984112
    Abstract: Systems and methods are disclosed for providing voice interactions based on user context. Data is received that causes a voice interaction to be generated for output at a user device. In response, current user contextual data of the user device is retrieved. A user availability level for consuming the voice interaction is determined based on the current user contextual data. The voice interaction is altered based on the user availability level. Content of the voice interaction may be altered to be suitable for consumption. The altered voice interaction is outputted at the user device.
    Type: Grant
    Filed: April 29, 2021
    Date of Patent: May 14, 2024
    Assignee: Rovi Guides, Inc.
    Inventors: Ankur Anil Aher, Jeffry Copps Robert Jose, Reda Harb
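The availability-driven alteration this abstract describes can be sketched as a small decision function. Everything below is an illustrative assumption (the contextual signal names, the three-level scale, and the sentence-truncation rule), not Rovi's implementation:

```python
# Hypothetical sketch: map user context to an availability level, then
# alter the voice interaction's content to suit that level.

def availability_level(context: dict) -> str:
    """Map assumed contextual signals to a coarse availability level."""
    if context.get("in_meeting") or context.get("driving"):
        return "low"
    if context.get("screen_on"):
        return "high"
    return "medium"

def alter_interaction(message: str, level: str) -> str:
    """Alter the voice interaction for the user's availability level."""
    if level == "low":
        return ""                               # defer the interaction
    if level == "medium":
        return message.split(". ")[0] + "."     # keep the first sentence
    return message                              # deliver in full

full = "Your show starts in 10 minutes. Would you like me to record it?"
print(alter_interaction(full, availability_level({"screen_on": True})))
```

A real system would derive the level from sensor and calendar data and re-synthesize the shortened text rather than truncate it.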
  • Patent number: 11978470
    Abstract: Disclosed are a target speaker separation system, an electronic device and a storage medium. The system includes: first, performing jointly unified modeling on a plurality of cues based on a masked pre-training strategy, to boost the inference capability of a model for missing cues and enhance the representation accuracy of disturbed cues; and second, constructing a hierarchical cue modulation module. A spatial cue is introduced into a primary cue modulation module for directional enhancement of a speech of a speaker; in an intermediate cue modulation module, the speech of the speaker is enhanced on the basis of temporal coherence of a dynamic cue and an auditory signal component; a steady-state cue is introduced into an advanced cue modulation module for selective filtering; and finally, the supervised learning capability of simulation data and the unsupervised learning effect of real mixed data are sufficiently utilized.
    Type: Grant
    Filed: November 3, 2022
    Date of Patent: May 7, 2024
    Assignee: INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES
    Inventors: Jiaming Xu, Jian Cui, Bo Xu
  • Patent number: 11967334
    Abstract: A method for operating a hearing device on the basis of a speech signal. An acousto-electric input transducer of the hearing device records a sound containing the speech signal from surroundings of the hearing device and converts the sound into an input audio signal. A signal processing operation generates an output audio signal based on the input audio signal. At least one articulatory and/or prosodic feature of the speech signal is quantitatively acquired through analysis of the input audio signal by way of the signal processing operation, and a quantitative measure of a speech quality of the speech signal is derived on the basis of this feature. At least one parameter of the signal processing operation for generating the output audio signal based on the input audio signal is set on the basis of the quantitative measure of the speech quality of the speech signal.
    Type: Grant
    Filed: August 30, 2021
    Date of Patent: April 23, 2024
    Assignee: Sivantos Pte. Ltd.
    Inventors: Sebastian Best, Marko Lugger
  • Patent number: 11957923
    Abstract: A system for communicating with a medical implant in a body of a user is provided. The system comprises a command generator and a memory, wherein the command generator is configured to receive a voice command from the user and to reduce noise, generated by the body, from the received voice command to generate a noise-reduced voice command. The command generator is further configured to compare the noise-reduced voice command with previously stored voice commands and to transmit a command signal to the implant if the noise-reduced voice command corresponds to any of the commands stored in the memory.
    Type: Grant
    Filed: April 6, 2021
    Date of Patent: April 16, 2024
    Inventor: Peter Forsell
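The comparison step alone can be sketched with a fuzzy match against the stored command set. The text-based matching, command strings, and threshold below are assumptions for illustration; the patent does not disclose this mechanism:

```python
import difflib

# Hypothetical stored command vocabulary for an implant controller.
STORED_COMMANDS = ["increase flow", "decrease flow", "status report"]

def match_command(noise_reduced: str, threshold: float = 0.8):
    """Return the stored command the noise-reduced input corresponds to,
    or None when no stored command clears the similarity threshold."""
    best, best_score = None, 0.0
    for cmd in STORED_COMMANDS:
        score = difflib.SequenceMatcher(None, noise_reduced, cmd).ratio()
        if score > best_score:
            best, best_score = cmd, score
    return best if best_score >= threshold else None

print(match_command("increse flow"))   # close enough to a stored command
print(match_command("open the door"))  # nothing stored matches
```

Only a successful match would result in a command signal being transmitted to the implant.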
  • Patent number: 11947873
    Abstract: An exemplary method for identifying media may include receiving user input associated with a request for media, where that user input includes unstructured natural language speech including one or more words; identifying at least one context associated with the user input; causing a search for the media based on the at least one context and the user input; determining, based on the at least one context and the user input, at least one media item that satisfies the request; and in accordance with a determination that the at least one media item satisfies the request, obtaining the at least one media item.
    Type: Grant
    Filed: April 9, 2021
    Date of Patent: April 2, 2024
    Assignee: Apple Inc.
    Inventors: Ryan M. Orr, Daniel J. Mandel, Andrew J. Sinesio, Connor J. Barnett
  • Patent number: 11948552
    Abstract: A speech processing method, performed by an electronic device, includes determining a first speech feature and a first text bottleneck feature based on to-be-processed speech information, determining a first combined feature vector based on the first speech feature and the first text bottleneck feature, inputting the first combined feature vector to a trained unidirectional long short-term memory (LSTM) model, performing speech processing on the first combined feature vector to obtain speech information after noise reduction, and transmitting the obtained speech information after noise reduction to another electronic device for playing.
    Type: Grant
    Filed: August 30, 2021
    Date of Patent: April 2, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Yan Nan Wang, Jun Huang
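The feature-combination step can be illustrated with plain per-frame concatenation; the dimensions and values are made up, and the trained unidirectional LSTM itself is out of scope here:

```python
# Sketch: concatenate each frame's speech feature with its text
# bottleneck feature to form the combined vector sequence fed to the LSTM.

def combine(speech_frames, bottleneck_frames):
    """Per-frame concatenation of speech and text bottleneck features."""
    assert len(speech_frames) == len(bottleneck_frames)
    return [s + b for s, b in zip(speech_frames, bottleneck_frames)]

speech = [[0.12, -0.40, 0.88], [0.10, -0.35, 0.90]]   # 2 frames, 3 dims
bottleneck = [[0.05, 0.31], [0.06, 0.29]]             # 2 frames, 2 dims
combined = combine(speech, bottleneck)
print(len(combined), len(combined[0]))                # 2 frames, 5 dims
```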
  • Patent number: 11942077
    Abstract: An electronic device for providing a text-to-speech (TTS) service and an operating method therefor are provided. The operating method of the electronic device includes obtaining target voice data based on an utterance input of a specific speaker, determining a number of learning steps of the target voice data, based on data features including a data amount of the target voice data, generating a target model by training a pre-trained model pre-trained to convert text into an audio signal, by using the target voice data as training data, based on the determined number of learning steps, generating output data obtained by converting input text into an audio signal, by using the generated target model, and outputting the generated output data.
    Type: Grant
    Filed: September 21, 2022
    Date of Patent: March 26, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Kyoungbo Min, Seungdo Choi, Doohwa Hong
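One plausible reading of "determining a number of learning steps ... based on data features including a data amount" is a bounded scaling rule. The constants and the linear rule below are assumptions, not Samsung's method:

```python
# Hedged sketch: scale fine-tuning steps with the amount of target voice
# data, clamped to a floor and ceiling.

def num_learning_steps(seconds_of_audio: float,
                       steps_per_second: int = 50,
                       min_steps: int = 1_000,
                       max_steps: int = 100_000) -> int:
    """More target data -> more adaptation steps, within bounds."""
    steps = int(seconds_of_audio * steps_per_second)
    return max(min_steps, min(max_steps, steps))

print(num_learning_steps(10))    # tiny sample, clamped to the floor
print(num_learning_steps(600))   # ten minutes of target speech
```

The pre-trained TTS model would then be fine-tuned on the target voice data for exactly this many steps.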
  • Patent number: 11934974
    Abstract: Systems, methods, and apparatus are provided for intelligent, integrated, and interactive remote reporting. A remote natural language request for a report may be received from a user at an edge device. A first machine learning model may generate a list of existing reports based on past usage by the user. If no existing report satisfies the request, a second, enterprise-level machine learning model may map the request to relevant data sets and rank the mapped data sets along with additional related data sets based on enterprise-wide usage. An integrated reporting platform may receive selected data sets and report parameters as a JSON request, convert the request to compatible executable instructions, and generate the report. The integrated reporting platform may be a wrapper layer encompassing multiple proprietary reporting engines. Feedback from the integrated reporting platform may be applied to update the machine learning models.
    Type: Grant
    Filed: October 5, 2021
    Date of Patent: March 19, 2024
    Assignee: Bank of America Corporation
    Inventors: Gaurav Bansal, Nikhil Pathak, Raja Venkatesh Gottumukkala
  • Patent number: 11929062
    Abstract: A method and system of training a spoken language understanding (SLU) model includes receiving natural language training data comprising (i) one or more speech recordings, and (ii) a set of semantic entities and/or intents for each corresponding speech recording. For each speech recording, one or more entity labels and corresponding values, and one or more intent labels are extracted from the corresponding semantic entities and/or overall intent. A spoken language understanding (SLU) model is trained based upon the one or more entity labels and corresponding values, and one or more intent labels of the corresponding speech recordings, without the need for a transcript of the corresponding speech recording.
    Type: Grant
    Filed: September 15, 2020
    Date of Patent: March 12, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Hong-Kwang Jeff Kuo, Zoltan Tueske, Samuel Thomas, Yinghui Huang, Brian E. D. Kingsbury, Kartik Audhkhasi
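The label-extraction step can be sketched directly: pull entity labels, values, and the intent out of a semantic annotation, with no transcript involved. The annotation format below is an assumption for illustration:

```python
# Hypothetical semantic annotation attached to one speech recording.
annotation = {
    "intent": "BookFlight",
    "entities": [{"label": "destination", "value": "Boston"},
                 {"label": "date", "value": "Friday"}],
}

def extract_targets(ann):
    """Return (entity_label, value) pairs and the intent label, which
    together form the SLU training targets for the recording."""
    pairs = [(e["label"], e["value"]) for e in ann["entities"]]
    return pairs, ann["intent"]

pairs, intent = extract_targets(annotation)
print(intent, pairs)
```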
  • Patent number: 11922929
    Abstract: To provide a presentation assistance system capable of dynamically changing presentation materials according to terms used in a presentation. A presentation assistance system comprises: a presentation material storage unit 3; a related word storage unit 5 which stores a plurality of related words related to the presentation material; a succeeding information storage unit 7 which stores, for each of the related words, information about a succeeding related word that is one or a plurality of related words that are preferably used next; a related word analysis unit 9 which analyzes which one of the related words corresponds to the word analyzed by a term analysis unit; and a succeeding related word selecting unit 11 which selects a succeeding related word from the succeeding information storage unit, by using information about an analyzed related word which is a related word analyzed by the related word analysis unit.
    Type: Grant
    Filed: January 6, 2020
    Date of Patent: March 5, 2024
    Assignee: Interactive Solutions Corp.
    Inventor: Kiyoshi Sekine
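The succeeding-information lookup the abstract describes is, at its core, a mapping from an analyzed related word to the related words preferably used next. The vocabulary below is invented for illustration:

```python
# Hypothetical succeeding-information store: for each related word, the
# one or more related words that are preferably used next.
SUCCEEDING = {
    "dosage": ["side effects", "contraindications"],
    "side effects": ["contraindications"],
}

def select_succeeding(analyzed_related_word: str):
    """Select succeeding related words for the word just analyzed in the
    presenter's speech; empty when none are stored."""
    return SUCCEEDING.get(analyzed_related_word, [])

print(select_succeeding("dosage"))
```

The selected words could then drive which presentation material is shown next.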
  • Patent number: 11915689
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction of an audio signal. One of the methods includes receiving a request to generate an audio signal conditioned on an input; processing the input using an embedding neural network to map the input to one or more embedding tokens; generating a semantic representation of the audio signal; generating, using one or more generative neural networks and conditioned on at least the semantic representation and the embedding tokens, an acoustic representation of the audio signal; and processing at least the acoustic representation using a decoder neural network to generate the prediction of the audio signal.
    Type: Grant
    Filed: September 7, 2023
    Date of Patent: February 27, 2024
    Assignee: Google LLC
    Inventors: Andrea Agostinelli, Timo Immanuel Denk, Antoine Caillon, Neil Zeghidour, Jesse Engel, Mauro Verzetti, Christian Frank, Zalán Borsos, Matthew Sharifi, Adam Joseph Roberts, Marco Tagliasacchi
  • Patent number: 11915685
    Abstract: Techniques are described for training neural networks on variable length datasets. The numeric representation of the length of each training sample is randomly perturbed to yield a pseudo-length, and the samples sorted by pseudo-length to achieve lower zero padding rate (ZPR) than completely randomized batching (thus saving computation time) yet higher randomness than strictly sorted batching (thus achieving better model performance than strictly sorted batching).
    Type: Grant
    Filed: March 23, 2023
    Date of Patent: February 27, 2024
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Zhenhao Ge, Lakshmish Kaushik, Saket Kumar, Masanori Omote
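The pseudo-length idea above is concrete enough to sketch: perturb each sample's length, sort by the perturbed value, and batch in order, so batches are roughly length-sorted (low zero-padding rate) yet not deterministically ordered. The noise magnitude and batching details are assumptions:

```python
import random

def pseudo_length_batches(lengths, batch_size, noise=0.1, seed=0):
    """Return batches of sample indices ordered by pseudo-length."""
    rng = random.Random(seed)
    # pseudo-length = true length * (1 + small random perturbation)
    keyed = [(length * (1 + rng.uniform(-noise, noise)), i)
             for i, length in enumerate(lengths)]
    order = [i for _, i in sorted(keyed)]
    return [order[k:k + batch_size] for k in range(0, len(order), batch_size)]

lengths = [500, 30, 480, 25, 510, 40, 495, 35]   # variable sample lengths
for batch in pseudo_length_batches(lengths, batch_size=4):
    print(batch)   # short samples batch together, as do long ones
```

With small noise, samples of very different lengths still land in different batches, so each batch pads only to its own (similar) maximum length.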
  • Patent number: 11908468
    Abstract: A system that is capable of resolving anaphora using timing data received by a local device. A local device outputs audio representing a list of entries. The audio may represent synthesized speech of the list of entries. A user can interrupt the device to select an entry in the list, such as by saying “that one.” The local device can determine an offset time representing the time between when audio playback began and when the user interrupted. The local device sends the offset time and audio data representing the utterance to a speech processing system which can then use the offset time and stored data to identify which entry on the list was most recently output by the local device when the user interrupted. The system can then resolve anaphora to match that entry and can perform additional processing based on the referred-to item.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: February 20, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Prakash Krishnan, Arindam Mandal, Siddhartha Reddy Jonnalagadda, Nikko Strom, Ariya Rastrow, Ying Shi, David Chi-Wai Tang, Nishtha Gupta, Aaron Challenner, Bonan Zheng, Angeliki Metallinou, Vincent Auvray, Minmin Shen
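The offset-based lookup reduces to simple arithmetic once per-entry playback durations are known. The function and durations below are illustrative assumptions, not Amazon's implementation:

```python
def resolve_that_one(entry_durations_ms, offset_ms):
    """Return the index of the list entry being played back at offset_ms
    (time since playback began), or the last entry if playback ended."""
    elapsed = 0
    for i, duration in enumerate(entry_durations_ms):
        elapsed += duration
        if offset_ms < elapsed:
            return i
    return len(entry_durations_ms) - 1

durations = [1200, 900, 1500]              # TTS duration of each entry
print(resolve_that_one(durations, 1800))   # interrupted during entry 1
```

"That one" is then resolved to the returned entry before further processing.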
  • Patent number: 11908449
    Abstract: A system and method for translating audio, and video when desired. The translations include synthetic media and data generated using AI systems. Through unique processors and generators executing a unique sequence of steps, the system and method produces more accurate translations that can account for various speech characteristics (e.g., emotion, pacing, idioms, sarcasm, jokes, tone, phonemes, etc.). These speech characteristics are identified in the input media and synthetically incorporated into the translated outputs to mirror the characteristics in the input media. Some embodiments further include systems and methods that manipulate the input video such that the speakers' faces and/or lips appear as if they are natively speaking the generated audio.
    Type: Grant
    Filed: November 29, 2022
    Date of Patent: February 20, 2024
    Assignee: Deep Media Inc.
    Inventors: Rijul Gupta, Emma Brown
  • Patent number: 11907661
    Abstract: A method and an apparatus for sequence labeling on an entity text, and a non-transitory computer-readable recording medium are provided. In the method, a start position of an entity text within a target text is determined. Then, a first matrix is generated based on the start position of the entity text. Elements in the first matrix indicate focusable weights of each word with respect to other words in the target text. Then, a named entity recognition model is generated using the first matrix. The named entity recognition model is obtained by training using first training data, the first training data includes word embeddings corresponding to respective texts in a training text set, and the texts are texts whose entity labels have been annotated. Then, the target text is input to the named entity recognition model, and probability distribution of the entity label is output.
    Type: Grant
    Filed: November 22, 2021
    Date of Patent: February 20, 2024
    Assignee: Ricoh Company, Ltd.
    Inventors: Yixuan Tong, Yongwei Zhang, Bin Dong, Shanshan Jiang, Jiashi Zhang
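One way to picture the "first matrix" is as a square grid of focusable weights keyed off the entity's start position. The construction below is a loose assumption for illustration only; the patent's actual formulation is not disclosed in the abstract:

```python
# Hypothetical sketch: words inside the entity span may focus only on the
# entity span; all other words may focus anywhere.

def focus_matrix(seq_len, entity_start, entity_len):
    """Return a seq_len x seq_len matrix of 1.0/0.0 focusable weights."""
    m = [[1.0] * seq_len for _ in range(seq_len)]
    for i in range(entity_start, entity_start + entity_len):
        m[i] = [1.0 if entity_start <= j < entity_start + entity_len else 0.0
                for j in range(seq_len)]
    return m

for row in focus_matrix(seq_len=5, entity_start=1, entity_len=2):
    print(row)
```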
  • Patent number: 11908480
    Abstract: This disclosure proposes systems and methods for processing natural language inputs using data associated with multiple language recognition contexts (LRC). A system using multiple LRCs can receive input data from a device, identify a first identifier associated with the device, and further identify second identifiers associated with the first identifier and representing candidate users of the device. The system can access language processing data used for natural language processing for the LRCs corresponding to each of the first and second identifiers, and process the input data using the language processing data at one or more stages of automatic speech recognition, natural language understanding, entity resolution, and/or command execution. User recognition can reduce the number of candidate users, and thus the amount of data used to process the input data. Dynamic arbitration can select from between competing hypotheses representing the first identifier and a second identifier, respectively.
    Type: Grant
    Filed: March 23, 2020
    Date of Patent: February 20, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Da Teng, Adrian Evans, Naresh Narayanan
  • Patent number: 11908445
    Abstract: A method for proactive notifications in a voice interface device includes: receiving a first user voice request for an action with a future performance time; assigning the first user voice request to a voice assistant service for performance; subsequent to the receiving, receiving a second user voice request and in response to the second user voice request initiating a conversation with the user; and during the conversation: receiving a notification from the voice assistant service of performance of the action; triggering a first audible announcement to the user to indicate a transition from the conversation and interrupting the conversation; triggering a second audible announcement to the user to indicate performance of the action; and triggering a third audible announcement to the user to indicate a transition back to the conversation and rejoining the conversation.
    Type: Grant
    Filed: May 16, 2022
    Date of Patent: February 20, 2024
    Assignee: Google LLC
    Inventors: Kenneth Mixter, Daniel Colish, Tuan Nguyen
  • Patent number: 11900949
    Abstract: A neural network input unit 81 inputs a neural network in which a first network having a layer for inputting an anchor signal belonging to a predetermined class and a mixed signal including a target signal belonging to the class and a layer for outputting, as an estimation result, a reconstruction mask indicating a time-frequency domain in which the target signal is present in the mixed signal, and a second network having a layer for inputting the target signal extracted by applying the mixed signal to the reconstruction mask and a layer for outputting a result obtained by classifying the input target signal into a predetermined class are combined. A reconstruction mask estimation unit 82 applies the anchor signal and mixed signal to the first network to estimate the reconstruction mask of the class to which the anchor signal belongs.
    Type: Grant
    Filed: May 28, 2019
    Date of Patent: February 13, 2024
    Assignee: NEC CORPORATION
    Inventors: Takafumi Koshinaka, Hitoshi Yamamoto, Kaoru Koida, Takayuki Suzuki
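Applying a reconstruction mask of the kind described is an element-wise weighting over the time-frequency grid. The mask values and shapes below are illustrative assumptions:

```python
# Sketch: recover the target signal by weighting each time-frequency bin
# of the mixed signal by the estimated reconstruction mask.

def apply_mask(mixed_tf, mask_tf):
    """Element-wise product over a (time x frequency) grid."""
    return [[m * w for m, w in zip(row_mixed, row_mask)]
            for row_mixed, row_mask in zip(mixed_tf, mask_tf)]

mixed = [[2.0, 4.0], [6.0, 8.0]]   # magnitudes, 2 frames x 2 bins
mask = [[1.0, 0.5], [0.0, 1.0]]    # near 1.0 where the target dominates
print(apply_mask(mixed, mask))
```

The masked output is what the second network then classifies, closing the loop the abstract describes.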
  • Patent number: 11901904
    Abstract: A digitally controlled oscillator (100), a synthesizer module (200), a synthesizer (300), and a method for producing an electrical audio signal are presented. The oscillator (100) comprises a digital processing unit (10) configured to generate a first pulse wave at a first output (PulseUp) of the processing unit (10), wherein the first pulse wave is arranged to include pulses at at least two different frequencies. The oscillator (100) further comprises a summing circuit (30) and a linear wave shaper (20). The output (PulseUp) of the processing unit (10) is connected to the summing circuit (30) which is arranged to produce a resultant signal based on at least the first pulse wave. The resultant signal is arranged to be fed into the linear wave shaper (20) which is arranged to produce an output signal at the output (OUT) of the oscillator (100) based on modifying the resultant signal.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: February 13, 2024
    Assignee: SUPERCRITICAL OY
    Inventor: Timo Alho
  • Patent number: 11894005
    Abstract: An audio communication endpoint receives a bitstream containing spectral components representing spectral content of an audio signal, wherein the spectral components relate to a first range extending up to a first break frequency, above which any spectral components are unassigned. The endpoint adapts the received bitstream in accordance with a second range extending up to a second break frequency by removing spectral components or adding neutral-valued spectral components relating to a range between the first and second break frequencies. The endpoint then attenuates spectral content in a neighbourhood of the least of the first and second break frequencies for thereby achieving a gradual spectral decay. After this, the audio signal is reconstructed by an inverse transform operating on spectral components relating to said second range in the adapted and attenuated received bitstream. At small computational expense, the endpoint may adapt to different sample rates in received bitstreams.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: February 6, 2024
    Assignees: DOLBY LABORATORIES LICENSING CORPORATION, DOLBY INTERNATIONAL AB
    Inventors: Heiko Purnhagen, Leif Sehlstrom, Lars Villemoes, Glenn N. Dickins, Mark S Vinton
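The two adaptation steps can be sketched on a list of spectral coefficients: pad with neutral (zero) components or truncate to the target range, then attenuate a neighbourhood of the lower break bin for a gradual decay. The bin counts and the linear fade are assumptions, not Dolby's method:

```python
def adapt_spectrum(coeffs, target_bins, fade_bins=4):
    """Resize coeffs to target_bins, then fade out near the break bin."""
    first_bins = len(coeffs)
    if target_bins > first_bins:                  # add neutral components
        adapted = coeffs + [0.0] * (target_bins - first_bins)
    else:                                         # remove components
        adapted = coeffs[:target_bins]
    break_bin = min(first_bins, target_bins)      # least of the two ranges
    for k in range(max(0, break_bin - fade_bins), break_bin):
        # linear ramp down toward zero across the fade region
        adapted[k] *= (break_bin - k) / (fade_bins + 1)
    return adapted

spec = [1.0] * 8                 # 8 spectral components in the bitstream
print(adapt_spectrum(spec, 12))  # padded to 12 bins, faded near bin 8
```

An inverse transform over the adapted coefficients would then reconstruct the audio at the second range's sample rate.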