Sound Editing, Manipulating Voice Of The Synthesizer (EPO) Patents (Class 704/E13.004)
  • Patent number: 11948555
    Abstract: A method of processing a video file to generate a modified video file, the modified video file including a translated audio content of the video file, the method comprising: receiving the video file; accessing a facial model or a speech model for a specific speaker, wherein the facial model maps speech to facial expressions, and the speech model maps text to speech; receiving a reference content for the originating video file for the specific speaker; generating modified audio content for the specific speaker and/or modified facial expression for the specific speaker; and modifying the video file in accordance with the modified content and/or the modified expression to generate the modified video file.
    Type: Grant
    Filed: March 20, 2020
    Date of Patent: April 2, 2024
    Assignee: NEP SUPERSHOOTERS L.P.
    Inventors: Mark Christie, Gerald Chao
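The pipeline this abstract describes can be sketched end to end: translated reference content is run through a speech model (text to speech) and a facial model (speech to facial expressions), and both modified streams are written back into the video. The functions below are illustrative stand-ins, not the patented models.

```python
# Toy sketch of the dubbing pipeline (all model internals are assumptions).
def speech_model(text: str) -> list:
    # Stand-in TTS for the specific speaker: one "audio frame" per word.
    return [f"audio:{w}" for w in text.split()]

def facial_model(audio_frames: list) -> list:
    # Stand-in speech-to-expression mapping: one viseme frame per audio frame.
    return [f.replace("audio:", "viseme:") for f in audio_frames]

def modify_video(video: dict, reference_text: str) -> dict:
    # Generate modified audio, derive matching facial expressions,
    # then splice both back into the video file.
    audio = speech_model(reference_text)
    faces = facial_model(audio)
    return {**video, "audio_track": audio, "face_track": faces}

clip = modify_video({"id": "clip42"}, "bonjour le monde")
```

The key design point the claim hinges on is that the facial model is conditioned on the *modified* audio, so lip movement tracks the translated speech rather than the original.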
  • Patent number: 11942071
    Abstract: An information processing system includes at least one memory storing a program and at least one processor. The at least one processor implements the program to input a piece of sound source data obtained by encoding a first identification data representative of a sound source, a piece of style data obtained by encoding a second identification data representative of a performance style, and synthesis data representative of sounding conditions into a synthesis model generated by machine learning, and to generate, using the synthesis model, feature data representative of acoustic features of a target sound of the sound source to be generated in the performance style and according to the sounding conditions, and to generate an audio signal corresponding to the target sound using the generated feature data.
    Type: Grant
    Filed: May 4, 2021
    Date of Patent: March 26, 2024
    Assignee: YAMAHA CORPORATION
    Inventors: Ryunosuke Daido, Merlijn Blaauw, Jordi Bonada
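The data flow in this abstract can be sketched as follows: an encoded sound-source identifier, an encoded style identifier, and per-frame sounding conditions are concatenated and fed to a learned synthesis model that emits acoustic feature frames. The embedding tables and the toy "model" below are illustrative stand-ins, not Yamaha's trained network.

```python
from typing import List

# Assumed encodings of the first (sound source) and second (style) identification data.
SOURCE_EMBEDDINGS = {"flute": [1.0, 0.0], "violin": [0.0, 1.0]}
STYLE_EMBEDDINGS = {"legato": [0.5], "staccato": [-0.5]}

def synthesis_model(frame_input: List[float]) -> List[float]:
    """Stand-in for the machine-learned synthesis model: maps one
    conditioning frame to acoustic features (e.g. a spectrum slice)."""
    return [sum(frame_input) * w for w in (0.1, 0.2, 0.3)]

def generate_features(source: str, style: str,
                      conditions: List[List[float]]) -> List[List[float]]:
    src, sty = SOURCE_EMBEDDINGS[source], STYLE_EMBEDDINGS[style]
    # One conditioning frame per sounding condition (e.g. pitch, loudness).
    return [synthesis_model(src + sty + cond) for cond in conditions]

frames = generate_features("flute", "legato", [[60.0, 0.8], [62.0, 0.7]])
```

A vocoder would then turn the feature frames into the audio signal of the target sound, the final step the abstract names.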
  • Patent number: 11922942
    Abstract: Devices and techniques are generally described for generating response templates for natural language processing. In various examples, a first knowledge graph comprising a plurality of entities may be received. First text data may be received for a first response template, the first text data defining a natural language input configured to invoke the first response template. A response definition may be received for the first response template, the response definition defining a response associated with the first response template. Natural language input data may be received. A determination may be made that the natural language input data corresponds to the natural language input configured to invoke the first response template. The first response template may be configured to generate natural language output data.
    Type: Grant
    Filed: June 4, 2020
    Date of Patent: March 5, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Emre Can Kilinc, Thomas Reno, John Zucchi, Joshua Kaplan
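The invoke-and-respond flow described above can be sketched with a template that pairs an invoking natural-language input with a response definition, backed by a small entity graph. The matching rule (normalized string equality) and all data are assumptions for brevity.

```python
from dataclasses import dataclass

# Toy stand-in for the "first knowledge graph comprising a plurality of entities".
KNOWLEDGE_GRAPH = {"Acme": {"founder": "Jane Doe"}}

@dataclass
class ResponseTemplate:
    invocation: str           # natural-language input that invokes the template
    response_definition: str  # response associated with the template

def answer(utterance: str, entity: str, template: ResponseTemplate):
    # Determine whether the input corresponds to the template's invocation.
    if utterance.lower() != template.invocation.format(entity=entity).lower():
        return None
    facts = KNOWLEDGE_GRAPH[entity]
    # Generate natural language output from the response definition.
    return template.response_definition.format(entity=entity, **facts)

t = ResponseTemplate("who founded {entity}", "The founder of {entity} is {founder}.")
reply = answer("Who founded Acme", "Acme", t)
```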
  • Patent number: 11803399
    Abstract: A method, computer program product, and computer system for defining, at a computing device, psychometric data for a user. An interactive virtual assistant, selected from a plurality of interactive virtual assistants, may be provided on the computing device based upon, at least in part, the psychometric data defined for the user. The user may be prompted, via the interactive virtual assistant, with one or more options.
    Type: Grant
    Filed: May 31, 2020
    Date of Patent: October 31, 2023
    Assignee: Happy Money, Inc.
    Inventors: Adam Zarlengo, Chris Courtney, Michael Tepper, Josh Hemsley, Ryan Howes, Daniel Sinner, Scott Saunders
  • Patent number: 11798542
    Abstract: The disclosed computer-implemented method may include receiving input voice data synchronous with a visual state of a user interface of the third-party application, generating multiple sentence alternatives for the received input voice data, identifying a best sentence of the multiple sentence alternatives, executing a dialog script for the third-party application using the best sentence, the dialog script generating a response to the received voice data comprising output voice data and a corresponding visual response, and providing the visual response and the output voice data to the third-party application, the third-party application playing the output voice data synchronous with updating the user interface based on the visual response. Various other methods, systems, and computer-readable media are also disclosed.
    Type: Grant
    Filed: March 26, 2021
    Date of Patent: October 24, 2023
    Assignee: Alan AI, Inc.
    Inventors: Andrey Ryabov, Ramu V. Sunkara
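The n-best step in this abstract lends itself to a small sketch: the recognizer yields several sentence alternatives, the best one is selected by score, and a dialog script turns it into paired voice and visual output. The scoring and the toy script are assumptions, not the patented implementation.

```python
def pick_best(alternatives):
    # Each alternative: (sentence, recognizer confidence). Identify the best sentence.
    return max(alternatives, key=lambda a: a[1])[0]

def dialog_script(sentence, visual_state):
    # Toy dialog script: produce output voice data plus a visual response
    # keyed to the UI state the voice input arrived with.
    return {"output_voice": f"You said: {sentence}",
            "visual_response": {"screen": visual_state["screen"],
                                "highlight": sentence}}

alts = [("show my orders", 0.91), ("show my borders", 0.34)]
response = dialog_script(pick_best(alts), {"screen": "home"})
```

The synchronization the claim emphasizes would then amount to playing `output_voice` while applying `visual_response` to the third-party UI in the same step.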
  • Patent number: 11776537
    Abstract: A computer-implemented method is provided to optimize natural language processing of voice interaction data in product/service categorization and product/service application. The computer-implemented method receives, from a voice interaction device through a context discovery interface, user voice data corresponding to a user. Furthermore, the computer-implemented method performs, with an NLP engine, natural language processing of the user voice data to determine a context category. Additionally, the computer-implemented method selects, with an AI engine, one of a plurality of context-specific applier interfaces based on the context category. The computer-implemented method automatically transitions, with the AI engine, to said one of the plurality of context-specific applier interfaces. Finally, the computer-implemented method interacts, via the AI engine, with the user via a voice interaction to initiate the product/service application.
    Type: Grant
    Filed: December 7, 2022
    Date of Patent: October 3, 2023
    Assignee: Blue Lakes Technology, Inc.
    Inventors: Anand Menon, Satyaprashvitha Nara
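The category-then-dispatch flow above can be sketched with a keyword stand-in for the NLP engine: the utterance is mapped to a context category, and the category selects one of several context-specific applier interfaces. The keyword table and flow names are toy assumptions.

```python
# Assumed keyword-based stand-in for the NLP engine's context discovery.
CATEGORY_KEYWORDS = {"mortgage": "home_loan", "car": "auto_loan"}
APPLIER_INTERFACES = {
    "home_loan": lambda: "home-loan application flow",
    "auto_loan": lambda: "auto-loan application flow",
}

def determine_category(utterance: str) -> str:
    for keyword, category in CATEGORY_KEYWORDS.items():
        if keyword in utterance.lower():
            return category
    return "general"

def transition(utterance: str) -> str:
    # Select and automatically transition to a context-specific applier interface.
    category = determine_category(utterance)
    applier = APPLIER_INTERFACES.get(category, lambda: "general inquiry flow")
    return applier()

flow = transition("I want to buy a car")
```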
  • Patent number: 11741954
    Abstract: Provided are a method and an apparatus for providing an intelligent voice response at a voice assistant device. The method includes obtaining, by a voice assistant device, a voice input from a user, identifying non-speech input while obtaining the voice input, determining a correlation between the voice input and the non-speech input, and generating, based on the correlation, a response comprising an action related to the correlation or a suggestion related to the correlation.
    Type: Grant
    Filed: February 11, 2021
    Date of Patent: August 29, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Vinay Vasanth Patage, Sourabh Tiwari, Ravibhushan B. Tayshete
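The correlation step here can be sketched minimally: a non-speech input detected while the user speaks is correlated with the voice input, and the generated response carries an action or suggestion tied to that correlation. The correlation table is a toy assumption.

```python
# Assumed correlations between (voice input, concurrent non-speech input).
CORRELATED_SUGGESTIONS = {
    ("play music", "doorbell"): "Pause playback? Someone may be at the door.",
    ("set a timer", "sizzling"): "Should the timer be labelled 'cooking'?",
}

def respond(voice_input, non_speech=None):
    # If a non-speech input was identified while obtaining the voice input,
    # attach a suggestion related to the correlation; otherwise just act.
    if non_speech is not None:
        suggestion = CORRELATED_SUGGESTIONS.get((voice_input, non_speech))
        if suggestion:
            return {"action": voice_input, "suggestion": suggestion}
    return {"action": voice_input, "suggestion": None}

result = respond("play music", non_speech="doorbell")
```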
  • Patent number: 11741941
    Abstract: A discriminator trained on labeled samples of speech can compute probabilities of voice properties. A speech synthesis generative neural network that takes in text and continuous scale values of voice properties is trained to synthesize speech audio that the discriminator will infer as matching the values of the input voice properties. Voice parameters can include speaker voice parameters, accents, and attitudes, among others. Training can be done by transfer learning from an existing neural speech synthesis model or such a model can be trained with a loss function that considers speech and parameter values. A graphical user interface can allow voice designers for products to synthesize speech with a desired voice or generate a speech synthesis engine with frozen voice parameters. A vector of parameters can be used for comparison to previously registered voices in databases such as ones for trademark registration.
    Type: Grant
    Filed: June 7, 2021
    Date of Patent: August 29, 2023
    Assignee: SoundHound, Inc.
    Inventor: Andrew Richards
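The training signal described above can be sketched with toy models: a discriminator predicts voice-property values from synthesized audio, and the generator is penalized for the gap between the requested continuous property values and what the discriminator infers. Both "models" below are illustrative stand-ins.

```python
def discriminator(audio):
    # Stand-in discriminator: "infers" two voice properties from the
    # signal (its mean and its spread) instead of learned probabilities.
    mean = sum(audio) / len(audio)
    spread = max(audio) - min(audio)
    return [mean, spread]

def property_loss(requested_properties, audio):
    # Loss term pushing synthesized speech to match the continuous
    # input voice-property values, as judged by the discriminator.
    inferred = discriminator(audio)
    return sum((r - i) ** 2 for r, i in zip(requested_properties, inferred))

loss = property_loss([0.0, 2.0], [-1.0, 0.0, 1.0])
```

In the patent's framing this term would be combined with a speech-quality loss, so the generator learns both to sound natural and to realize the requested voice parameters.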
  • Patent number: 11455985
    Abstract: An information processing apparatus determines, on the basis of a speech of a user to be evaluated, a reference feature quantity representing a feature of the user's speech at normal times, acquires audio feature quantity data of a target speech to be evaluated made by the user, and evaluates the feature of the target speech on the basis of a comparison result between the audio feature quantity of the target speech and the reference feature quantity.
    Type: Grant
    Filed: February 9, 2017
    Date of Patent: September 27, 2022
    Assignee: SONY INTERACTIVE ENTERTAINMENT INC.
    Inventors: Shinichi Kariya, Shinichi Honda, Hiroyuki Segawa
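The comparison this abstract describes can be sketched directly: a reference feature quantity is estimated from the user's speech at normal times, and a target utterance is evaluated by its deviation from that reference. Reducing each utterance to a single scalar feature is an illustrative simplification.

```python
def reference_feature(normal_utterance_features):
    # Reference feature quantity: e.g. average pitch over everyday speech.
    return sum(normal_utterance_features) / len(normal_utterance_features)

def evaluate(target_feature, reference):
    # Evaluate the target speech by comparison against the reference.
    deviation = target_feature - reference
    if deviation > 0:
        return "raised"
    return "lowered" if deviation < 0 else "normal"

ref = reference_feature([200.0, 210.0, 190.0])   # Hz, speech at normal times
verdict = evaluate(230.0, ref)
```

Anchoring to a per-user baseline, rather than a population norm, is the point of the claim: the same 230 Hz utterance could read as "raised" for one speaker and "normal" for another.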
  • Patent number: 11430445
    Abstract: A system including one or more processors and one or more non-transitory computer-readable media storing computing instructions configured to run on the one or more processors and perform receiving a voice command from a user to perform a virtual action intended to apply to one item of two or more items in a cart of the user; generating a concept vector representing a concept in the voice command; transforming the respective item attributes for each of the two or more items into a respective feature vector; generating a respective candidate score for the each of the two or more items; identifying the one item to which the voice command is intended to apply; and executing an action with respect to the one item based on the voice command. Other embodiments are disclosed.
    Type: Grant
    Filed: January 30, 2020
    Date of Patent: August 30, 2022
    Assignee: WALMART APOLLO, LLC
    Inventors: Ghodratollah Aalipour Hafshejani, Phani Ram Sayapaneni
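The scoring step above can be sketched with an assumed similarity metric: the voice command becomes a concept vector, each cart item's attributes become a feature vector, and the item with the highest candidate score is taken as the command's target. Cosine similarity is a stand-in choice, not necessarily the patented scorer.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def identify_item(concept_vector, items):
    # items: item name -> feature vector transformed from its attributes.
    scores = {name: cosine(concept_vector, vec) for name, vec in items.items()}
    return max(scores, key=scores.get)

cart = {"whole milk": [1.0, 0.0, 0.2], "almond milk": [0.9, 0.4, 0.1]}
target = identify_item([1.0, 0.0, 0.25], cart)  # concept from "remove the milk"
```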
  • Patent number: 11380327
    Abstract: The present disclosure relates to the field of intelligent communications, and discloses a speech communication system and method with human-machine coordination. The system addresses the poor client experience caused by the abrupt differences a client perceives when a call is switched over under prior human-machine coordination, and the client time wasted as a result.
    Type: Grant
    Filed: April 2, 2021
    Date of Patent: July 5, 2022
    Assignee: NANJING SILICON INTELLIGENCE TECHNOLOGY CO., LTD.
    Inventor: Huapeng Sima
  • Patent number: 11361751
    Abstract: In a speech synthesis method, an emotion intensity feature vector is set for a target synthesis text, an acoustic feature vector corresponding to an emotion intensity is generated based on the emotion intensity feature vector by using an acoustic model, and a speech corresponding to the emotion intensity is synthesized based on the acoustic feature vector. The emotion intensity feature vector is continuously adjustable, and emotion speeches of different intensities can be generated based on values of different emotion intensity feature vectors, so that emotion types of a synthesized speech are more diversified. This application may be applied to a human-computer interaction process in the artificial intelligence (AI) field, to perform intelligent emotion speech synthesis.
    Type: Grant
    Filed: April 8, 2021
    Date of Patent: June 14, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Liqun Deng, Jiansheng Wei, Wenhua Sun
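The continuously adjustable intensity this abstract emphasizes can be sketched as scaling a base emotion embedding by a scalar: any in-between strength is representable, not just discrete categories. The basis vectors are toy assumptions.

```python
# Assumed base embeddings per emotion type.
EMOTION_BASIS = {"happy": [0.8, 0.1], "sad": [-0.6, 0.3]}

def emotion_intensity_vector(emotion, intensity):
    # intensity is a continuous scalar, so values like 0.37 yield
    # intermediate emotion strengths in the synthesized speech.
    return [intensity * x for x in EMOTION_BASIS[emotion]]

mild = emotion_intensity_vector("happy", 0.25)
strong = emotion_intensity_vector("happy", 1.0)
```

The acoustic model would consume this vector alongside the text to produce acoustic features, and a vocoder would synthesize the final speech, per the abstract's pipeline.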
  • Patent number: 8311830
    Abstract: Provided is a system and method for building and managing a customized voice of an end-user, comprising the steps of designing a set of prompts for collection from the user, wherein the prompts are selected from both an analysis tool and by the user's own choosing to capture voice characteristics unique to the user. The prompts are delivered to the user over a network to allow the user to save a user recording on a server of a service provider. This recording is then retrieved and stored on the server and then set up on the server to build a voice database using text-to-speech synthesis tools. A graphical interface allows the user to continuously refine the data file to improve the voice and customize parameter and configuration settings, thereby forming a customized voice database which can be deployed or accessed.
    Type: Grant
    Filed: December 6, 2011
    Date of Patent: November 13, 2012
    Assignee: Cepstral, LLC
    Inventors: Craig F. Campbell, Kevin A. Lenzo, Alexandre D. Cox
  • Patent number: 8311831
    Abstract: A voice emphasizing device emphasizes in a speech a “strained rough voice” at a position where a speaker or user of the speech intends to generate emphasis or musical expression. Thereby, the voice emphasizing device can provide the position with emphasis of anger, excitement, tension, or an animated way of speaking, or musical expression of Enka (Japanese ballad), blues, rock, or the like. As a result, rich vocal expression can be achieved. The voice emphasizing device includes: an emphasis utterance section detection unit (12) detecting, from an input speech waveform, an emphasis section that is a time duration having a waveform intended by the speaker or user to be converted; and a voice emphasizing unit (13) increasing fluctuation of an amplitude envelope of the waveform in the detected emphasis section.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: November 13, 2012
    Assignee: Panasonic Corporation
    Inventors: Yumiko Kato, Takahiro Kamai, Masakatsu Hoshimi
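The emphasis operation in this abstract is concrete enough for a DSP sketch: within a detected emphasis section, the amplitude envelope is given extra fluctuation by amplitude-modulating the samples. The roughly 80 Hz modulation rate and the depth below are illustrative choices, not values from the patent.

```python
import math

def emphasize(samples, sample_rate, start, end, mod_hz=80.0, depth=0.5):
    # Increase fluctuation of the amplitude envelope inside the
    # emphasis section [start, end); leave the rest of the waveform intact.
    out = list(samples)
    for n in range(start, end):
        t = n / sample_rate
        out[n] = samples[n] * (1.0 + depth * math.sin(2 * math.pi * mod_hz * t))
    return out

# A 220 Hz tone at 8 kHz, "strained" only in its middle half-second.
tone = [math.sin(2 * math.pi * 220 * n / 8000) for n in range(8000)]
rough = emphasize(tone, 8000, start=2000, end=6000)
```

Low-frequency amplitude modulation of this kind is a common way to approximate the irregular envelope of a strained or "rough" voice.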
  • Patent number: 8086457
    Abstract: Provided is a system and method for building and managing a customized voice of an end-user, comprising the steps of designing a set of prompts for collection from the user, wherein the prompts are selected from both an analysis tool and by the user's own choosing to capture voice characteristics unique to the user. The prompts are delivered to the user over a network to allow the user to save a user recording on a server of a service provider. This recording is then retrieved and stored on the server and then set up on the server to build a voice database using text-to-speech synthesis tools. A graphical interface allows the user to continuously refine the data file to improve the voice and customize parameter and configuration settings, thereby forming a customized voice database which can be deployed or accessed.
    Type: Grant
    Filed: May 29, 2008
    Date of Patent: December 27, 2011
    Assignee: Cepstral, LLC
    Inventors: Craig F. Campbell, Kevin A. Lenzo, Alexandre D. Cox
  • Publication number: 20090076822
    Abstract: A sequence is received of time domain digital audio samples representing sound (e.g., a sound generated by a human voice or a musical instrument). The time domain digital audio samples are processed to derive a corresponding sequence of audio pulses in the time domain. Each of the audio pulses is associated with a characteristic frequency. Frequency domain information is derived about each of at least some of the audio pulses. The sound represented by the time domain digital audio samples is transformed by processing the audio pulses using the frequency domain information.
    Type: Application
    Filed: September 13, 2007
    Publication date: March 19, 2009
    Inventor: Jordi Bonada Sanjaume
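The pulse-based flow above can be sketched roughly: the time-domain samples are cut into pitch-synchronous pulses, frequency-domain information is derived per pulse, and a characteristic frequency is attached to each. The fixed pulse length and naive DFT below are simplifying assumptions.

```python
import cmath
import math

def split_into_pulses(samples, period):
    # Cut the time-domain samples into fixed-length pulses (one pitch period each).
    return [samples[i:i + period]
            for i in range(0, len(samples) - period + 1, period)]

def dft(pulse):
    # Naive discrete Fourier transform: frequency-domain info for one pulse.
    n = len(pulse)
    return [sum(x * cmath.exp(-2j * math.pi * k * m / n)
                for m, x in enumerate(pulse)) for k in range(n)]

def characteristic_frequency(pulse, sample_rate):
    spectrum = dft(pulse)
    # Strongest bin below Nyquist, ignoring DC.
    k = max(range(1, len(pulse) // 2 + 1), key=lambda i: abs(spectrum[i]))
    return k * sample_rate / len(pulse)

sr, period = 8000, 80                      # 100 Hz pitch -> 80-sample pulses
wave = [math.sin(2 * math.pi * 100 * n / sr) for n in range(800)]
pulses = split_into_pulses(wave, period)
f0 = characteristic_frequency(pulses[0], sr)
```

Transforming the sound would then mean processing each pulse's spectrum (e.g. shifting or reshaping it) and overlap-adding the modified pulses back into a waveform.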
  • Publication number: 20090006097
    Abstract: Pronunciation correction for text-to-speech (TTS) systems and speech recognition (SR) systems between different languages is provided. If a word requiring pronunciation by a target language TTS or SR is from a same language as the target language, but is not found in a lexicon of words from the target language, a letter-to-speech (LTS) rules set of the target language is used to generate a letter-to-speech output for the word for use by the TTS or SR configured according to the target language. If the word is from a different language as the target language, phonemes comprising the word according to its native language are mapped to phonemes of the target language. The phoneme mapping is used by the TTS or SR configured according to the target language for generating or recognizing an audible form of the word according to the target language.
    Type: Application
    Filed: June 29, 2007
    Publication date: January 1, 2009
    Applicant: Microsoft Corporation
    Inventors: Cameron Ali Etezadi, Timothy David Sharpe
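The three-way fallback in this abstract maps cleanly to a sketch: a word found in the target-language lexicon uses its stored pronunciation; a same-language unknown word falls back to letter-to-speech rules; a foreign word has its native phonemes mapped to target-language phonemes. All tables here are toy data, not real phoneme inventories.

```python
# Toy target-language lexicon and native-to-target phoneme mapping.
TARGET_LEXICON = {"hello": ["HH", "AH", "L", "OW"]}
PHONEME_MAP = {"R_uvular": "R", "Y_front": "UW"}

def letter_to_speech(word):
    # Crude stand-in for the target language's LTS rules set.
    return [c.upper() for c in word]

def pronounce(word, native_phonemes=None):
    if word in TARGET_LEXICON:
        return TARGET_LEXICON[word]
    if native_phonemes is None:
        return letter_to_speech(word)        # same language, not in lexicon
    return [PHONEME_MAP.get(p, p) for p in native_phonemes]  # foreign word

p1 = pronounce("hello")
p2 = pronounce("zyx")
p3 = pronounce("rue", native_phonemes=["R_uvular", "Y_front"])
```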
  • Publication number: 20080235025
    Abstract: A prosody modification device includes: a real voice prosody input part that receives real voice prosody information extracted from an utterance of a human; a regular prosody generating part that generates regular prosody information having a regular phoneme boundary that determines a boundary between phonemes and a regular phoneme length of a phoneme by using data representing a regular or statistical phoneme length in an utterance of a human with respect to a section including at least a phoneme or a phoneme string to be modified in the real voice prosody information; and a real voice prosody modification part that resets a real voice phoneme boundary by using the generated regular prosody information so that the real voice phoneme boundary and a real voice phoneme length of the phoneme or the phoneme string to be modified in the real voice prosody information are approximate to an actual phoneme boundary and an actual phoneme length of the utterance of the human, thereby modifying the real voice prosody information.
    Type: Application
    Filed: February 11, 2008
    Publication date: September 25, 2008
    Applicant: FUJITSU LIMITED
    Inventors: Kentaro Murase, Nobuyuki Katae
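The boundary-reset step above can be sketched as pulling measured ("real voice") phoneme lengths toward regular/statistical lengths for the phonemes being modified. The interpolation weight is an assumed knob, not a parameter from the patent.

```python
# Assumed regular/statistical phoneme lengths in milliseconds.
REGULAR_LENGTH_MS = {"a": 90.0, "k": 60.0}

def modify_lengths(real_lengths, weight=0.5):
    # real_lengths: list of (phoneme, measured length in ms) extracted
    # from a human utterance. Blend each toward its regular length.
    out = []
    for phoneme, measured in real_lengths:
        regular = REGULAR_LENGTH_MS.get(phoneme, measured)
        out.append((phoneme, (1 - weight) * measured + weight * regular))
    return out

fixed = modify_lengths([("k", 40.0), ("a", 130.0)])
```

Resetting the phoneme boundaries would then follow from the cumulative sums of the blended lengths.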