Patents Assigned to VOCALID, INC.

Image-based approaches to identifying the source of audio data

Patent number: 11062698

Abstract: Image-based machine learning approaches are used to classify audio data, such as speech data, as authentic or otherwise. For example, audio data can be obtained and a visual representation of the audio data can be generated. The visual representation can include, for example, an image such as a spectrogram or other visual or electronic representation of the audio data. Before processing the image, the audio data and/or image may undergo various preprocessing techniques. Thereafter, the image representation of the audio data can be analyzed using a trained model to classify the audio data as authentic or otherwise.

Type: Grant

Filed: October 24, 2019

Date of Patent: July 13, 2021

Assignee: VocaliD, INC.

Inventors: Rupal Patel, Geoffrey S Meltzner, Markus Toman

IMAGE-BASED APPROACHES TO IDENTIFYING THE SOURCE OF AUDIO DATA

Publication number: 20200184955

Abstract: Image-based machine learning approaches are used to classify audio data, such as speech data, as authentic or otherwise. For example, audio data can be obtained and a visual representation of the audio data can be generated. The visual representation can include, for example, an image such as a spectrogram or other visual or electronic representation of the audio data. Before processing the image, the audio data and/or image may undergo various preprocessing techniques. Thereafter, the image representation of the audio data can be analyzed using a trained model to classify the audio data as authentic or otherwise.

Type: Application

Filed: October 24, 2019

Publication date: June 11, 2020

Applicant: VocaliD, INC.

Inventors: Rupal PATEL, Geoffrey S. MELTZNER, Markus TOMAN

Image-based approaches to classifying audio data

Patent number: 10504504

Abstract: Image-based machine learning approaches are used to classify audio data, such as speech data, as authentic or otherwise. For example, audio data can be obtained and a visual representation of the audio data can be generated. The visual representation can include, for example, an image such as a spectrogram or other visual or electronic representation of the audio data. Before processing the image, the audio data and/or image may undergo various preprocessing techniques. Thereafter, the image representation of the audio data can be analyzed using a trained model to classify the audio data as authentic or otherwise.

Type: Grant

Filed: December 7, 2018

Date of Patent: December 10, 2019

Assignee: VocaliD, INC.

Inventors: Geoffrey S Meltzner, Rupal Patel, Markus Toman

Aging a text-to-speech voice

Patent number: 9558734

Abstract: A voice recipient may request a text-to-speech (TTS) voice that corresponds to an age or age range. An existing TTS voice or existing voice data may be used to create a TTS voice corresponding to the requested age by encoding the voice data to voice parameter values, transforming the voice parameter values using a voice-aging model, synthesizing voice data using the transformed parameter values, and then creating a TTS voice using the transformed voice data. The voice-aging model may model how one or more voice parameters of a voice change with age and may be created from voice data stored in a voice bank.

Type: Grant

Filed: April 26, 2016

Date of Patent: January 31, 2017

Assignee: VOCALID, INC.

Inventors: Rupal Patel, Geoffrey Seth Meltzner

Distributed collection and processing of voice bank data

Patent number: 9336782

Abstract: Voice data may be collected by a plurality of voice donors and stored in a voice bank. A voice donor may authenticate to a voice collection system to start a session to provide voice data. During the voice collection session, the voice donor may be presented with a sequence of prompts to speak and voice data may be transferred to a server. The received voice data may be processed to determine the speech units spoken by the voice donor and a count of speech units received from the voice donor may be updated. Feedback may be provided to the voice donor indicating, for example, a progress of the voice collection, a quality level of the voice data, or information about speech unit counts. The voice bank may be used to create TTS voices for voice recipients, create a model of voice aging, or for other applications.

Type: Grant

Filed: June 29, 2015

Date of Patent: May 10, 2016

Assignee: VOCALID, INC.

Inventor: Rupal Patel
