Abstract: A sound signal is received from a room of people, the sound signal including speech of a lecturer. A Fourier transform is applied to the sound signal to produce a spectrogram. An encoding of the spectrogram in a multi-dimensional space is computed using an encoder. A seed is found in the multi-dimensional space, where the seed is a position in the multi-dimensional space which encodes a spectrogram from another lecturer known to have performance on an automated speech recognition tool above a threshold. The encoding is modified by moving the location of the encoding in the multi-dimensional space towards the seed. The modified encoding is decoded into a decoded signal. A reverse Fourier transform is applied to the decoded signal to produce an output sound signal. The output sound signal is sent to the automated speech recognition tool to generate a transcript.
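The step of moving the encoding towards the seed can be illustrated with a minimal sketch. The abstract does not say how the movement is computed, so the straight-line interpolation below is an assumption, as are the function name `move_towards_seed`, the `step` parameter, and the three-dimensional example vectors. The encoder, decoder, and Fourier transform stages are left out.

```python
from typing import List, Sequence


def move_towards_seed(
    encoding: Sequence[float], seed: Sequence[float], step: float
) -> List[float]:
    """Return a point on the straight line from `encoding` to `seed`.

    Assumption: linear interpolation stands in for the unspecified
    "moving towards the seed" operation. step = 0.0 leaves the encoding
    unchanged; step = 1.0 lands exactly on the seed.
    """
    if len(encoding) != len(seed):
        raise ValueError("encoding and seed must have the same dimensionality")
    if not 0.0 <= step <= 1.0:
        raise ValueError("step must lie in [0, 1]")
    return [z + step * (s - z) for z, s in zip(encoding, seed)]


# Hypothetical 3-dimensional latent vectors, for illustration only.
lecturer_encoding = [0.0, 2.0, -4.0]  # encoding of the lecturer's spectrogram
seed_encoding = [1.0, 0.0, 0.0]       # encoding of a well-recognised speaker
modified = move_towards_seed(lecturer_encoding, seed_encoding, step=0.5)
print(modified)  # [0.5, 1.0, -2.0]
```

In this sketch a larger `step` pulls the encoding closer to the seed, and a smaller one preserves more of the original lecturer's encoding. How far to move is a design choice that the abstract does not fix.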
Type:
Application
Filed:
August 8, 2023
Publication date:
December 12, 2024
Applicant:
Habitat Learn Ltd
Inventors:
Yunjia LI, Jeremy Guy BRASSINGTON, Daniel Jeffery GOERZ