Patents by Inventor Sunil Bharitkar

Sunil Bharitkar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Deep learning based voice extraction and primary-ambience decomposition for stereo to surround upmixing with dialog-enhanced center channel

Patent number: 12581266

Abstract: One embodiment provides a computer-implemented method that includes determining directional sounds from a content mix using a machine learning unmixing model. The directional sounds are panned in an upmixed signal. Signal-dependent upmixing gains for specific frequency bins are computed on a frame-basis using a machine learning model for the upmixed signal. Dedicated voice clarity gains are computed using a hearing impairment model for multiple hearing-impaired profiles for achieving dialog enhancement. The signal dependent upmixing gains and voice clarity gains are transmitted as metadata with a downmixed signal representing the content mix.

Type: Grant

Filed: December 28, 2023

Date of Patent: March 17, 2026

Assignee: Samsung Electronics Co., Ltd.

Inventors: Sunil Bharitkar, Ricardo Thaddeus Páez Amaro, Carlos Tejeda Ocampo, Luis Madrid Herrera
Signal normalization using loudness metadata for audio processing

Patent number: 12563339

Abstract: One embodiment provides a method of signal normalization. The method comprises receiving an input content with a corresponding audio signal, and extracting loudness metadata from an audio signal corresponding to the input content. The method further comprises estimating, using a machine learning model, a peak-level amplitude based on the loudness metadata. The peak-level amplitude represents a maximum linear amplitude of the audio signal over an entire duration of the input content. The method further comprises determining a gain based at least on the peak-level amplitude, and applying the gain to the audio signal. The resulting gain-scaled audio signal is provided to one or more speakers coupled to or integrated in an electronic device for audio playback.

Type: Grant

Filed: December 28, 2023

Date of Patent: February 24, 2026

Assignee: Samsung Electronics Co., Ltd.

Inventor: Sunil Bharitkar
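
The gain step in the abstract above can be illustrated with a minimal sketch. It assumes the peak-level amplitude has already been estimated (the patent uses a machine learning model on loudness metadata, which is not reproduced here) and that the gain is simply the ratio of a target peak to the estimated peak. The function names and the `target_peak` parameter are hypothetical, not taken from the patent.

```python
def normalization_gain(estimated_peak: float, target_peak: float = 1.0) -> float:
    """Linear gain that maps a signal with peak `estimated_peak` to `target_peak`.

    Assumption: gain = target / estimate. The abstract only says the gain is
    determined "based at least on" the peak-level amplitude.
    """
    if estimated_peak <= 0.0:
        return 1.0  # silent or invalid estimate: leave the signal unchanged
    return target_peak / estimated_peak


def apply_gain(samples, gain):
    """Scale every sample by the same linear gain."""
    return [s * gain for s in samples]


# Hypothetical values: the peak estimate would come from the model.
signal = [0.1, -0.25, 0.2]
gain = normalization_gain(estimated_peak=0.25, target_peak=0.5)
scaled = apply_gain(signal, gain)
```

Because the peak is estimated for the whole content up front, a single static gain can be applied without look-ahead limiting; whether the patented method does exactly this is not stated in the abstract.
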
Deep learning for multimedia classification

Patent number: 12462102

Abstract: One embodiment provides a computer-implemented method that includes utilizing text information obtained from a title of a media content item and a trainable model for improving accuracy for classification of the media content item. The trainable model is utilized using a sequence of text to numeric-vector embeddings for classification of the media content item. At least one of a word embedding model parameter or a latent semantic analysis dimension is jointly optimized using the text information, and a classifier model for maximizing accuracy of the classification of the media content item.

Type: Grant

Filed: October 3, 2023

Date of Patent: November 4, 2025

Assignee: Samsung Electronics Co., Ltd.

Inventor: Sunil Bharitkar
Surround sound to immersive audio upmixing based on video scene analysis

Patent number: 12445799

Abstract: One embodiment provides a method of audio upmixing comprising performing video scene analysis by segmenting visual objects from video frames of a video, and performing audio analysis by extracting audio signals from an audio corresponding to the video. The method further comprises determining whether any of the audio signals correspond to any of the visual objects, and estimating a video-based trajectory of a visual object if the visual object is in motion and transitions from on-screen to off-screen, or vice versa, during the video. The method further comprises positioning an audio trajectory of an audio signal from at least one speaker associated with the display to at least one other speaker associated with providing surround sound. The audio trajectory is automatically matched with the video. The audio signal is delivered to the at least one speaker and the at least one other speaker for audio reproduction during the presentation.

Type: Grant

Filed: September 27, 2023

Date of Patent: October 14, 2025

Assignee: Samsung Electronics Co., Ltd.

Inventors: Allan Devantier, Sunil Bharitkar, Seongnam Oh, Carlos Tejeda Ocampo
Adaptive Ambisonics Compression

Publication number: 20250279104

Abstract: In one embodiment, a method includes accessing a set of Ambisonics data encoding an audio signal and including multiple Ambisonics channels. The method further includes applying a singular value decomposition transform to the set of Ambisonics data to decorrelate the Ambisonics channels; determining, for each decorrelated Ambisonics channel, a relative energy of that decorrelated Ambisonics channel, based on the decorrelating singular value decomposition transform; and determining, for each decorrelated Ambisonics channel, an allocated bitrate from an available bitrate, where the allocated bitrate is based on the relative energy of the respective decorrelated Ambisonics channel; and encoding each decorrelated Ambisonics channel according to the allocated bitrate.

Type: Application

Filed: November 10, 2024

Publication date: September 4, 2025

Inventors: Mahmoud Namazi, Toni Hirvonen, Sunil Bharitkar
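
As a rough illustration of the bitrate allocation step described above, the sketch below assumes the singular values of the Ambisonics data have already been computed (for example with an SVD routine from a numerical library). It also assumes that the relative energy of each decorrelated channel is its squared singular value divided by the total, and that bitrate is split in direct proportion to that energy. The abstract states only that the allocation is "based on" relative energy, so the proportional rule and the function name are assumptions.

```python
def allocate_bitrates(singular_values, total_bitrate_kbps):
    """Split an available bitrate across decorrelated channels.

    Assumption: channel energy is the squared singular value, and each
    channel receives a share of the bitrate proportional to its energy.
    """
    energies = [s * s for s in singular_values]
    total_energy = sum(energies)
    if total_energy == 0:
        # No energy anywhere: fall back to an even split.
        even_share = total_bitrate_kbps / len(singular_values)
        return [even_share] * len(singular_values)
    return [total_bitrate_kbps * e / total_energy for e in energies]


# Hypothetical singular values for four decorrelated channels.
rates = allocate_bitrates([3.0, 2.0, 1.0, 1.0], total_bitrate_kbps=150.0)
```

With these example values the dominant channel receives most of the budget, which is the intended effect of decorrelating before allocation.
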
Audio Codec Bitrate Using Directional Loudness

Publication number: 20250191593

Abstract: In one embodiment, a method includes accessing a window of audio that includes audio signals. The method further includes determining, for each of the audio signals, a power of that signal relative to a reference audio signal; determining, for each of the audio signals and based on the determined relative power of that signal, a gain to apply to the audio signal relative to the reference audio signal, where the gain depends on frequency and on directionality relative to a listener; and encoding the audio so that bandwidth is allocated based on directionality relative to the listener by allocating an amount of bandwidth to each of the audio signals based on their respective frequency-dependent and directionality-dependent determined gains.

Type: Application

Filed: November 6, 2024

Publication date: June 12, 2025

Inventors: Sunil Bharitkar, Toni Hirvonen
Bayesian optimization for simultaneous deconvolution of room impulse responses

Patent number: 12323780

Abstract: One embodiment provides a method comprising optimizing one or more stimuli parameters by applying machine learning to training data. The method further comprises determining, based on the one or more optimized stimuli parameters, stimuli for simultaneously exciting a plurality of speakers within a spatial area. The stimuli has a shortest possible duration that is accurate for simultaneous deconvolution of a plurality of impulse responses of the plurality of speakers. The method further comprises simultaneously exciting the plurality of speakers by providing the stimuli to the plurality of speakers at the same time for reproduction. The method further comprises simultaneously deconvolving the plurality of impulse responses based on the stimuli and one or more measurements of sound recorded during the reproduction and arriving at one or more microphones within the spatial area.

Type: Grant

Filed: November 9, 2022

Date of Patent: June 3, 2025

Assignee: Samsung Electronics Co., Ltd.

Inventor: Sunil Bharitkar
Double talk detectors

Patent number: 12267463

Abstract: In example implementations, an apparatus is provided. The apparatus includes an adaptive filter and a double talk detector in communication with the adaptive filter. The adaptive filter is to calculate a transfer function with coefficients for a particular time that is applied to an output signal of a microphone to cancel echoes caused by a reference signal in the output signal of the microphone. The double talk detector is to determine a peak of the coefficients, detect double talk based on a location of the peak of the coefficients, and transmit a pause signal to the adaptive filter in response to detection of the double talk, wherein the pause signal is to pause a calculation of updates to the coefficients by the adaptive filter.

Type: Grant

Filed: April 15, 2020

Date of Patent: April 1, 2025

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Srikanth Kuthuru, Sunil Bharitkar
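
The detection logic in the abstract above can be sketched as follows. The abstract says only that double talk is detected "based on a location of the peak of the coefficients"; the specific rule used here (flag double talk when the peak drifts more than a tolerance away from an expected echo-path delay) is an assumption for illustration, as are the function and parameter names.

```python
def peak_index(coefficients):
    """Index of the adaptive filter coefficient with the largest magnitude."""
    return max(range(len(coefficients)), key=lambda i: abs(coefficients[i]))


def is_double_talk(coefficients, expected_delay, tolerance):
    """Assumed rule: the peak has moved away from the expected echo delay."""
    return abs(peak_index(coefficients) - expected_delay) > tolerance


def should_pause_update(coefficients, expected_delay, tolerance=1):
    """Pause signal for the adaptive filter: True means skip the update."""
    return is_double_talk(coefficients, expected_delay, tolerance)


# Hypothetical coefficient snapshots at two different times.
converged = [0.01, 0.05, 0.9, 0.1, 0.02]   # peak at index 2
disturbed = [0.4, 0.05, 0.1, 0.1, -0.7]    # peak at index 4
pause_converged = should_pause_update(converged, expected_delay=2)
pause_disturbed = should_pause_update(disturbed, expected_delay=2)
```

Pausing the coefficient update while near-end speech is present keeps the filter from diverging; the threshold and expected delay would need to be chosen for the actual system.
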
Video-derived audio processing

Patent number: 12231865

Abstract: One embodiment provides a computer-implemented method that includes creating, during content production, an audio object and metadata associated with the audio object based on a motion vector analysis of an object in one or more image frames in a video. The method can include, during the content production, inserting the audio object and the metadata associated with the audio object into at least one of an audio encoder or a video encoder. The method can include, during content playback, rendering the audio object, without image frame analysis, based on decoding the audio object and parsing the metadata associated with the audio object.

Type: Grant

Filed: January 13, 2023

Date of Patent: February 18, 2025

Assignee: Samsung Electronics Co., Ltd.

Inventors: Sunil Bharitkar, Seongnam Oh, Carlos Tejeda Ocampo
SPECTROGRAM BASED TIME ALIGNMENT FOR INDEPENDENT RECORDING AND PLAYBACK SYSTEMS

Publication number: 20250048050

Abstract: One embodiment provides a computer-implemented method that includes sending a stimulus signal to a loudspeaker. A measurement signal is received via a microphone. The stimulus signal is transformed into a stimulus time-frequency representation. The measured signal is transformed into a measured time-frequency representation. At least one frequency value is selected between the stimulus time-frequency representation and the measured time-frequency representation. Correlation analysis is performed using the selected at least one frequency value. Based on the correlation analysis, a statistical mode is determined to produce a start-time of the stimulus signal.

Type: Application

Filed: July 31, 2024

Publication date: February 6, 2025

Inventors: Sunil Bharitkar, Allan Devantier, Ashish Y. Rawat
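
The correlation-and-mode idea in the abstract above might look like the sketch below. It assumes both time-frequency representations are given as lists of rows, one row of frame magnitudes per selected frequency, that an unnormalized sliding dot product serves as the correlation measure, and that frames are spaced `hop_seconds` apart. These choices and all names are assumptions; the abstract does not specify them.

```python
from statistics import mode


def best_lag(stimulus_row, measured_row):
    """Frame lag at which the measured row best matches the stimulus row."""
    max_lag = len(measured_row) - len(stimulus_row)

    def score(lag):
        # Unnormalized correlation at this lag (an assumed measure).
        return sum(s * measured_row[lag + t] for t, s in enumerate(stimulus_row))

    return max(range(max_lag + 1), key=score)


def estimate_start_time(stimulus_tf, measured_tf, hop_seconds):
    """Take the most common per-frequency lag and convert it to seconds."""
    lags = [best_lag(s, m) for s, m in zip(stimulus_tf, measured_tf)]
    return mode(lags) * hop_seconds


# Hypothetical magnitudes for three selected frequencies.
stimulus_tf = [[1, 2, 1], [2, 1, 2], [1, 3, 1]]
measured_tf = [
    [0, 0, 1, 2, 1, 0],  # stimulus appears at frame 2
    [0, 0, 2, 1, 2, 0],  # stimulus appears at frame 2
    [0, 0, 0, 1, 3, 1],  # outlier: appears at frame 3
]
start = estimate_start_time(stimulus_tf, measured_tf, hop_seconds=0.01)
```

Using the mode rather than the mean means a single frequency with a wrong lag, such as the third row here, does not shift the estimate.
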
BINAURAL RENDERING FOR HEADPHONES USING METADATA PROCESSING

Publication number: 20250045010

Abstract: Embodiments are described for a method of rendering audio for playback through headphones comprising receiving digital audio content, receiving binaural rendering metadata generated by an authoring tool processing the received digital audio content, receiving playback metadata generated by a playback device, and combining the binaural rendering metadata and playback metadata to optimize playback of the digital audio content through the headphones.

Type: Application

Filed: August 7, 2024

Publication date: February 6, 2025

Applicant: Dolby Laboratories Licensing Corporation

Inventors: Nicolas R. TSINGOS, Rhonda WILSON, Sunil BHARITKAR, C. Phillip BROWN, Alan J. SEEFELDT, Remi AUDFRAY
SIGNAL NORMALIZATION USING LOUDNESS METADATA FOR AUDIO PROCESSING

Publication number: 20240276143

Abstract: One embodiment provides a method of signal normalization. The method comprises receiving an input content with a corresponding audio signal, and extracting loudness metadata from an audio signal corresponding to the input content. The method further comprises estimating, using a machine learning model, a peak-level amplitude based on the loudness metadata. The peak-level amplitude represents a maximum linear amplitude of the audio signal over an entire duration of the input content. The method further comprises determining a gain based at least on the peak-level amplitude, and applying the gain to the audio signal. The resulting gain-scaled audio signal is provided to one or more speakers coupled to or integrated in an electronic device for audio playback.

Type: Application

Filed: December 28, 2023

Publication date: August 15, 2024

Inventor: Sunil Bharitkar
Binaural rendering for headphones using metadata processing

Patent number: 12061835

Abstract: Embodiments are described for a method of rendering audio for playback through headphones comprising receiving digital audio content, receiving binaural rendering metadata generated by an authoring tool processing the received digital audio content, receiving playback metadata generated by a playback device, and combining the binaural rendering metadata and playback metadata to optimize playback of the digital audio content through the headphones.

Type: Grant

Filed: April 24, 2023

Date of Patent: August 13, 2024

Assignee: Dolby Laboratories Licensing Corporation

Inventors: Nicolas R. Tsingos, Rhonda Wilson, Sunil Bharitkar, C. Phillip Brown, Alan J. Seefeldt, Remi Audfray
DEEP LEARNING BASED VOICE EXTRACTION AND PRIMARY-AMBIENCE DECOMPOSITION FOR STEREO TO SURROUND UPMIXING WITH DIALOG-ENHANCED CENTER CHANNEL

Publication number: 20240267701

Abstract: One embodiment provides a computer-implemented method that includes determining directional sounds from a content mix using a machine learning unmixing model. The directional sounds are panned in an upmixed signal. Signal-dependent upmixing gains for specific frequency bins are computed on a frame-basis using a machine learning model for the upmixed signal. Dedicated voice clarity gains are computed using a hearing impairment model for multiple hearing-impaired profiles for achieving dialog enhancement. The signal dependent upmixing gains and voice clarity gains are transmitted as metadata with a downmixed signal representing the content mix.

Type: Application

Filed: December 28, 2023

Publication date: August 8, 2024

Inventors: Sunil Bharitkar, Ricardo Thaddeus Páez Amaro, Carlos Tejeda Ocampo, Luis Madrid Herrera
VIDEO-DERIVED AUDIO PROCESSING

Publication number: 20240244386

Abstract: One embodiment provides a computer-implemented method that includes creating, during content production, an audio object and metadata associated with the audio object based on a motion vector analysis of an object in one or more image frames in a video. The method can include, during the content production, inserting the audio object and the metadata associated with the audio object into at least one of an audio encoder or a video encoder. The method can include, during content playback, rendering the audio object, without image frame analysis, based on decoding the audio object and parsing the metadata associated with the audio object.

Type: Application

Filed: January 13, 2023

Publication date: July 18, 2024

Inventors: Sunil Bharitkar, SeongNam Oh, Carlos Tejeda Ocampo
SURROUND SOUND TO IMMERSIVE AUDIO UPMIXING BASED ON VIDEO SCENE ANALYSIS

Publication number: 20240196158

Abstract: One embodiment provides a method of audio upmixing comprising performing video scene analysis by segmenting visual objects from video frames of a video, and performing audio analysis by extracting audio signals from an audio corresponding to the video. The method further comprises determining whether any of the audio signals correspond to any of the visual objects, and estimating a video-based trajectory of a visual object if the visual object is in motion and transitions from on-screen to off-screen, or vice versa, during the video. The method further comprises positioning an audio trajectory of an audio signal from at least one speaker associated with the display to at least one other speaker associated with providing surround sound. The audio trajectory is automatically matched with the video. The audio signal is delivered to the at least one speaker and the at least one other speaker for audio reproduction during the presentation.

Type: Application

Filed: September 27, 2023

Publication date: June 13, 2024

Inventors: Allan Devantier, Sunil Bharitkar, Seongnam Oh, Carlos Tejeda Ocampo
DEEP LEARNING FOR MULTIMEDIA CLASSIFICATION

Publication number: 20240126990

Abstract: One embodiment provides a computer-implemented method that includes utilizing text information obtained from a title of a media content item and a trainable model for improving accuracy for classification of the media content item. The trainable model is utilized using a sequence of text to numeric-vector embeddings for classification of the media content item. At least one of a word embedding model parameter or a latent semantic analysis dimension is jointly optimized using the text information, and a classifier model for maximizing accuracy of the classification of the media content item.

Type: Application

Filed: October 3, 2023

Publication date: April 18, 2024

Inventor: Sunil Bharitkar
Perceptual bass extension with loudness management and artificial intelligence (AI)

Patent number: 11950089

Abstract: One embodiment provides a computer-implemented method that includes implementing a customizable compressor for at least one sidechain processing associated with a loudspeaker. Machine learning is applied to automatically tune one or more parameters of the at least one sidechain processing. One or more channels are extracted, including a low-frequency effects (LFE) channel, for nonlinear signal synthesis. A proportional power-sum-based mix-in of an LFE sidechain channel is applied into a non-LFE sidechain. The LFE sidechain channel is maintained within a specified threshold of being level, before and after nonlinear signal synthesis.

Type: Grant

Filed: March 8, 2022

Date of Patent: April 2, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Sunil Bharitkar, William I. Saba
Acoustic echo cancellation

Patent number: 11937076

Abstract: Acoustic echo cancellation for a video conference system is described. A location of a person in a room can be determined. An audio signal received from the location of the person can be captured using beamforming. An acoustic echo cancellation parameter can be determined based in part on the audio signal captured from the location of the person. Acoustic echo cancellation can be performed on the audio signal using the acoustic echo cancellation parameter.

Type: Grant

Filed: July 3, 2019

Date of Patent: March 19, 2024

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: Srikanth Kuthuru, Sunil Bharitkar, Madhu Sudan Athreya
BINAURAL RENDERING FOR HEADPHONES USING METADATA PROCESSING

Publication number: 20230385013

Abstract: Embodiments are described for a method of rendering audio for playback through headphones comprising receiving digital audio content, receiving binaural rendering metadata generated by an authoring tool processing the received digital audio content, receiving playback metadata generated by a playback device, and combining the binaural rendering metadata and playback metadata to optimize playback of the digital audio content through the headphones.

Type: Application

Filed: April 24, 2023

Publication date: November 30, 2023

Applicant: Dolby Laboratories Licensing Corporation

Inventors: Nicolas R. TSINGOS, Rhonda WILSON, Sunil BHARITKAR, C. Phillip BROWN, Alan J. SEEFELDT, Remi AUDFRAY
