Patents by Inventor Dipanjan Sen

Dipanjan Sen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11240623
    Abstract: One or more processors may obtain a first distance between a first audio zone of the two or more audio zones associated with the one or more interest points within the first audio zone, and a first device position of a device, obtain a second distance between a second audio zone of the two or more audio zones associated with the one or more interest points within the second audio zone, and the first device position of the device, and obtain an updated first distance and updated second distance after movement of the device has changed from the first device position to a second device position. The one or more processor(s) may independently control the first audio zone and the second audio zone, such that the audio data within the first audio zone and the second audio zone are adjusted based on the updated first distance and updated second distance.
    Type: Grant
    Filed: August 8, 2018
    Date of Patent: February 1, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Nils Gunther Peters, S M Akramus Salehin, Shankar Thagadur Shivappa, Moo Young Kim, Dipanjan Sen
  • Publication number: 20220028401
    Abstract: A device configured to decode a bitstream, where the device includes a memory configured to store a temporally encoded representation of spatial audio signals. The device is also configured to receive the bitstream that includes an indication of a spatial transformation, and includes a temporal decoding unit, coupled to the memory, configured to decode one or more spatial audio signals represented in a spatial domain, where the one or more spatial audio signals are associated with different angles in the spatial domain. In addition, the device includes an inverse spatial transformation unit, coupled to the temporal decoding unit, that is configured to convert the one or more spatial audio signals represented in the spatial domain into at least three ambisonic coefficients that, in part, represent a soundfield in an ambisonics domain, and perform a spatial transformation of the soundfield based on the indication of the spatial transformation received in the bitstream.
    Type: Application
    Filed: October 4, 2021
    Publication date: January 27, 2022
    Inventors: Nils Gunther Peters, Moo Young Kim, Dipanjan Sen
  • Publication number: 20220030372
    Abstract: In general, disclosed is a device that includes one or more processors, coupled to the memory, configured to perform an energy analysis with respect to one or more audio objects, in the ambisonics domain, in the first time segment. The one or more processors are also configured to perform a similarity measure between the one or more audio objects, in the ambisonics domain, in the first time segment, and the one or more audio objects, in the ambisonics domain, in the second time segment. In addition, the one or more processors are configured to perform a reorder of the one or more audio objects, in the ambisonics domain, in the first time segment with the one or more audio objects, in the ambisonics domain, in the second time segment, to generate one or more reordered audio objects in the first time segment.
    Type: Application
    Filed: October 11, 2021
    Publication date: January 27, 2022
    Inventors: Dipanjan Sen, Sang-Uk Ryu
  • Patent number: 11184731
    Abstract: In general, techniques are described for rendering metadata to control user movement based audio rendering. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may be configured to store audio data representative of a soundfield. The one or more processors may be coupled to the memory, and configured to obtain rendering metadata indicative of controls for enabling or disabling adaptations, based on an indication of a movement of a user of the device, of a renderer used to render audio data representative of a soundfield, specify, in a bitstream representative of the audio data, the rendering metadata, and output the bitstream.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: November 23, 2021
    Inventors: Nils Günther Peters, Moo Young Kim, S M Akramus Salehin, Siddhartha Goutham Swaminathan, Isaac Garcia Munoz, Dipanjan Sen
  • Patent number: 11164606
    Abstract: An example device includes a memory device, and a processor coupled to the memory device. The memory is configured to store audio spatial metadata associated with a soundfield and video data. The processor is configured to identify one or more foreground audio objects of the soundfield using the audio spatial metadata stored to the memory device, and to select, based on the identified one or more foreground audio objects, one or more viewports associated with the video data. Display hardware coupled to the processor and the memory device is configured to output a portion of the video data being associated with the one or more viewports selected by the processor.
    Type: Grant
    Filed: August 8, 2017
    Date of Patent: November 2, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Nils Günther Peters, Shankar Thagadur Shivappa, Dipanjan Sen
  • Patent number: 11146903
    Abstract: In general, techniques are described for compressing decomposed representations of a sound field. A device comprising one or more processors may be configured to perform the techniques. The one or more processors may be configured to obtain a bitstream comprising a compressed version of a spatial component of a sound field, the spatial component generated by performing a vector based synthesis with respect to a plurality of spherical harmonic coefficients.
    Type: Grant
    Filed: May 28, 2014
    Date of Patent: October 12, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Dipanjan Sen, Sang-Uk Ryu
  • Patent number: 11138983
    Abstract: In general, techniques are described for signaling layers for scalable coding of higher order ambisonic audio data. A device comprising a memory and a processor may be configured to perform the techniques. The memory may be configured to store the bitstream. The processor may be configured to obtain, from the bitstream, an indication of a number of layers specified in the bitstream, and obtain the layers of the bitstream based on the indication of the number of layers.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: October 5, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Moo Young Kim, Nils Günther Peters, Dipanjan Sen
  • Patent number: 11128976
    Abstract: In general, techniques are described for modeling occlusions when rendering audio data. A device comprising a memory and one or more processors may perform the techniques. The memory may store audio data representative of a soundfield. The one or more processors may obtain occlusion metadata representative of an occlusion within the soundfield in terms of propagation of sound through the occlusion, the occlusion separating the soundfield into two or more sound spaces. The one or more processors may obtain a location of the device, and obtain, based on the occlusion metadata and the location, a renderer by which to render the audio data into one or more speaker feeds that account for propagation of the sound in one of the two or more sound spaces in which the device resides. The one or more processors may apply the renderer to the audio data to generate the speaker feeds.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: September 21, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Isaac Garcia Munoz, Siddhartha Goutham Swaminathan, S M Akramus Salehin, Moo Young Kim, Nils Günther Peters, Dipanjan Sen
  • Publication number: 20210281967
    Abstract: A device and method for backward compatibility for virtual reality (VR), mixed reality (MR), augmented reality (AR), computer vision, and graphics systems. The device and method enable rendering audio data with more degrees of freedom on devices that support fewer degrees of freedom. The device includes memory configured to store audio data representative of a soundfield captured at a plurality of capture locations, metadata that enables the audio data to be rendered to support N degrees of freedom, and adaptation metadata that enables the audio data to be rendered to support M degrees of freedom. The device also includes one or more processors coupled to the memory, and configured to adapt, based on the adaptation metadata, the audio data to provide the M degrees of freedom, and generate speaker feeds based on the adapted audio data.
    Type: Application
    Filed: May 24, 2021
    Publication date: September 9, 2021
    Inventors: Moo Young Kim, Nils Günther Peters, S M Akramus Salehin, Siddhartha Goutham Swaminathan, Dipanjan Sen
  • Publication number: 20210264927
    Abstract: An example audio decoding device includes a memory configured to store at least a portion of a coded audio bitstream; and one or more processors configured to: decode, based on the coded audio bitstream, a representation of a soundfield; decode, based on the coded audio bitstream, a syntax element indicating a selection of either a head-related transfer function (HRTF) or a binaural room impulse response (BRIR); and render, using the selected HRTF or BRIR, speaker feeds from the soundfield.
    Type: Application
    Filed: February 19, 2021
    Publication date: August 26, 2021
    Inventors: Moo Young Kim, Nils Günther Peters, Dipanjan Sen, Siddhartha Goutham Swaminathan, S M Akramus Salehin, Jason Filos
  • Patent number: 11089428
    Abstract: In general, various aspects of the techniques are described for selecting audio streams based on motion. A device comprising a processor and a memory may be configured to perform the techniques. The processor may be configured to obtain a current location of the device, and obtain capture locations. Each of the capture locations may identify a location at which a respective one of audio streams is captured. The processor may also be configured to select, based on the current location and the capture locations, a subset of the audio streams, where the subset of the audio streams has fewer audio streams than the audio streams. The processor may further be configured to reproduce, based on the subset of the audio streams, a soundfield. The memory may be configured to store the subset of the plurality of audio streams.
    Type: Grant
    Filed: December 13, 2019
    Date of Patent: August 10, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: S M Akramus Salehin, Siddhartha Goutham Swaminathan, Dipanjan Sen
  • Patent number: 11081116
    Abstract: In general, techniques are described by which to embed enhanced audio transports in backward compatible bitstreams. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may store the backward compatible bitstream, which conforms to a legacy transport format. The processor(s) may obtain, from the backward compatible bitstream, legacy audio data that conforms to a legacy audio format, and obtain, from the backward compatible bitstream, extended audio data that enhances the legacy audio data. The processor(s) may also obtain, based on the legacy audio data and the extended audio data, enhanced audio data that conforms to an enhanced audio format, and output the enhanced audio data to one or more speakers.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: August 3, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Shankar Thagadur Shivappa, Richard Paul Walters, Dipanjan Sen, Nils Günther Peters, Moo Young Kim
  • Patent number: 11062713
    Abstract: In general, techniques are described by which to specify spatially formatted enhanced audio data for backward compatible audio bitstreams. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may store the backward compatible bitstream that conforms to a legacy transport format. The processor(s) may obtain, from the backward compatible bitstream, legacy audio data that conforms to a legacy audio format and a spatially formatted extended audio stream. The processor(s) may process the spatially formatted extended audio stream to obtain extended audio data that enhances the legacy audio data. The processor(s) may next obtain, based on the legacy audio data and the extended audio data, enhanced audio data that conforms to an enhanced audio format. The processor(s) may output the enhanced audio data to one or more speakers.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: July 13, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Nils Günther Peters, Ferdinando Olivieri, Moo Young Kim, Dipanjan Sen, Shankar Thagadur Shivappa
  • Publication number: 20210185470
    Abstract: In general, various aspects of the techniques are described for selecting audio streams based on motion. A device comprising a processor and a memory may be configured to perform the techniques. The processor may be configured to obtain a current location of the device, and obtain capture locations. Each of the capture locations may identify a location at which a respective one of audio streams is captured. The processor may also be configured to select, based on the current location and the capture locations, a subset of the audio streams, where the subset of the audio streams has fewer audio streams than the audio streams. The processor may further be configured to reproduce, based on the subset of the audio streams, a soundfield. The memory may be configured to store the subset of the plurality of audio streams.
    Type: Application
    Filed: December 13, 2019
    Publication date: June 17, 2021
    Inventors: S M Akramus Salehin, Siddhartha Goutham Swaminathan, Dipanjan Sen
  • Patent number: 11026019
    Abstract: A device to apply noise reduction to ambisonic signals includes a memory configured to store noise data corresponding to microphones in a microphone array. A processor is configured to perform signal processing operations on signals captured by microphones in the microphone array to generate multiple sets of ambisonic signals including a first set corresponding to a first particular ambisonic order and a second set corresponding to a second particular ambisonic order. The processor is configured to perform a first noise reduction operation that includes applying a first gain factor to each ambisonic signal in the first set and to perform a second noise reduction operation that includes applying a second gain factor to each ambisonic signal in the second set. The first gain factor and the second gain factor are based on the noise data, and the second gain factor is distinct from the first gain factor.
    Type: Grant
    Filed: March 13, 2019
    Date of Patent: June 1, 2021
    Assignee: Qualcomm Incorporated
    Inventors: S M Akramus Salehin, Dipanjan Sen
  • Patent number: 11019449
    Abstract: A device and method for backward compatibility for virtual reality (VR), mixed reality (MR), augmented reality (AR), computer vision, and graphics systems. The device and method enable rendering audio data with more degrees of freedom on devices that support fewer degrees of freedom. The device includes memory configured to store audio data representative of a soundfield captured at a plurality of capture locations, metadata that enables the audio data to be rendered to support N degrees of freedom, and adaptation metadata that enables the audio data to be rendered to support M degrees of freedom. The device also includes one or more processors coupled to the memory, and configured to adapt, based on the adaptation metadata, the audio data to provide the M degrees of freedom, and generate speaker feeds based on the adapted audio data.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: May 25, 2021
    Assignee: QUALCOMM Incorporated
    Inventors: Moo Young Kim, Nils Günther Peters, S M Akramus Salehin, Siddhartha Goutham Swaminathan, Dipanjan Sen
  • Patent number: 10999693
    Abstract: In general, techniques are described by which to render different portions of audio data using different renderers. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may store audio renderers. The processor(s) may obtain a first audio renderer of the plurality of audio renderers, and apply the first audio renderer with respect to a first portion of the audio data to obtain one or more first speaker feeds. The processor(s) may next obtain a second audio renderer of the plurality of audio renderers, and apply the second audio renderer with respect to a second portion of the audio data to obtain one or more second speaker feeds. The processor(s) may output, to one or more speakers, the one or more first speaker feeds and the one or more second speaker feeds.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: May 4, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Moo Young Kim, Ferdinando Olivieri, Dipanjan Sen
  • Patent number: 10986456
    Abstract: In general, techniques are described by which to perform spatial relation coding using virtual higher order ambisonic coefficients. A device comprising a memory and a processor may perform the techniques. The memory may be configured to store audio data, the audio data representative of zero-ordered higher order ambisonic (HOA) coefficient, and one or more greater-than-zero-ordered HOA coefficients. The processor may be configured to obtain, based on the one or more greater-than-zero-ordered HOA coefficients, a virtual zero-ordered HOA coefficient. The processor may also be configured to obtain, based on the virtual HOA coefficient, one or more parameters from which to synthesize the one or more greater-than-zero-ordered HOA coefficients. The processor may further be configured to generate a bitstream that includes a first indication representative of the zero-ordered HOA coefficients, and a second indication representative of the one or more parameters.
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: April 20, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Jeongook Song, Dipanjan Sen
  • Patent number: 10972853
    Abstract: A device for processing coded audio is disclosed. The device is configured to store an audio object and audio object metadata associated with the audio object. The audio object metadata includes frequency dependent beam pattern metadata. The device may apply, based on the frequency dependent beam pattern metadata, a renderer to the audio object to obtain one or more speaker feeds and output the one or more speaker feeds.
    Type: Grant
    Filed: December 18, 2019
    Date of Patent: April 6, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Moo Young Kim, Nils Günther Peters, S M Akramus Salehin, Dipanjan Sen
  • Patent number: 10972851
    Abstract: In general, techniques are described by which to perform spatial relation coding of higher order ambisonic coefficients using expanded parameters. A device comprising a memory and a processor may perform the techniques. The memory may be configured to store at least a portion of a bitstream, the bitstream including a first indication representative of an HOA coefficient associated with the spherical basis function having an order of zero, and a second indication representative of one or more parameters. The processor may be configured to perform parameter expansion with respect to the one or more parameters to obtain one or more expanded parameters, and synthesize, based on the one or more expanded parameters and the HOA coefficient associated with the spherical basis function having the order of zero, one or more HOA coefficients associated with one or more spherical basis functions having an order greater than zero.
    Type: Grant
    Filed: October 4, 2018
    Date of Patent: April 6, 2021
    Assignee: Qualcomm Incorporated
    Inventors: Jeongook Song, Dipanjan Sen
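
Illustrative note on patent 11089428 (selecting audio streams based on motion): the abstract describes choosing a subset of audio streams from the device's current location and the streams' capture locations, but it does not say how the subset is chosen. The sketch below assumes a simple nearest-neighbor rule; the function name `select_streams`, the parameter `k`, and the use of Euclidean distance are hypothetical choices made for illustration and are not taken from the patent.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) position; units are arbitrary


def select_streams(current: Point,
                   capture_locations: Sequence[Point],
                   k: int) -> List[int]:
    """Return the indices of the k capture locations nearest to `current`.

    Assumption: "nearest k" stands in for the unspecified selection rule.
    The subset must be strictly smaller than the full set of streams,
    which mirrors the abstract's wording.
    """
    if not 0 < k < len(capture_locations):
        raise ValueError("k must be positive and smaller than the number of streams")
    # Rank stream indices by Euclidean distance from the device position.
    ranked = sorted(range(len(capture_locations)),
                    key=lambda i: math.dist(current, capture_locations[i]))
    # Return the chosen indices in their original stream order.
    return sorted(ranked[:k])


if __name__ == "__main__":
    mics = [(5.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (9.0, 9.0, 9.0)]
    print(select_streams((0.0, 0.0, 0.0), mics, k=2))
```

Calling the function again after the device moves yields a new subset, which is one plausible reading of how selection could follow motion; a real system would likely also smooth transitions between subsets.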