Patents by Inventor Dipanjan Sen

Dipanjan Sen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11967329
    Abstract: An example audio decoding device includes a memory configured to store at least a portion of a coded audio bitstream; and one or more processors configured to: decode, based on the coded audio bitstream, a representation of a soundfield; decode, based on the coded audio bitstream, a syntax element indicating a selection of either a head-related transfer function (HRTF) or a binaural room impulse response (BRIR); and render, using the selected HRTF or BRIR, speaker feeds from the soundfield.
    Type: Grant
    Filed: February 19, 2021
    Date of Patent: April 23, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Moo Young Kim, Nils Günther Peters, Dipanjan Sen, Siddhartha Goutham Swaminathan, S M Akramus Salehin, Jason Filos
  • Patent number: 11962990
    Abstract: In general, disclosed is a device that includes one or more processors, coupled to the memory, configured to perform an energy analysis with respect to one or more audio objects, in the ambisonics domain, in the first time segment. The one or more processors are also configured to perform a similarity measure between the one or more audio objects, in the ambisonics domain, in the first time segment, and the one or more audio objects, in the ambisonics domain, in the second time segment. In addition, the one or more processors are configured to perform a reorder of the one or more audio objects, in the ambisonics domain, in the first time segment with the one or more audio objects, in the ambisonics domain, in the second time segment, to generate one or more reordered audio objects in the first time segment.
    Type: Grant
    Filed: October 11, 2021
    Date of Patent: April 16, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Dipanjan Sen, Sang-Uk Ryu
  • Publication number: 20240114313
    Abstract: A method that includes receiving a first bitstream that includes an encoded version of an audio signal for a three-dimensional (3D) scene and a first set of metadata that has 1) a position of a 3D sub-scene within the scene and 2) a position of a sound source associated with the audio signal within the sub-scene; determining a position of a listener; spatially rendering the scene to produce the sound source with the audio signal at the position of the sound source with respect to the position of the listener; receiving a second bitstream that includes a second set of metadata that has a different position of the sub-scene; and adjusting the spatial rendering of the scene such that the position of the sound source changes to correspond to movement of the sub-scene from the position of the sub-scene to the different position of the sub-scene.
    Type: Application
    Filed: September 21, 2023
    Publication date: April 4, 2024
    Inventors: Frank Baumgarte, Dipanjan Sen
  • Publication number: 20240114310
    Abstract: A method that includes receiving a bitstream that comprises: an encoded version of an audio signal that is associated with a sound source that is within a first 3D scene, a scene tree structure that includes an origin of the first scene relative to an origin of a second scene, and a position of the sound source within the first scene relative to the origin of the first scene, wherein the position references the origin of the first scene using an identifier, wherein the scene tree structure defines an initial configuration of the sound source with respect to the first and second scenes; determining a position of a listener; producing a set of spatially rendered audio signals by spatially rendering the audio signal according to the position of the sound source with respect to the position of the listener; and using the spatially rendered audio signals to drive speakers.
    Type: Application
    Filed: September 21, 2023
    Publication date: April 4, 2024
    Inventors: Frank Baumgarte, Dipanjan Sen
  • Publication number: 20240105196
    Abstract: A method that includes receiving an audio component associated with an audio scene, the audio component including an audio signal, determining a loudness level of the audio component based on the audio signal, receiving a target loudness level for the audio component, producing a bitstream with the audio component by encoding the audio signal and including metadata that has the loudness level and the target loudness level, and transmitting the bitstream to an electronic device.
    Type: Application
    Filed: September 20, 2023
    Publication date: March 28, 2024
    Inventors: Frank Baumgarte, Dipanjan Sen
  • Publication number: 20240105195
    Abstract: A method that includes receiving a bitstream that includes: a first signal of a first audio component associated with an audio scene, a first target loudness, and a first source loudness determined by an encoder side based on the first signal, and a second signal of a second audio component associated with the scene, a second target loudness, and a second source loudness determined by the encoder side based on the second signal; determining a first gain based on the first source and target loudness; determining a second gain based on the second source and target loudness; producing a first gain-adjusted signal by applying the first gain to the first signal; producing a second gain-adjusted signal by applying the second gain to the second signal; and producing the scene that includes the first and second audio components by combining the gain-adjusted audio signals into a group of signals.
    Type: Application
    Filed: September 20, 2023
    Publication date: March 28, 2024
    Inventors: Frank Baumgarte, Dipanjan Sen
  • Publication number: 20240098444
    Abstract: In one aspect, a computer-implemented method includes obtaining object audio and metadata that spatially describes the object audio, converting the object audio to Ambisonics audio based on the metadata, encoding, in a first bit stream, the Ambisonics audio, and encoding, in a second bit stream, at least a subset of the metadata.
    Type: Application
    Filed: August 23, 2023
    Publication date: March 21, 2024
    Inventors: Sina Zamani, Moo Young Kim, Dipanjan Sen, Sang Uk Ryu, Juha O. Merimaa, Symeon Delikaris Manias
  • Publication number: 20240096335
    Abstract: In one aspect, a computer-implemented method includes obtaining object audio and metadata that spatially describes the object audio, converting the object audio to time-frequency domain Ambisonics audio based on the metadata, and encoding the time-frequency domain Ambisonics audio and a subset of the metadata as one or more bitstreams to be stored in computer-readable memory or transmitted to a remote device.
    Type: Application
    Filed: August 23, 2023
    Publication date: March 21, 2024
    Inventors: Sina Zamani, Moo Young Kim, Dipanjan Sen, Sang Uk Ryu, Juha O. Merimaa, Symeon Delikaris Manias
  • Patent number: 11841899
    Abstract: A device with microphones can generate microphone signals during an audio recording. The device can store, in an electronic audio data file, the microphone signals, and metadata that includes impulse responses of the microphones. Other aspects are described and claimed.
    Type: Grant
    Filed: June 11, 2020
    Date of Patent: December 12, 2023
    Assignee: Apple Inc.
    Inventors: Jonathan D. Sheaffer, Symeon Delikaris Manias, Gaetan R. Lorho, Peter A. Raffensperger, Eric A. Allamanche, Frank Baumgarte, Dipanjan Sen, Joshua D. Atkins, Juha O. Merimaa
  • Patent number: 11843932
    Abstract: A device and method for backward compatibility for virtual reality (VR), mixed reality (MR), augmented reality (AR), computer vision, and graphics systems. The device and method enable rendering audio data with more degrees of freedom on devices that support fewer degrees of freedom. The device includes memory configured to store audio data representative of a soundfield captured at a plurality of capture locations, metadata that enables the audio data to be rendered to support N degrees of freedom, and adaptation metadata that enables the audio data to be rendered to support M degrees of freedom. The device also includes one or more processors coupled to the memory, and configured to adapt, based on the adaptation metadata, the audio data to provide the M degrees of freedom, and generate speaker feeds based on the adapted audio data.
    Type: Grant
    Filed: May 24, 2021
    Date of Patent: December 12, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Moo Young Kim, Nils Günther Peters, S M Akramus Salehin, Siddhartha Goutham Swaminathan, Dipanjan Sen
  • Publication number: 20230396921
    Abstract: A multi-radius spherical microphone that includes an inner body defining an inner sphere having an inner radius from a center; a plurality of inner microphones coupled to the inner spherical body and defining an array of inner microphones; an outer body defining a dodecahedron, wherein the inner body and the outer body are concentric about the center; and a plurality of outer microphones coupled to the outer body at respective vertices of the dodecahedron and defining an array of outer microphones, wherein each of the plurality of outer microphones is positioned radially equidistant from the center.
    Type: Application
    Filed: May 22, 2023
    Publication date: December 7, 2023
    Applicant: APPLE INC.
    Inventors: Abhaya Parthy, Dipanjan Sen, Bonnie W. Tom, Jonathan D. Sheaffer, Justin D. Crosby, Symeon Delikaris Manias, Emily A. Wigley
  • Publication number: 20230360655
    Abstract: Encoding and decoding of higher order ambisonics, HOA, data for purposes of bitrate reduction. One aspect uses principal components analysis to produce spatial descriptors. Other aspects include various spatial descriptor quantization techniques.
    Type: Application
    Filed: August 13, 2021
    Publication date: November 9, 2023
    Inventors: Moo Young KIM, Sina ZAMANI, Dipanjan SEN
  • Publication number: 20230360660
    Abstract: Disclosed are methods and systems for decoding immersive audio content encoded by an adaptive number of scene elements for channels, audio objects, higher-order ambisonics (HOA), and/or other sound field representations. The decoded audio is rendered to the speaker configuration of a playback device. For bit streams that represent audio scenes with a different mixture of channels, objects, and/or HOA in consecutive frames, fade-in of the new frame and fade-out of the old frame may be performed. Crossfading between consecutive frames happens in the speaker layout after rendering, in the spatially decoded content type before rendering, or between the transport channels as the output of the baseline decoder but before spatial decoding and rendering. Crossfading may use an immediate fade-in and fade-out frame (IFFF) for the transition frame or may use an overlap-add synthesis technique such as time-domain aliasing cancellation (TDAC) of MDCT.
    Type: Application
    Filed: September 10, 2021
    Publication date: November 9, 2023
    Inventors: Moo Young KIM, Dipanjan SEN, Eric ALLAMANCHE, J. Kevin Calhoun, Frank BAUMGARTE, Sina ZAMANI, Eric DAY
  • Publication number: 20230360661
    Abstract: Disclosed is a hierarchical spatial resolution codec that adaptively adjusts the representations of immersive audio content as the target bandwidth for delivering the audio content changes. The audio content may be represented by an adaptive number of content types such as channels/objects, higher-order ambisonics (HOA), and encoded by adaptive spatial coding techniques to support the target bitrate of a transmission channel or user. Adaptive spatial coding techniques may include adaptive channel/object spatial encoding techniques to generate an adaptive number of channels/objects, and adaptive HOA spatial encoding or HOA compression techniques to generate an adaptive order of the HOA. The adaptation may be a function of the target bitrate that is associated with a desired quality, and an analysis that determines the priority of the channels, objects, and HOA. High priority channels/objects may be encoded into a high quality bit-stream while low priority channels/objects may be converted and encoded as HOA.
    Type: Application
    Filed: August 31, 2021
    Publication date: November 9, 2023
    Inventors: Dipanjan SEN, Moo Young KIM, Frank BAUMGARTE, Sina ZAMANI, Aram LINDAHL
  • Publication number: 20230283977
    Abstract: A data structure stored in memory includes a scene description that defines a hierarchy of scene components that are in digital audio content received from a producer. The hierarchy has several stages including a fourth stage in which a scene composition is defined that contains all scene components needed to render the digital audio content in a single presentation, for instance as intended by the producer, and for input to a spatial audio renderer, wherein the scene composition contains one or more composition selection groups. Other aspects are also described and claimed.
    Type: Application
    Filed: February 14, 2023
    Publication date: September 7, 2023
    Inventors: Frank BAUMGARTE, Moo Young KIM, Dipanjan SEN, Sang Uk RYU
  • Patent number: 11664035
    Abstract: A device configured to decode a bitstream, where the device includes a memory configured to store a temporally encoded representation of spatial audio signals. The device is also configured to receive the bitstream that includes an indication of a spatial transformation, and includes a temporal decoding unit, coupled to the memory, configured to decode one or more spatial audio signals represented in a spatial domain, where the one or more spatial audio signals are associated with different angles in the spatial domain. In addition, the device includes an inverse spatial transformation unit, coupled to the temporal decoding unit, that is configured to convert the one or more spatial audio signals represented in the spatial domain into at least three ambisonic coefficients that, in part, represent a soundfield in an ambisonics domain, and perform a spatial transformation of the soundfield based on the indication of the spatial transformation received in the bitstream.
    Type: Grant
    Filed: October 4, 2021
    Date of Patent: May 30, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Nils Günther Peters, Moo Young Kim, Dipanjan Sen
  • Publication number: 20230104111
    Abstract: One or more acoustic parameters of a current acoustic environment of a user may be determined based on sensor signals captured by one or more sensors of the device. One or more preset acoustic parameters may be determined based on the one or more acoustic parameters of the current acoustic environment of the user and an acoustic environment of an audio file comprising audio signals that is determined based on the audio signals of the audio file or metadata of the audio file. The audio signals may be spatially rendered by applying spatial filters that include the one or more preset acoustic parameters to the audio signals, resulting in binaural audio signals. The binaural audio signals may be used to drive speakers of a headset. Other aspects are described and claimed.
    Type: Application
    Filed: August 19, 2022
    Publication date: April 6, 2023
    Inventors: Prateek Murgai, John E. Arthur, Joshua D. Atkins, Juha O. Merimaa, Dipanjan Sen, Brandon J. Rice, Alexander Singh Alvarado, Jonathan D. Sheaffer, Benjamin Bernard, David E. Romblom
  • Patent number: 11430451
    Abstract: A first layer of data having a first set of Ambisonic audio components can be decoded where the first set of Ambisonic audio components is generated based on ambience and one or more object-based audio signals. A second layer of data is decoded having at least one of the one or more object-based audio signals. One of the object-based audio signals is subtracted from the first set of Ambisonic audio components. The resulting Ambisonic audio components are rendered to generate a first set of audio channels. The one or more object-based audio signals are spatially rendered to generate a second set of audio channels. Other aspects are described and claimed.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: August 30, 2022
    Assignee: APPLE INC.
    Inventors: Dipanjan Sen, Frank Baumgarte, Juha O. Merimaa
  • Publication number: 20220262373
    Abstract: A first layer of data having a first set of Ambisonic audio components can be decoded where the first set of Ambisonic audio components is generated based on ambience and one or more object-based audio signals. A second layer of data is decoded having at least one of the one or more object-based audio signals. One of the object-based audio signals is subtracted from the first set of Ambisonic audio components. The resulting Ambisonic audio components are rendered to generate a first set of audio channels. The one or more object-based audio signals are spatially rendered to generate a second set of audio channels. Other aspects are described and claimed.
    Type: Application
    Filed: May 9, 2022
    Publication date: August 18, 2022
    Inventors: Dipanjan Sen, Frank Baumgarte, Juha O. Merimaa
  • Patent number: 11270711
    Abstract: In general, techniques are described by which to provide priority information for higher order ambisonic (HOA) audio data. A device comprising a memory and a processor may perform the techniques. The memory stores HOA coefficients of the HOA audio data, the HOA coefficients representative of a soundfield. The processor may decompose the HOA coefficients into a sound component and a corresponding spatial component, the corresponding spatial component defining shape, width, and directions of the sound component, and the corresponding spatial component defined in a spherical harmonic domain. The processor may also determine, based on one or more of the sound component and the corresponding spatial component, priority information indicative of a priority of the sound component relative to other sound components of the soundfield, and specify, in a data object representative of a compressed version of the HOA audio data, the sound component and the priority information.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: March 8, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Moo Young Kim, Nils Günther Peters, Shankar Thagadur Shivappa, Dipanjan Sen
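
Illustrative note on publication 20240105195: the abstract describes determining a gain for each audio component from its source loudness and target loudness, applying that gain, and combining the gain-adjusted signals into a group. The Python sketch below is one possible reading of those steps, not the method claimed in the application. It assumes that loudness values are expressed in decibels and that the gain is simply the target minus the source loudness; the abstract states neither. The names `gain_from_loudness` and `adjust_components` are hypothetical.

```python
from typing import List, Sequence, Tuple


def gain_from_loudness(source_db: float, target_db: float) -> float:
    """Linear gain that moves a signal from source_db to target_db.

    Assumption: both loudness values are in decibels and the gain is
    their difference. The abstract only says the gain is "based on"
    the source and target loudness.
    """
    return 10.0 ** ((target_db - source_db) / 20.0)


def adjust_components(
    components: Sequence[Tuple[Sequence[float], float, float]]
) -> List[List[float]]:
    """Apply a per-component gain and return the group of adjusted signals.

    Each component is (samples, source_loudness_db, target_loudness_db).
    """
    group = []
    for samples, source_db, target_db in components:
        gain = gain_from_loudness(source_db, target_db)
        group.append([gain * x for x in samples])
    return group


if __name__ == "__main__":
    # Two hypothetical components with made-up sample values.
    scene = adjust_components([
        ([0.1, -0.2, 0.3], -26.0, -20.0),  # raised by 6 dB
        ([0.5, 0.5, -0.5], -14.0, -20.0),  # lowered by 6 dB
    ])
    for signal in scene:
        print(signal)
```

The signals are kept separate rather than summed, since the abstract speaks of combining them "into a group of signals" and leaves any mixing to a later rendering stage.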