Patents by Inventor Dipanjan Sen
Dipanjan Sen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11967329
Abstract: An example audio decoding device includes a memory configured to store at least a portion of a coded audio bitstream; and one or more processors configured to: decode, based on the coded audio bitstream, a representation of a soundfield; decode, based on the coded audio bitstream, a syntax element indicating a selection of either a head-related transfer function (HRTF) or a binaural room impulse response (BRIR); and render, using the selected HRTF or BRIR, speaker feeds from the soundfield.
Type: Grant
Filed: February 19, 2021
Date of Patent: April 23, 2024
Assignee: QUALCOMM Incorporated
Inventors: Moo Young Kim, Nils Günther Peters, Dipanjan Sen, Siddhartha Goutham Swaminathan, S M Akramus Salehin, Jason Filos
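A minimal sketch of the selection mechanism the abstract describes: a decoded flag chooses between an HRTF and a BRIR, and the chosen impulse response is convolved with the signal. The field names, filter taps, and one-bit layout here are illustrative assumptions, not the actual codec syntax.

```python
# Illustrative sketch only: field names and filter taps are hypothetical,
# not the codec's actual bitstream syntax.
from dataclasses import dataclass

@dataclass
class BinauralConfig:
    use_brir: bool   # decoded syntax element: False = HRTF, True = BRIR
    hrtf: list       # head-related transfer function taps (placeholder)
    brir: list       # binaural room impulse response taps (placeholder)

def select_renderer(cfg: BinauralConfig):
    """Return the impulse response to convolve with, per the decoded flag."""
    return cfg.brir if cfg.use_brir else cfg.hrtf

def convolve(signal, ir):
    """Naive FIR convolution used to produce a speaker/headphone feed."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out

cfg = BinauralConfig(use_brir=False, hrtf=[1.0, 0.5], brir=[1.0, 0.6, 0.3, 0.1])
feed = convolve([1.0, 0.0, 0.0], select_renderer(cfg))
```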
-
Patent number: 11962990
Abstract: In general, disclosed is a device that includes one or more processors, coupled to a memory, configured to perform an energy analysis with respect to one or more audio objects, in the ambisonics domain, in the first time segment. The one or more processors are also configured to perform a similarity measure between the one or more audio objects, in the ambisonics domain, in the first time segment, and the one or more audio objects, in the ambisonics domain, in the second time segment. In addition, the one or more processors are configured to perform a reorder of the one or more audio objects, in the ambisonics domain, in the first time segment with the one or more audio objects, in the ambisonics domain, in the second time segment, to generate one or more reordered audio objects in the first time segment.
Type: Grant
Filed: October 11, 2021
Date of Patent: April 16, 2024
Assignee: QUALCOMM Incorporated
Inventors: Dipanjan Sen, Sang-Uk Ryu
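As a hypothetical sketch of the energy-analysis/similarity/reorder pipeline (not the patented algorithm's exact math), one can compute per-object energy, score cross-segment pairs with a cosine similarity, and greedily reorder the current segment's objects to match the previous segment's:

```python
# Hypothetical sketch: objects are represented as flat coefficient vectors;
# the actual decomposition and matching in the patent may differ.
import math

def energy(obj):
    return sum(x * x for x in obj)

def similarity(a, b):
    """Cosine similarity between two ambisonics-domain object vectors."""
    na, nb = math.sqrt(energy(a)), math.sqrt(energy(b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def reorder(prev_objs, cur_objs):
    """Greedily pair each previous-segment object with its most similar
    not-yet-taken current-segment object, and return them in that order."""
    taken, order = set(), []
    for p in prev_objs:
        best = max((j for j in range(len(cur_objs)) if j not in taken),
                   key=lambda j: similarity(p, cur_objs[j]))
        taken.add(best)
        order.append(best)
    return [cur_objs[j] for j in order]
```

Reordering this way keeps each object slot spectrally continuous across segments, which helps the downstream coder.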
-
Publication number: 20240114313
Abstract: A method that includes receiving a first bitstream that includes an encoded version of an audio signal for a three-dimensional (3D) scene and a first set of metadata that has 1) a position of a 3D sub-scene within the scene and 2) a position of a sound source associated with the audio signal within the sub-scene; determining a position of a listener; spatially rendering the scene to produce the sound source with the audio signal at the position of the sound source with respect to the position of the listener; receiving a second bitstream that includes a second set of metadata that has a different position of the sub-scene; and adjusting the spatial rendering of the scene such that the position of the sound source changes to correspond to movement of the sub-scene from the position of the sub-scene to the different position of the sub-scene.
Type: Application
Filed: September 21, 2023
Publication date: April 4, 2024
Inventors: Frank Baumgarte, Dipanjan Sen
-
Publication number: 20240114310
Abstract: A method that includes receiving a bitstream that comprises: an encoded version of an audio signal that is associated with a sound source that is within a first 3D scene, a scene tree structure that includes an origin of the first scene relative to an origin of a second scene, and a position of the sound source within the first scene relative to the origin of the first scene, wherein the position references the origin of the first scene using an identifier, wherein the scene tree structure defines an initial configuration of the sound source with respect to the first and second scenes; determining a position of a listener; producing a set of spatially rendered audio signals by spatially rendering the audio signal according to the position of the sound source with respect to the position of the listener; and using the spatially rendered audio signals to drive speakers.
Type: Application
Filed: September 21, 2023
Publication date: April 4, 2024
Inventors: Frank Baumgarte, Dipanjan Sen
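The scene-tree idea can be sketched as follows: each scene stores its origin relative to a parent scene, and a source's absolute position is its scene-relative position plus the accumulated origins up the tree. The dictionary layout and identifiers below are assumptions for illustration, not the bitstream's actual structure.

```python
# Illustrative sketch with assumed field names and scene identifiers.
scene_tree = {
    "root":   {"parent": None,   "origin": (0.0, 0.0, 0.0)},
    "scene1": {"parent": "root", "origin": (2.0, 0.0, 1.0)},  # relative to root
}

def absolute_origin(tree, scene_id):
    """Walk up the scene tree, summing each scene's origin offset."""
    x = y = z = 0.0
    while scene_id is not None:
        node = tree[scene_id]
        ox, oy, oz = node["origin"]
        x, y, z = x + ox, y + oy, z + oz
        scene_id = node["parent"]
    return (x, y, z)

def source_position(tree, scene_id, rel_pos):
    """Resolve a source's absolute position from its scene-relative one."""
    ox, oy, oz = absolute_origin(tree, scene_id)
    rx, ry, rz = rel_pos
    return (ox + rx, oy + ry, oz + rz)
```

Because the source references its scene's origin by identifier, moving a sub-scene only requires updating that one origin; every source inside it follows automatically.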
-
Publication number: 20240105196
Abstract: A method that includes receiving an audio component associated with an audio scene, the audio component including an audio signal, determining a loudness level of the audio component based on the audio signal, receiving a target loudness level for the audio component, producing a bitstream with the audio component by encoding the audio signal and including metadata that has the loudness level and the target loudness level, and transmitting the bitstream to an electronic device.
Type: Application
Filed: September 20, 2023
Publication date: March 28, 2024
Inventors: Frank Baumgarte, Dipanjan Sen
-
Publication number: 20240105195
Abstract: A method that includes receiving a bitstream that includes: a first signal of a first audio component associated with an audio scene, a first target loudness, and a first source loudness determined by an encoder side based on the first signal, and a second signal of a second audio component associated with the scene, a second target loudness, and a second source loudness determined by the encoder side based on the second signal; determining a first gain based on the first source and target loudness; determining a second gain based on the second source and target loudness; producing a first gain-adjusted signal by applying the first gain to the first signal; producing a second gain-adjusted signal by applying the second gain to the second signal; and producing the scene that includes the first and second audio components by combining the gain-adjusted audio signals into a group of signals.
Type: Application
Filed: September 20, 2023
Publication date: March 28, 2024
Inventors: Frank Baumgarte, Dipanjan Sen
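A sketch under stated assumptions: if source and target loudness are both in dB (e.g. LKFS), the per-component gain is simply the dB difference converted to a linear factor, after which the gain-adjusted signals are summed into the scene. This is one plausible reading of "determining a gain based on the source and target loudness", not the application's exact formula.

```python
# Sketch assuming loudness values in dB; not the application's exact math.
def loudness_gain(source_db, target_db):
    """Linear gain that moves a signal from its source to its target loudness."""
    return 10.0 ** ((target_db - source_db) / 20.0)

def apply_gain(signal, gain):
    return [s * gain for s in signal]

def mix(a, b):
    """Combine two gain-adjusted component signals into one scene signal."""
    return [x + y for x, y in zip(a, b)]

g1 = loudness_gain(-23.0, -23.0)   # already at target -> unity gain
g2 = loudness_gain(-29.0, -23.0)   # 6 dB below target -> ~2x boost
scene = mix(apply_gain([0.5, -0.5], g1), apply_gain([0.1, 0.1], g2))
```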
-
Publication number: 20240098444
Abstract: In one aspect, a computer-implemented method includes obtaining object audio and metadata that spatially describes the object audio, converting the object audio to Ambisonics audio based on the metadata, encoding, in a first bit stream, the Ambisonics audio, and encoding, in a second bit stream, at least a subset of the metadata.
Type: Application
Filed: August 23, 2023
Publication date: March 21, 2024
Inventors: Sina Zamani, Moo Young Kim, Dipanjan Sen, Sang Uk Ryu, Juha O. Merimaa, Symeon Delikaris Manias
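A minimal sketch of object-to-Ambisonics conversion, assuming the spatial metadata is an azimuth/elevation direction per object: each mono sample is projected onto first-order ambisonic channels (ACN order, SN3D normalization). The application covers the general conversion; this particular first-order encoding is a standard textbook form chosen for illustration.

```python
# Minimal first-order sketch, not the codec in the application.
import math

def encode_foa(sample, azimuth, elevation):
    """Return (W, Y, Z, X) first-order ambisonic coefficients (ACN/SN3D)."""
    ce = math.cos(elevation)
    w = sample                           # omnidirectional component
    y = sample * math.sin(azimuth) * ce  # left/right
    z = sample * math.sin(elevation)     # up/down
    x = sample * math.cos(azimuth) * ce  # front/back
    return (w, y, z, x)

def encode_object(samples, azimuth, elevation):
    """Encode a whole object signal; summing several objects' outputs
    channel-wise would mix them into one ambisonic soundfield."""
    return [encode_foa(s, azimuth, elevation) for s in samples]
```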
-
Publication number: 20240096335
Abstract: In one aspect, a computer-implemented method includes obtaining object audio and metadata that spatially describes the object audio, converting the object audio to time-frequency domain Ambisonics audio based on the metadata, and encoding the time-frequency domain Ambisonics audio and a subset of the metadata as one or more bitstreams to be stored in computer-readable memory or transmitted to a remote device.
Type: Application
Filed: August 23, 2023
Publication date: March 21, 2024
Inventors: Sina Zamani, Moo Young Kim, Dipanjan Sen, Sang Uk Ryu, Juha O. Merimaa, Symeon Delikaris Manias
-
Patent number: 11841899
Abstract: A device with microphones can generate microphone signals during an audio recording. The device can store, in an electronic audio data file, the microphone signals, and metadata that includes impulse responses of the microphones. Other aspects are described and claimed.
Type: Grant
Filed: June 11, 2020
Date of Patent: December 12, 2023
Assignee: Apple Inc.
Inventors: Jonathan D. Sheaffer, Symeon Delikaris Manias, Gaetan R. Lorho, Peter A. Raffensperger, Eric A. Allamanche, Frank Baumgarte, Dipanjan Sen, Joshua D. Atkins, Juha O. Merimaa
-
Patent number: 11843932
Abstract: A device and method for backward compatibility for virtual reality (VR), mixed reality (MR), augmented reality (AR), computer vision, and graphics systems. The device and method enable rendering audio data with more degrees of freedom on devices that support fewer degrees of freedom. The device includes memory configured to store audio data representative of a soundfield captured at a plurality of capture locations, metadata that enables the audio data to be rendered to support N degrees of freedom, and adaptation metadata that enables the audio data to be rendered to support M degrees of freedom. The device also includes one or more processors coupled to the memory, and configured to adapt, based on the adaptation metadata, the audio data to provide the M degrees of freedom, and generate speaker feeds based on the adapted audio data.
Type: Grant
Filed: May 24, 2021
Date of Patent: December 12, 2023
Assignee: QUALCOMM Incorporated
Inventors: Moo Young Kim, Nils Günther Peters, S M Akramus Salehin, Siddhartha Goutham Swaminathan, Dipanjan Sen
-
Publication number: 20230396921
Abstract: A multi-radius spherical microphone that includes an inner body defining an inner sphere having an inner radius from a center; a plurality of inner microphones coupled to the inner body and defining an array of inner microphones; an outer body defining a dodecahedron, wherein the inner body and the outer body are concentric about the center; and a plurality of outer microphones coupled to the outer body at respective vertices of the dodecahedron and defining an array of outer microphones, wherein each of the plurality of outer microphones is positioned radially equidistant from the center.
Type: Application
Filed: May 22, 2023
Publication date: December 7, 2023
Applicant: APPLE INC.
Inventors: Abhaya Parthy, Dipanjan Sen, Bonnie W. Tom, Jonathan D. Sheaffer, Justin D. Crosby, Symeon Delikaris Manias, Emily A. Wigley
-
Publication number: 20230360655
Abstract: Encoding and decoding of higher-order ambisonics (HOA) data for purposes of bitrate reduction. One aspect uses principal components analysis to produce spatial descriptors. Other aspects include various spatial descriptor quantization techniques.
Type: Application
Filed: August 13, 2021
Publication date: November 9, 2023
Inventors: Moo Young KIM, Sina ZAMANI, Dipanjan SEN
-
Publication number: 20230360660
Abstract: Disclosed are methods and systems for decoding immersive audio content encoded by an adaptive number of scene elements for channels, audio objects, higher-order ambisonics (HOA), and/or other sound field representations. The decoded audio is rendered to the speaker configuration of a playback device. For bit streams that represent audio scenes with a different mixture of channels, objects, and/or HOA in consecutive frames, fade-in of the new frame and fade-out of the old frame may be performed. Crossfading between consecutive frames happens in the speaker layout after rendering, in the spatially decoded content type before rendering, or between the transport channels at the output of the baseline decoder but before spatial decoding and rendering. Crossfading may use an immediate fade-in and fade-out frame (IFFF) for the transition frame or may use an overlap-add synthesis technique such as time-domain aliasing cancellation (TDAC) of MDCT.
Type: Application
Filed: September 10, 2021
Publication date: November 9, 2023
Inventors: Moo Young KIM, Dipanjan SEN, Eric ALLAMANCHE, J. Kevin Calhoun, Frank BAUMGARTE, Sina ZAMANI, Eric DAY
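The fade-in/fade-out at a representation change can be sketched as a plain linear crossfade over the transition frame; this is only the simplest of the options the abstract lists (the publication also covers IFFF and overlap-add/TDAC variants):

```python
# Hedged sketch: linear crossfade of two equal-length rendered frames.
def crossfade(old_frame, new_frame):
    """Blend the fading-out old frame against the fading-in new frame."""
    n = len(old_frame)
    out = []
    for i in range(n):
        w = i / (n - 1) if n > 1 else 1.0   # fade-in weight ramps 0 -> 1
        out.append((1.0 - w) * old_frame[i] + w * new_frame[i])
    return out
```

Applied per speaker channel after rendering, this hides the discontinuity when consecutive frames carry a different mixture of channels, objects, and HOA.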
-
Publication number: 20230360661
Abstract: Disclosed is a hierarchical spatial resolution codec that adaptively adjusts the representations of immersive audio content as the target bandwidth for delivering the audio content changes. The audio content may be represented by an adaptive number of content types such as channels/objects, higher-order ambisonics (HOA), and encoded by adaptive spatial coding techniques to support the target bitrate of a transmission channel or user. Adaptive spatial coding techniques may include adaptive channel/object spatial encoding techniques to generate an adaptive number of channels/objects, and adaptive HOA spatial encoding or HOA compression techniques to generate an adaptive order of the HOA. The adaptation may be a function of the target bitrate that is associated with a desired quality, and an analysis that determines the priority of the channels, objects, and HOA. High priority channels/objects may be encoded into a high quality bitstream while low priority channels/objects may be converted and encoded as HOA.
Type: Application
Filed: August 31, 2021
Publication date: November 9, 2023
Inventors: Dipanjan SEN, Moo Young KIM, Frank BAUMGARTE, Sina ZAMANI, Aram LINDAHL
-
Publication number: 20230283977
Abstract: A data structure stored in memory includes a scene description that defines a hierarchy of scene components that are in digital audio content received from a producer. The hierarchy has several stages including a fourth stage in which a scene composition is defined that contains all scene components needed to render the digital audio content in a single presentation, for instance as intended by the producer, and for input to a spatial audio renderer, wherein the scene composition contains one or more composition selection groups. Other aspects are also described and claimed.
Type: Application
Filed: February 14, 2023
Publication date: September 7, 2023
Inventors: Frank BAUMGARTE, Moo Young KIM, Dipanjan SEN, Sang Uk RYU
-
Patent number: 11664035
Abstract: A device configured to decode a bitstream, where the device includes a memory configured to store a temporally encoded representation of spatial audio signals. The device is also configured to receive the bitstream that includes an indication of a spatial transformation, and includes a temporal decoding unit, coupled to the memory, configured to decode one or more spatial audio signals represented in a spatial domain, where the one or more spatial audio signals are associated with different angles in the spatial domain. In addition, the device includes an inverse spatial transformation unit, coupled to the temporal decoding unit, that is configured to convert the one or more spatial audio signals represented in the spatial domain into at least three ambisonic coefficients that, in part, represent a soundfield in an ambisonics domain, and perform a spatial transformation of the soundfield based on the indication of the spatial transformation received in the bitstream.
Type: Grant
Filed: October 4, 2021
Date of Patent: May 30, 2023
Assignee: Qualcomm Incorporated
Inventors: Nils Günther Peters, Moo Young Kim, Dipanjan Sen
-
Publication number: 20230104111
Abstract: One or more acoustic parameters of a current acoustic environment of a user may be determined based on sensor signals captured by one or more sensors of the device. One or more preset acoustic parameters may be determined based on the one or more acoustic parameters of the current acoustic environment of the user and an acoustic environment of an audio file comprising audio signals that is determined based on the audio signals of the audio file or metadata of the audio file. The audio signals may be spatially rendered by applying spatial filters that include the one or more preset acoustic parameters to the audio signals, resulting in binaural audio signals. The binaural audio signals may be used to drive speakers of a headset. Other aspects are described and claimed.
Type: Application
Filed: August 19, 2022
Publication date: April 6, 2023
Inventors: Prateek Murgai, John E. Arthur, Joshua D. Atkins, Juha O. Merimaa, Dipanjan Sen, Brandon J. Rice, Alexander Singh Alvarado, Jonathan D. Sheaffer, Benjamin Bernard, David E. Romblom
-
Patent number: 11430451
Abstract: A first layer of data having a first set of Ambisonic audio components can be decoded where the first set of Ambisonic audio components is generated based on ambience and one or more object-based audio signals. A second layer of data is decoded having at least one of the one or more object-based audio signals. One of the object-based audio signals is subtracted from the first set of Ambisonic audio components. The resulting Ambisonic audio components are rendered to generate a first set of audio channels. The one or more object-based audio signals are spatially rendered to generate a second set of audio channels. Other aspects are described and claimed.
Type: Grant
Filed: September 26, 2019
Date of Patent: August 30, 2022
Assignee: APPLE INC.
Inventors: Dipanjan Sen, Frank Baumgarte, Juha O. Merimaa
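An illustrative two-layer decode sketch (the encoding gains and variable names are assumptions, not the patent's actual encoding): the base layer's ambisonic mix already contains an object's contribution, and because the enhancement layer carries that object, its encoded contribution can be subtracted to recover the ambience bed for separate rendering.

```python
# Illustrative sketch only: gains and names are hypothetical.
def encode_into_mix(sample, gains):
    """Project a mono object sample onto ambisonic channels via fixed gains."""
    return [sample * g for g in gains]

def subtract_object(mix_frame, obj_sample, gains):
    """Remove the object's encoded contribution from the ambisonic mix."""
    contrib = encode_into_mix(obj_sample, gains)
    return [m - c for m, c in zip(mix_frame, contrib)]

gains = [1.0, 0.5, 0.0, 0.5]                  # hypothetical encoding gains
ambience = [0.2, 0.0, 0.1, 0.0]               # ambience-only frame
mix = [a + c for a, c in zip(ambience, encode_into_mix(0.8, gains))]
residual = subtract_object(mix, 0.8, gains)   # recovers the ambience bed
```

The residual bed can then be rendered as ambisonics while the extracted object is spatially rendered on its own, which is what makes the layering useful.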
-
Publication number: 20220262373
Abstract: A first layer of data having a first set of Ambisonic audio components can be decoded where the first set of Ambisonic audio components is generated based on ambience and one or more object-based audio signals. A second layer of data is decoded having at least one of the one or more object-based audio signals. One of the object-based audio signals is subtracted from the first set of Ambisonic audio components. The resulting Ambisonic audio components are rendered to generate a first set of audio channels. The one or more object-based audio signals are spatially rendered to generate a second set of audio channels. Other aspects are described and claimed.
Type: Application
Filed: May 9, 2022
Publication date: August 18, 2022
Inventors: Dipanjan Sen, Frank Baumgarte, Juha O. Merimaa
-
Patent number: 11270711
Abstract: In general, techniques are described by which to provide priority information for higher order ambisonic (HOA) audio data. A device comprising a memory and a processor may perform the techniques. The memory stores HOA coefficients of the HOA audio data, the HOA coefficients representative of a soundfield. The processor may decompose the HOA coefficients into a sound component and a corresponding spatial component, the corresponding spatial component defining shape, width, and directions of the sound component, and the corresponding spatial component defined in a spherical harmonic domain. The processor may also determine, based on one or more of the sound component and the corresponding spatial component, priority information indicative of a priority of the sound component relative to other sound components of the soundfield, and specify, in a data object representative of a compressed version of the HOA audio data, the sound component and the priority information.
Type: Grant
Filed: May 6, 2020
Date of Patent: March 8, 2022
Assignee: Qualcomm Incorporated
Inventors: Moo Young Kim, Nils Günther Peters, Shankar Thagadur Shivappa, Dipanjan Sen
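A sketch of the priority step only, under a stated assumption: decompositions of this kind are typically SVD-like splits of the HOA coefficients into sound components (signals) and spatial components (spherical-harmonic-domain vectors); here that split is taken as already done, and components are simply ranked by signal energy as a stand-in for the patent's priority computation.

```python
# Sketch only: ranking already-decomposed (sound, spatial) component pairs
# by sound-component energy; the patent's actual priority metric may also
# use the spatial component.
def component_energy(sound):
    return sum(s * s for s in sound)

def prioritize(components):
    """components: list of (sound_signal, spatial_vector) pairs.
    Returns component indices from highest to lowest priority."""
    return sorted(range(len(components)),
                  key=lambda i: component_energy(components[i][0]),
                  reverse=True)
```

Transmitting this ordering alongside the compressed data lets a decoder drop the lowest-priority components first when bitrate is constrained.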