Dolby Labs Patent Applications

Dolby Labs patent applications that are pending before the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240089438
    Abstract: According to the present invention, an adaptive scheme is applied to an image encoding apparatus that includes an inter-predictor, an intra-predictor, a transformer, a quantizer, an inverse quantizer, and an inverse transformer, wherein input images are classified into two or more different categories, and two or more modules from among the inter-predictor, the intra-predictor, the transformer, the quantizer, and the inverse quantizer are implemented to perform respective operations in different schemes according to the category to which an input image belongs. Thus, the invention has the advantage of efficiently encoding an image without the loss of important information as compared to a conventional image encoding apparatus which adopts a packaged scheme.
    Type: Application
    Filed: November 21, 2023
    Publication date: March 14, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Jong Ki HAN, Chan Won SEO, Kwang Hyun CHOI
  • Publication number: 20240080489
    Abstract: A quantization parameter signalling mechanism for both SDR and HDR content in video coding is described using two approaches. The first approach is to send the user-defined QpC table directly in high level syntax. This leads to more flexible and efficient QP control for future codec development and video content coding. The second approach is to signal luma and chroma QPs independently. This approach eliminates the need for QpC tables and removes the dependency of chroma quantization parameter on luma QP.
    Type: Application
    Filed: November 10, 2023
    Publication date: March 7, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Fangjun PU, Taoran LU, Peng YIN, Sean Thomas MCCARTHY
  • Publication number: 20240079015
    Abstract: Methods for generating an object based audio program, renderable in a personalizable manner, and including a bed of speaker channels renderable in the absence of selection of other program content (e.g., to provide a default full range audio experience). Other embodiments include steps of delivering, decoding, and/or rendering such a program. Rendering of content of the bed, or of a selected mix of other content of the program, may provide an immersive experience. The program may include multiple object channels (e.g., object channels indicative of user-selectable and user-configurable objects), the bed of speaker channels, and other speaker channels. Another aspect is an audio processing unit (e.g., encoder or decoder) configured to perform, or which includes a buffer memory which stores at least one frame (or other segment) of an object based audio program (or bitstream thereof) generated in accordance with, any embodiment of the method.
    Type: Application
    Filed: September 19, 2023
    Publication date: March 7, 2024
    Applicants: Dolby Laboratories Licensing Corporation, DOLBY INTERNATIONAL AB
    Inventors: Sripal S. MEHTA, Thomas ZIEGLER, Giles BAKER, Jeffrey RIEDMILLER, Prinyar SAUNGSOMBOON
  • Publication number: 20240080608
    Abstract: A method of audio processing includes capturing a binaural audio signal, calculating noise reduction gains using a machine learning model, and generating a modified binaural audio signal. The method may further including performing various corrections to the audio to account for video captured by different cameras such as a front camera and a rear camera. The method may further include performing smooth switching of the binaural audio when switching between the front camera and the rear camera. In this manner, noise may be reduced in the binaural audio, and the user perception of the combined video and binaural audio may be improved.
    Type: Application
    Filed: December 14, 2021
    Publication date: March 7, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Yuanxing MA, Zhiwei SHUANG, Yang LIU
  • Publication number: 20240079019
    Abstract: Computer-implemented methods for training a neural network, as well as for implementing audio encoders and decoders via trained neural networks, are provided. The neural network may receive an input audio signal, generate an encoded audio signal and decode the encoded audio signal. A loss function generating module may receive the decoded audio signal and a ground truth audio signal, and may generate a loss function value corresponding to the decoded audio signal. Generating the loss function value may involve applying a psychoacoustic model. The neural network may be trained based on the loss function value. The training may involve updating at least one weight of the neural network.
    Type: Application
    Filed: November 13, 2023
    Publication date: March 7, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Roy M. FEJGIN, Grant A. DAVIDSON, Chih-Wei WU, Vivek KUMAR
  • Publication number: 20240080479
    Abstract: Sampled data is packaged in checkerboard format for encoding and decoding. The sampled data may be quincunx sampled multi-image video data (e.g., 3D video or a multi-program stream), and the data may also be divided into sub-images of each image which are then multiplexed, or interleaved, in frames of a video stream to be encoded and then decoded using a standardized video encoder. A system for viewing may utilize a standard video decoder and a formatting device that de-interleaves the decoded sub-images of each frame reformats the images for a display device. A 3D video may be encoded using a most advantageous interleaving format such that a preferred quality and compression ratio is reached. In one embodiment, the invention includes a display device that accepts data in multiple formats.
    Type: Application
    Filed: November 7, 2023
    Publication date: March 7, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Alexandros Tourapis, Walter J. Husak, Peshala V. Pahalawatta, Athanasios Leontaris
  • Publication number: 20240080465
    Abstract: Methods and systems for frame rate scalability are described. Support is provided for input and output video sequences with variable frame rate and variable shutter angle across scenes, or for input video sequences with fixed input frame rate and input shutter angle, but allowing a decoder to generate a video output at a different output frame rate and shutter angle than the corresponding input values. Techniques allowing a decoder to decode more computationally-efficiently a specific backward compatible target frame rate and shutter angle among those allowed are also presented.
    Type: Application
    Filed: November 13, 2023
    Publication date: March 7, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Robin Atkins, Peng Yin, Taoran Lu, Fangjun Pu, Sean Thomas McCarthy, Walter J. Husak, Tao Chen, Guan-Ming Su
  • Publication number: 20240073459
    Abstract: In a method to improve backwards compatibility when decoding high-dynamic range images coded in a wide color gamut (WCG) space which may not be compatible with legacy color spaces, hue and/or saturation values of images in an image database are computed for both a legacy color space (say, YCbCr-gamma) and a preferred WCG color space (say, IPT-PQ). Based on a cost function, a reshaped color space is computed so that the distance between the hue values in the legacy color space and rotated hue values in the preferred color space is minimized HDR images are coded in the reshaped color space. Legacy devices can still decode standard dynamic range images assuming they are coded in the legacy color space, while updated devices can use color reshaping information to decode HDR images in the preferred color space at full dynamic range.
    Type: Application
    Filed: October 31, 2023
    Publication date: February 29, 2024
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Robin Atkins, Peng Yin, Taoran Lu, Jaclyn Anne Pytlarz
  • Publication number: 20240073357
    Abstract: Dual or multi-modulation display systems comprising a first modulator and a second modulator are disclosed. The first modulator may comprise a plurality of analog mirrors (e.g. MEMS array) and the second modulator may comprise a plurality of mirrors (e.g., DMD array). The display system may further comprise a controller that sends control signals to the first and second modulator. The display system may render highlight features within a projected image by affecting a time multiplexing scheme. In one embodiment, the first modulator may be switched on a sub-frame basis such that a desired proportion of the available light may be focused or directed onto the second modulator to form the highlight feature on a sub-frame rendering basis.
    Type: Application
    Filed: January 31, 2022
    Publication date: February 29, 2024
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Martin J. Richards
  • Publication number: 20240071411
    Abstract: Disclosed is a method for determining one or more dialog quality metrics of a mixed audio signal comprising a dialog component and a noise component, the method comprising separating an estimated dialog component from the mixed audio signal by means of a dialog separator using a dialog separating model determined by training the dialog separator based on the one or more quality metrics; providing the estimated dialog component from the dialog separator to a quality metrics estimator; and determining the one or more quality metrics by means of the quality metrics estimator based on the mixed signal and the estimated dialog component. Further disclosed is a method for training a dialog separator, a system comprising circuitry configured to perform the method, and a non-transitory computer-readable storage medium.
    Type: Application
    Filed: January 4, 2022
    Publication date: February 29, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Jundai SUN, Lie LU, Shaofan YANG, Rhonda J. WILSON, Dirk Jeroen BREEBAART
  • Publication number: 20240073444
    Abstract: A video encoding method according to an embodiment of the present invention includes generating header information that includes information about resolutions of motion vectors of respective blocks, determined based on motion prediction for a unit image. Here, the header information includes flag information indicating whether resolutions of all motion vectors included in the unit image are integer-pixel resolutions. Further, a video decoding method according to another embodiment of the present invention includes extracting information about resolutions of motion vectors of each unit image from header information included in a target bitstream to be decoded; and a decoding unit for decoding the unit image based on the resolution information. Here, the header information includes flag information indicating whether resolutions of all motion vectors included in the unit image are integer-pixel resolutions.
    Type: Application
    Filed: November 8, 2023
    Publication date: February 29, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Jong Ki HAN, Jae Yung LEE
  • Publication number: 20240062765
    Abstract: Encoding and decoding devices for encoding the channels of an audio system having at least four channels are disclosed. The decoding device has a first stereo decoding component which subjects a first pair of input channels to a first stereo decoding, and a second stereo decoding component which subjects a second pair of input channels to a second stereo decoding. The results of the first and second stereo decoding components are crosswise coupled to a third and a fourth stereo decoding component which each performs stereo decoding on one channel resulting from the first stereo decoding component, and one channel resulting from the second stereo decoding component.
    Type: Application
    Filed: September 1, 2023
    Publication date: February 22, 2024
    Applicant: DOLBY INTERNATIONAL AB
    Inventors: Kristofer KJOERLING, Harald MUNDT, Heiko PURNHAGEN
  • Publication number: 20240056610
    Abstract: Methods are described to communicate source color volume information in a coded bitstream using SEI messaging. Such data include at least the minimum, maximum, and average luminance values in the source data plus optional data that may include the color volume x and y chromaticity coordinates for the input color primaries (e.g., red, green, and blue) of the source data, and the color x and y chromaticity coordinates for the color primaries corresponding to the minimum, average, and maximum luminance values in the source data. Messaging data signaling an active region in each picture may also be included.
    Type: Application
    Filed: October 13, 2023
    Publication date: February 15, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Tao CHEN, Peng YIN, Taoran LU, Walter J. HUSAK
  • Publication number: 20240056755
    Abstract: Improved methods and/or apparatus for decoding an encoded audio signal in soundfield format for L loudspeakers. The method and/or apparatus can render an Ambisonics format audio signal to 2D loudspeaker setup(s) based on a rendering matrix. The rendering matrix has elements based on loudspeaker positions and wherein the rendering matrix is determined based on weighting at least an element of a first matrix with a weighting factor ? = 1 L . The first matrix is determined based on positions of the L loudspeakers and at least a virtual position of at least a virtual loudspeaker that is added to the positions of the L loudspeakers.
    Type: Application
    Filed: August 28, 2023
    Publication date: February 15, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Florian KEILER, Johannes Boehm
  • Publication number: 20240056760
    Abstract: A method of audio processing includes performing spatial analysis on a binaural signal to estimate level differences and phase differences characteristic of a binaural filter of the binaural signal, performing object extraction on the binaural audio signal using the estimated level and phase differences to generate a left/right main component signal and a left/right residual component signal. The system may process the left/right main and left/right residual components differently using different object processing parameters for e.g. repositioning, equalization, compression, upmixing, channel remapping or storage to generate a processed binaural signal that provides an improved listening experience. Repositioning may be based on head tracking sensor data.
    Type: Application
    Filed: December 16, 2021
    Publication date: February 15, 2024
    Applicants: Dolby Laboratories Licensing Corporation, Dolby International AB
    Inventors: Dirk Jeroen BREEBAART, Giulio CENGARLE, C. Phillip BROWN
  • Publication number: 20240055006
    Abstract: Described herein is a method for setting up a decoder for generating processed audio data from an audio bitstream, the decoder comprising a Generator of a Generative Adversarial Network, GAN, for processing of the audio data, wherein the method includes the steps of (a) pre-configuring the Generator for processing of audio data with a set of parameters for the Generator, the parameters being determined by training, at training time, the Generator using the full concatenated distribution; and (b) pre-configuring the decoder to determine, at decoding time, a truncation mode for modifying the concatenated distribution and to apply the determined truncation mode to the concatenated distribution. Described are further a method of generating processed audio data from an audio bitstream using a Generator of a Generative Adversarial Network, GAN, for processing of the audio data and a respective apparatus. Moreover, described are also respective systems and computer program products.
    Type: Application
    Filed: December 15, 2021
    Publication date: February 15, 2024
    Applicant: Dolby International AB
    Inventor: Arijit BISWAS
  • Publication number: 20240056757
    Abstract: Some methods may involve receiving a first content stream that includes first audio signals, rendering the first audio signals to produce first audio playback signals, generating first direct sequence spread spectrum (DSSS) signals, generating first modified audio playback signals by inserting the first DSSS signals into the first audio playback signals, and causing a loudspeaker system to play back the first modified audio playback signals, to generate first audio device playback sound. The method(s) may involve receiving microphone signals corresponding to at least the first audio device playback sound and to second through Nth audio device playback sound corresponding to second through Nth modified audio playback signals (including second through Nth DSSS signals) played back by second through Nth audio devices, extracting second through Nth DSSS signals from the microphone signals and estimating at least one acoustic scene metric based, at least partly, on the second through Nth DSSS signals.
    Type: Application
    Filed: December 2, 2021
    Publication date: February 15, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Benjamin John SOUTHWELL, David GUNAWAN, Mark R.P. THOMAS, Christopher Graham HINES
  • Publication number: 20240056613
    Abstract: Disclosed is an encoding/decoding method and apparatus related to adaptive deblocking filtering. There is provided an image decoding method performing adaptive filtering in inter-prediction, the method including: reconstructing, from a bitstream, an image signal including a reference block on which block matching is performed in inter-prediction of a current block to be encoded; obtaining, from the bitstream, a flag indicating whether the reference block exists within a current picture where the current block is positioned; reconstructing the current block by using the reference block; adaptively applying an in-loop filter for the reconstructed current block based on the obtained flag; and storing the current block to which the in-loop filter is or is not applied in a decoded picture buffer (DPB).
    Type: Application
    Filed: October 24, 2023
    Publication date: February 15, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Je Chang JEONG, Ki Baek KIM
  • Publication number: 20240055010
    Abstract: An apparatus and method are disclosed for processing an audio signal. The apparatus includes an input interface, a digital filterbank having an analysis part and a synthesis part, a first phase shifter, a spectral envelope adjuster, a second phase shifter, and an output interface. The first phase shifter and the second phase shifter reduce a complexity of the digital filterbank, which includes both analysis and synthesis filters that are complex-exponential modulated versions of a prototype filter.
    Type: Application
    Filed: August 21, 2023
    Publication date: February 15, 2024
    Applicant: DOLBY INTERNATIONAL AB
    Inventor: Per EKSTRAND
  • Publication number: 20240056649
    Abstract: A method for delivering media content to one or more clients over a distributed system is disclosed. The method may include generating a plurality of network-coded symbols from a plurality of original symbols representing a first media asset. The method may further include generating an original plurality of coded variants of the first media asset. The method may further include distributing a first coded variant of the original plurality of coded variants to a first cache on a first server device for storage in the first cache. The method may further include distributing a second coded variant of the original plurality of coded variants to a second cache on a second server device for storage in the second cache.
    Type: Application
    Filed: December 16, 2021
    Publication date: February 15, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Jeffrey RIEDMILLER, Mingchao YU, Jason Michael CLOUD
  • Publication number: 20240048931
    Abstract: Some methods may involve receiving a first content stream that includes first audio signals, rendering the first audio signals to produce first audio playback signals, generating first direct sequence spread spectrum (DSSS) signals, generating first modified audio playback signals by inserting the first DSSS signals into the first audio playback signals, and causing a loudspeaker system to play back the first modified audio playback signals, to generate first audio device playback sound. The method(s) may involve receiving microphone signals corresponding to at least the first audio device playback sound and to second through Nth audio device playback sound corresponding to second through Nth modified audio playback signals (including second through Nth DSSS signals) played back by second through Nth audio devices, extracting second through Nth DSSS signals from the microphone signals and estimating at least one acoustic scene metric based, at least partly, on the second through Nth DSSS signals.
    Type: Application
    Filed: December 2, 2021
    Publication date: February 8, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Benjamin John SOUTHWELL, David GUNAWAN, Mark R.P. THOMAS, Christopher Graham HINES
  • Publication number: 20240048932
    Abstract: An apparatus and method of generating personalized HRTFs. The system is prepared by calculating a model for HRTFs described as the relationship between a finite example set of input data, namely anthropometric measures and demographic information for a set of individuals, and a corresponding set of output data, namely HRTFs numerically simulated using a high-resolution database of 3D scans of the same set of individuals. At the time of use, the system queries the user for their demographic information, and then from a series of images of the user, the system detects and measures various anthropometric characteristics. The system then applies the prepared model to the anthropometric and demographic data as part of generating a personalized HRTF. In this manner, the personalized HRTF can be generated with more convenience than by performing a high-resolution scan or an acoustic measurement of the user, and with less computational complexity than by numerically simulating their HRTF.
    Type: Application
    Filed: August 24, 2023
    Publication date: February 8, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: McGregor Steele JOYNER, Alex BRANDMEYER, Scott DALY, Jeffrey Ross BAKER, Andrea FANELLI, Poppy Anne Carrie CRUM
  • Publication number: 20240046940
    Abstract: The invention provides an efficient implementation of cross-product enhanced high-frequency reconstruction (HFR), wherein a new component at frequency Q?+r?0 is generated on the basis of existing components at ? and ?+?0. The invention provides a block-based harmonic transposition, wherein a time block of complex subband samples is processed with a common phase modification. Superposition of several modified samples has the net effect of limiting undesirable intermodulation products, thereby enabling a coarser frequency resolution and/or lower degree of oversampling to be used. In one embodiment, the invention further includes a window function suitable for use with block-based cross-product enhanced HFR. A hardware embodiment of the invention may include an analysis filter bank, a subband processing unit configurable by control data and a synthesis filter bank.
    Type: Application
    Filed: October 5, 2023
    Publication date: February 8, 2024
    Applicant: DOLBY INTERNATIONAL AB
    Inventor: Lars Villemoes
  • Publication number: 20240039499
    Abstract: Volume leveler controller and controlling method are disclosed. In one embodiment, A volume leveler controller includes an audio content classifier for identifying the content type of an audio signal in real time; and an adjusting unit for adjusting a volume leveler in a continuous manner based on the content type as identified. The adjusting unit may configured to positively correlate the dynamic gain of the volume leveler with informative content types of the audio signal, and negatively correlate the dynamic gain of the volume leveler with interfering content types of the audio signal.
    Type: Application
    Filed: July 20, 2023
    Publication date: February 1, 2024
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Jun WANG, Lie LU, Alan J. SEEFELDT
  • Publication number: 20240040327
    Abstract: The invention discloses rendering sound field signals, such as Higher-Order Ambisonics (HOA), for arbitrary loudspeaker setups, where the rendering results in highly improved localization properties and is energy preserving. This is obtained by rendering an audio sound field representation for arbitrary spatial loudspeaker setups and/or by a a decoder that decodes based on a decode matrix (D). The decode matrix (D) is based on smoothing and scaling of a first decode matrix {circumflex over (D)} with smoothing coefficients. The first decode matrix {circumflex over (D)} is based on a mix matrix G and a mode matrix {tilde over (?)}, where the mix matrix G was determined based on L speakers and positions of a spherical modelling grid related to a HOA order N, and the mode matrix {tilde over (?)} was determined based on the spherical modelling grid and the HOA order N.
    Type: Application
    Filed: July 26, 2023
    Publication date: February 1, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Johannes BOEHM, Florian KEILER
  • Publication number: 20240038248
    Abstract: Encoding/decoding an audio signal having one or more audio components, wherein each audio component is associated with a spatial location. A first audio signal presentation (z) of the audio components, a first set of transform parameters (w(f)), and signal level data (?2) are encoded and transmitted to the decoder. The decoder uses the first set of transform parameters (w(f)) to form a reconstructed simulation input signal intended for an acoustic environment simulation, and applies a signal level modification (?) to the reconstructed simulation input signal. The signal level modification is based on the signal level data (?2) and data (p2) related to the acoustic environment simulation. The attenuated reconstructed simulation input signal is then processed in an acoustic environment simulator. With this process, the decoder does not need to determine the signal level of the simulation input signal, thereby reducing processing load.
    Type: Application
    Filed: August 7, 2023
    Publication date: February 1, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventor: Dirk Jeroen BREEBAART
  • Publication number: 20240038258
    Abstract: A method of audio content identification includes using a two-stage classifier. The first stage includes previously-existing classifiers and the second stage includes a new classifier. The outputs of the first and second stages calculated over different time periods are combined to generate a steering signal. The final classification results from a combination of the steering signal and the outputs of the first and second stages. In this manner, a new classifier may be added without disrupting existing classifiers.
    Type: Application
    Filed: August 18, 2021
    Publication date: February 1, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Guiping Wang, Lie Lu
  • Publication number: 20240040043
    Abstract: Disclosed is a method for managing acoustic feedback in real-time audio communications in a communications system, the method comprising determining, by means of a detection module, whether a first communication device is in loudspeaker mode, whether the first communication device is in real-time audio communications with a second communication, and whether the first communication device and the second communication device are in a same acoustic space. Upon determining that this is the case a request signal for requesting one or more measures against acoustic feedback is provided to a mitigation module. Further disclosed are a device and a system configured to perform the method, a non-transitory computer-readable medium, an encoder and a decoder.
    Type: Application
    Filed: December 22, 2021
    Publication date: February 1, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Qianqian FANG, Kai LI, Yanmeng GUO, Wei HUANG, Yang LIU
  • Publication number: 20240031587
    Abstract: Methods and systems for frame rate scalability are described. Support is provided for input and output video sequences with variable frame rate and variable shutter angle across scenes, or for input video sequences with fixed input frame rate and input shutter angle, but allowing a decoder to generate a video output at a different output frame rate and shutter angle than the corresponding input values. Techniques allowing a decoder to decode more computationally-efficiently a specific backward compatible target frame rate and shutter angle among those allowed are also presented.
    Type: Application
    Filed: September 28, 2023
    Publication date: January 25, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Robin Atkins, Peng Yin, Taoran Lu, Fangjun Pu, Sean Thomas McCarthy, Walter J. Husak, Tao Chen, Guan-Ming Su
  • Publication number: 20240031768
    Abstract: The positions of a plurality of speakers at a media consumption site are determined. Audio information in an object-based format is received. Gain adjustment value for a sound content portion in the object-based format may be determined based on the position of the sound content portion and the positions of the plurality of speakers. Audio information in a ring-based channel format is received. Gain adjustment value for each ring-based channel in a set of ring-based channels may be determined based on the ring to which the ring-based channel belongs and the positions of the speakers at a media consumption site.
    Type: Application
    Filed: July 15, 2023
    Publication date: January 25, 2024
    Applicants: DOLBY LABORATORIES LICENSING CORPORATION, DOLBY INTERNATIONAL AB
    Inventors: Nicolas R. TSINGOS, David S. MCGRATH, Freddie SANCHEZ, Antonio MATEOS SOLE
  • Publication number: 20240031760
    Abstract: A method (900) for rendering audio in a virtual reality rendering environment (180) is described. The method (900) comprises rendering (901) an origin audio signal of an origin audio source (113) of an origin audio scene (111) from an origin source position on a sphere (114) around a listening position (201) of a listener (181). Furthermore, the method (900) comprises determining (902) that the listener (181) moves from the listening position (201) within the origin audio scene (111) to a listening position (202) within a different destination audio scene (112). In addition, the method (900) comprises applying (903) a fade-out gain to the origin audio signal to determine a modified origin audio signal, and rendering (903) the modified origin audio signal of the origin audio source (113) from the origin source position on the sphere (114) around the listening position (201, 202).
    Type: Application
    Filed: July 24, 2023
    Publication date: January 25, 2024
    Applicant: Dolby International AB
    Inventors: Leon Terentiv, Christof Joseph Fersch, Daniel Fischer
  • Publication number: 20240031543
    Abstract: In one embodiment, methods, media, and systems process and display light field images using a view function that is based on pixel locations in the image and on the viewer's distance (observer's Z position) from the display. The view function can be an angular view function that specifies different angular views for different pixels in the light field image based on the inputs that can include: the x or y pixel location in the image, the viewer's distance from the display, and the viewer's angle relative to the display. In one embodiment, light field metadata, such as angular range metadata and/or angular offset metadata can be used to process and display the image. In one embodiment, color volume mapping metadata can be used to adjust color volume mapping based on the determined angular views; and the color volume mapping metadata can also be adjusted based on angular offset metadata.
    Type: Application
    Filed: December 2, 2021
    Publication date: January 25, 2024
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: Robin ATKINS
  • Publication number: 20240029748
    Abstract: A method (600) for decoding an encoded audio signal (102) is described. The encoded audio signal (102) comprises a sequence of frames. Furthermore, the encoded audio signal (102) is indicative of a plurality of different dynamic range control (DRC) profiles for a corresponding plurality of different rendering modes. Different subsets of DRC profiles from the plurality of DRC profiles are comprised within different frames of the sequence of frames, such that two or more frames of the sequence of frames jointly comprise the plurality of DRC profiles.
    Type: Application
    Filed: August 14, 2023
    Publication date: January 25, 2024
    Applicant: DOLBY INTERNATIONAL AB
    Inventors: Holger HOERICH, Jeroen KOPPENS
  • Publication number: 20240029747
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag.
    Type: Application
    Filed: July 24, 2023
    Publication date: January 25, 2024
    Applicant: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Publication number: 20240022224
    Abstract: In an embodiment, a method comprises: filtering reference audio content items to separate the reference audio content items into different frequency bands; for each frequency band, extracting a first feature vector from at least a portion of each of the reference audio content items, wherein the first feature vector includes at least one audio characteristic of the reference audio content items; obtaining at least one semantic label from at least a portion of each of the reference audio content items; obtaining a second feature vector consisting of the first feature vectors per frequency band and the at least one semantic label; generating, based on the second feature vector, cluster feature vectors representing centroids of clusters; separating the reference audio content items according to the cluster feature vectors; and computing an average target profile for each cluster based on the reference audio content items in the cluster.
    Type: Application
    Filed: November 18, 2021
    Publication date: January 18, 2024
    Applicants: Dolby Laboratories Licensing Corporation, DOLBY INTERNATIONAL AB
    Inventors: Giulio CENGARLE, Nicholas Laurence ENGEL, Patrick Winfrey SCANNELL, Davide SCAINI
  • Publication number: 20240022869
    Abstract: A method may involve: receiving direction of arrival (DOA) data corresponding to sound emitted by at least a first smart audio device of the audio environment that includes a first audio transmitter and a first audio receiver, the DOA data corresponding to sound received by at least a second smart audio device of the audio environment that includes a second audio transmitter and a second audio receiver, the DOA data corresponding to sound emitted by at least the second smart audio device and received by at least the first smart audio device; receiving one or more configuration parameters corresponding to the audio environment, to one or more audio devices, or both; and minimizing a cost function based at least in part on the DOA data and the configuration parameter(s), to estimate a position and an orientation of at least the first smart audio device and the second smart audio device.
    Type: Application
    Filed: December 2, 2021
    Publication date: January 18, 2024
    Applicants: Dolby Laboratories Licensing Corporation, Dolby International AB
    Inventors: Daniel ARTEAGA, Davide SCAINI, Mark R.P. THOMAS, Avery BRUNI, Olha Michelle TOWNSEND
  • Publication number: 20240022868
    Abstract: Described herein is a method for training a machine learning algorithm. The method may comprise receiving a first input multichannel audio signal. The method may comprise generating, using the machine learning algorithm, an intermediate audio signal based on the first input multichannel audio signal. The method may comprise rendering the intermediate audio signal into a first output multichannel audio signal. Further, the method may comprise improving the machine learning algorithm based on a difference between the first input multichannel audio signal and the first output multichannel audio signal. Described herein are further an apparatus for generating an intermediate audio format from an input multichannel audio signal as well as a respective computer program product comprising a computer-readable storage medium with instructions adapted to carry out said method when executed by a device having processing capability.
    Type: Application
    Filed: October 14, 2021
    Publication date: January 18, 2024
    Applicant: DOLBY INTERNATIONAL AB
    Inventors: Daniel Arteaga, Jordi Pons Puig
  • Publication number: 20240018844
    Abstract: On the basis of a bitstream (P), an n-channel audio signal (X) is reconstructed by deriving an m-channel core signal (Y) and multichannel coding parameters (a) from the bitstream, where 1?m<n. Also derived from the bitstream are pre-processing dynamic range control, DRC, parameters (DRC2) quantifying an encoder-side dynamic range limiting of the core signal. The n-channel audio signal is obtained by parametric synthesis in accordance with the multichannel coding parameters and while cancelling any encoder-side dynamic range limiting based on the pre-processing DRC parameters. In particular embodiments, the reconstruction further includes use of compensated post-processing DRC parameters quantifying a potential decoder-side dynamic range compression. Cancellation of an encoder-side range limitation and range compression are preferably performed by different decoder-side components. Cancellation and compression may be coordinated by a DRC pre-processor.
    Type: Application
    Filed: July 19, 2023
    Publication date: January 18, 2024
    Applicants: DOLBY LABORATORIES LICENSING CORPORATION, DOLBY INTERNATIONAL AB
    Inventors: Jeffrey RIEDMILLER, Karl J. ROEDEN, Kristofer KJOERLING, Heiko PURNHAGEN, Vinay MELKOTE, Leif SEHLSTROM
  • Publication number: 20240021210
    Abstract: Described herein is a method of processing an audio signal using a deep-learning-based generator, wherein the method includes the steps of: (a) inputting the audio signal into the generator for processing the audio signal; (b) mapping a time segment of the audio signal to a latent feature space representation, using an encoder stage of the generator; (c) upsampling the latent feature space representation using a decoder stage of the generator, wherein at least one layer of the decoder stage applies sinusoidal activation; and (d) obtaining, as an output from the decoder stage of the generator, a processed audio signal. Described are further a method for training said generator and respective apparatus, systems and computer program products.
    Type: Application
    Filed: October 15, 2021
    Publication date: January 18, 2024
    Applicant: DOLBY INTERNATIONAL AB
    Inventor: Arijit BISWAS
  • Publication number: 20240015434
    Abstract: Disclosed are methods and systems which convert a multi-microphone input signal to a multichannel output signal making use of a time- and frequency-varying matrix. For each time and frequency tile, the matrix is derived as a function of a dominant direction of arrival and a steering strength parameter. Likewise, the dominant direction and steering strength parameter are derived from characteristics of the multi-microphone signals, where those characteristics include values representative of the inter-channel amplitude and group-delay differences.
    Type: Application
    Filed: July 13, 2023
    Publication date: January 11, 2024
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventor: David S. MCGRATH
  • Publication number: 20240015315
    Abstract: Methods and systems for frame rate scalability are described. Support is provided for input and output video sequences with variable frame rate and variable shutter angle across scenes, or for input video sequences with fixed input frame rate and input shutter angle, but allowing a decoder to generate a video output at a different output frame rate and shutter angle than the corresponding input values. Techniques allowing a decoder to decode more computationally-efficiently a specific backward compatible target frame rate and shutter angle among those allowed are also presented.
    Type: Application
    Filed: June 13, 2023
    Publication date: January 11, 2024
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Robin Atkins, Peng Yin, Taoran Lu, Fangjun Pu, Sean Thomas McCarthy, Walter J. Husak, Tao Chen, Guan-Ming Su
  • Publication number: 20240013793
    Abstract: Method for encoding scene-based audio is provided. In some implementations, the method involves determining, by an encoder, a spatial direction of a dominant sound component in a frame of an input audio signal. In some implementations, the method involves determining rotation parameters based on the determined spatial direction and a direction preference of a coding scheme to be used to encode the input audio signal. In some implementations, the method involves rotating sound components of the frame based on the rotation parameters such that, after being rotated, the dominant sound component has a spatial direction that aligns with the direction preference of the coding scheme. In some implementations, the method involves encoding the rotated sound components of the frame of the input audio signal using the coding scheme in connection with an indication of the rotation parameters or an indication of the spatial direction of the dominant sound component.
    Type: Application
    Filed: December 2, 2021
    Publication date: January 11, 2024
    Applicants: Dolby Laboratories Licensing Corporation, DOLBY INTERNATIONAL AB
    Inventors: Stefan BRUHN, Harald MUNDT, David S. MCGRATH, Stefanie BROWN
  • Publication number: 20240013797
    Abstract: The present disclosure provides a decoder configured to receive a finite bitrate stream that includes a quantized latent frame, where the quantized latent frame includes a quantized representation of a current frame of a signal in a latent domain different from a first domain; to generate a reconstructed latent frame from the quantized latent frame; to use a generative neural network model to perform a task for which the general neural network model has been trained, wherein the task includes to generate parameters for an invertible mapping from the latent domain to the first domain; to reconstruct a current frame of the signal in the first domain, which includes to map the reconstructed latent frame to the first domain by use of the invertible mapping, and to use the reconstructed current frame of the signal in the first domain to update a state of the generative neural network model.
    Type: Application
    Filed: October 11, 2021
    Publication date: January 11, 2024
    Applicant: DOLBY INTERNATIONAL AB
    Inventors: Janusz KLEJSA, Lars VILLEMOES, Per HEDELIN
  • Publication number: 20240013799
    Abstract: In some embodiments, a method, comprises: dividing, using at least one processor, an audio input into speech and non-speech segments; for each frame in each non-speech segment, estimating, using the at least one processor, a time-varying noise spectrum of the non-speech segment; for each frame in each speech segment, estimating, using the at least one processor, speech spectrum of the speech segment; for each frame in each speech segment, identifying one or more non-speech frequency components in the speech spectrum; comparing the one or more non-speech frequency components with one or more corresponding frequency components in a plurality of estimated noise spectra and selecting the estimated noise spectrum from the plurality of estimated noise spectra based on a result of the comparing.
    Type: Application
    Filed: September 21, 2021
    Publication date: January 11, 2024
    Applicants: Dolby Laboratories Licensing Corporation, DOLBY INTERNATIONAL AB
    Inventors: Davide Scaini, Chunghsin Yeh, Giulio Cengarle, Mark David de Burgh
  • Publication number: 20240005942
    Abstract: Described is a method of training a deep-learning-based system for sound source separation. The system comprises a separation stage for frame-wise extraction of representations of sound sources from a representation of an audio signal, and a clustering stage for generating, for each frame, a vector indicative of an assignment permutation of extracted frames of representations of sound sources to respective sound sources. The representation of the audio signal is a waveform-based representation. The separation stage is trained using frame-level permutation invariant training. Further, the clustering stage is trained to generate embedding vectors for the frames of the audio signal that allow to determine estimates of respective assignment permutations between extracted sound signals and labels of sound sources that had been used for the frames. Also described is a method of using the deep-learning-based system for sound source separation.
    Type: Application
    Filed: October 13, 2021
    Publication date: January 4, 2024
    Applicants: Dolby Laboratories Licensing Corporation, DOLBY INTERNATIONAL AB
    Inventors: Xiaoyu LIU, Jordi PONS PUIG
  • Publication number: 20240007813
    Abstract: A method for compressing a HOA signal being an input HOA representation with input time frames (C(k)) of HOA coefficient sequences comprises spatial HOA encoding of the input time frames and subsequent perceptual encoding and source encoding. Each input time frame is decomposed (802) into a frame of predominant sound signals (XPS(k?1)) and a frame of an ambient HOA component ({tilde over (C)}AMB(k?1)). The ambient HOA component ({tilde over (C)}AMB(k?1)) comprises, in a layered mode, first HOA coefficient sequences of the input HOA representation (cn(k?1)) in lower positions and second HOA coefficient sequences (cAMB,n(k?1)) in remaining higher positions. The second HOA coefficient sequences are part of an HOA representation of a residual between the input HOA representation and the HOA representation of the predominant sound signals.
    Type: Application
    Filed: June 22, 2023
    Publication date: January 4, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Sven KORDON, Alexander KRUEGER, Oliver WUEBBOLT
  • Publication number: 20240005933
    Abstract: The present document describes a method (700) for encoding a multi-channel input signal (201). The method (700) comprises determining (701) a plurality of downmix channel signals (203) from the multi-channel input signal (201) and performing (702) energy compaction of the plurality of downmix channel signals (203) to provide a plurality of compacted channel signals (404). Furthermore, the method (700) comprises determining (703) joint coding metadata (205) based on the plurality of compacted channel signals (404) and based on the multi-channel input signal (201), wherein the joint coding metadata (205) is such that it allows upmixing of the plurality of compacted channel signals (404) to an approximation of the multi-channel input signal (201). In addition, the method (700) comprises encoding (704) the plurality of compacted channel signals (404) and the joint coding metadata (205).
    Type: Application
    Filed: July 10, 2023
    Publication date: January 4, 2024
    Applicants: DOLBY LABORATORIES LICENSING CORPORATION, DOLBY INTERNATIONAL AB
    Inventors: David S. MCGRATH, Michael ECKERT, Heiko PURNHAGEN, Stefan BRUHN
  • Publication number: 20240007678
    Abstract: In a method to improve the coding efficiency of high-dynamic range (HDR) images, a decoder parses sequence processing set (SPS) data from an input coded bitstream to detect that an HDR extension syntax structure is present in the parsed SPS data. It extracts from the HDR extension syntax structure post-processing information that includes one or more of a color space enabled flag, a color enhancement enabled flag, an adaptive_reshaping_enabled_flag, a dynamic range conversion flag, a color correction enabled flag, or an SDR_viewable_flag. It decodes the input bitstream to generate a preliminary output decoded signal, and generates a second output signal based on the preliminary output signal and the post-processing information.
    Type: Application
    Filed: September 19, 2023
    Publication date: January 4, 2024
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Peng YIN, Taoran LU, Fangjun PU, Tao CHEN, Walter J. HUSAK
  • Publication number: 20240007682
    Abstract: An input image of a first bit depth in an input domain is received. Forward reshaping operations are performed on the input image to generate a forward reshaped image of a second bit depth in a reshaping domain. An image container containing image data derived from the forward reshaped image is encoded into an output video signal of the second bit depth.
    Type: Application
    Filed: November 10, 2021
    Publication date: January 4, 2024
    Applicant: Dolby Laboratories Licensing Corporation
    Inventors: Janos HORVATH, Harshad KADU, Guan-Ming SU
  • Publication number: 20230421734
    Abstract: Smaller halftone tiles are implemented on a first modulator of a dual modulation projection system. This techniques uses multiple halftones per frame in the pre-modulator synchronized with a modified bit sequence in the primary modulator to effectively increase the number of levels provided by a given tile size in the halftone modulator. It addresses the issue of reduced contrast ratio at low light levels for small tile sizes and allows the use of smaller PSFs which reduce halo artifacts in the projected image and may be utilized in 3D projecting and viewing.
    Type: Application
    Filed: September 14, 2023
    Publication date: December 28, 2023
    Applicant: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Martin J. RICHARDS, Jerome SHIELDS