Guan-Ming Su has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
Abstract: Sequence-level parameters are generated for an image frame sequence including sequence-level indicators for indicating metadata types present for each image frame in the sequence of image frames. Frame-present parameters are generated for a specific image frame in the sequence including frame-present indicators corresponding to the metadata types as indicated in the sequence-level parameters. The frame-present indicators identify first metadata types for which metadata parameter values are to be encoded in a coded bitstream as metadata payloads. The image frame sequence, the sequence-level parameters, the frame-present parameters and the metadata payloads are encoded in the coded bitstream. A recipient device can generate, from the specific image frame based partly on the metadata parameter values determined for the first metadata types, a target display image for a target display.
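The two-level signaling described above can be illustrated with a minimal sketch. The metadata type names and both function names are hypothetical, and the in-memory lists stand in for whatever bitstream syntax the patent actually defines; the abstract specifies neither.

```python
# Hypothetical sketch: sequence-level indicators list the metadata types that
# may appear; per-frame "present" flags say which of them carry a payload.

def encode_frame_metadata(sequence_types, frame_metadata):
    """Return (frame_present_flags, payloads) for one frame.

    sequence_types: ordered metadata types signaled at the sequence level.
    frame_metadata: dict mapping a metadata type to its parameter values.
    """
    flags = [t in frame_metadata for t in sequence_types]
    payloads = [frame_metadata[t] for t, present in zip(sequence_types, flags) if present]
    return flags, payloads


def decode_frame_metadata(sequence_types, flags, payloads):
    """Rebuild the per-frame metadata dict from the flags and payload list."""
    remaining = iter(payloads)
    decoded = {}
    for t, present in zip(sequence_types, flags):
        if present:
            decoded[t] = next(remaining)
    return decoded


# Illustrative type names only; they are not taken from the patent.
seq_types = ["tone_mapping", "color_volume", "display_management"]
frame = {"tone_mapping": [0.1, 0.9], "display_management": [100, 600]}
flags, payloads = encode_frame_metadata(seq_types, frame)
```

Because the flags follow the order fixed at the sequence level, the receiver can pair each payload with its type without repeating type identifiers in every frame.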
Abstract: Methods and systems for frame rate scalability are described. Support is provided for input and output video sequences with variable frame rate and variable shutter angle across scenes, or for input video sequences with fixed input frame rate and input shutter angle, but allowing a decoder to generate a video output at a different output frame rate and shutter angle than the corresponding input values. Techniques that allow a decoder to decode, with greater computational efficiency, a specific backward-compatible target frame rate and shutter angle among those allowed are also presented.
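The relationship between frame rate and shutter angle can be sketched with a little arithmetic. Shutter angle is 360 degrees times exposure time times frame rate. The `combine_frames` helper is a hypothetical illustration that assumes the source was captured with a 360-degree shutter and that output frames are formed by combining consecutive source frames; the abstract does not state how the decoder derives its output.

```python
from fractions import Fraction


def shutter_angle(exposure_time_s, frame_rate):
    """Shutter angle in degrees: 360 * exposure time (s) * frame rate (fps)."""
    return 360 * exposure_time_s * frame_rate


def combine_frames(source_fps, combine, period):
    """Output (frame rate, shutter angle) when `combine` consecutive frames
    out of every `period` source frames are merged into one output frame.

    Assumption: the source was shot with a 360-degree shutter.
    """
    if not 1 <= combine <= period:
        raise ValueError("combine must be between 1 and period")
    return Fraction(source_fps, period), Fraction(360 * combine, period)


# 120 fps source: merging 2 of every 3 frames yields 40 fps at 240 degrees.
rate, angle = combine_frames(120, 2, 3)
```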
Abstract: Novel methods and systems for encoding standard dynamic range video to improve the final quality after converting standard dynamic range video into enhanced dynamic range video are disclosed. A dual layer codec structure that amplifies certain codeword ranges can be used to send enhanced information to the decoder in order to achieve an enhanced (higher bit depth) image signal. The enhanced standard dynamic range signal can then be up-converted to enhanced dynamic range video without banding artifacts in the areas corresponding to those certain codeword ranges.
Abstract: A standard dynamic range (SDR) image and a reference backward reshaping mapping are received. The reference backward reshaping mapping comprises a reference luma backward reshaping mapping. A color preservation mapping function is used with inputs generated from the SDR image and the reference backward reshaping mapping to determine luminance increase for SDR luma histogram bins generated based on luma codewords in the SDR image. A modified backward reshaping mapping is generated and comprises a modified luma backward reshaping mapping generated from the reference backward reshaping function based on the luminance increase for the SDR luma histogram bins. The SDR image and the modified backward reshaping mapping are encoded into an SDR video signal.
Abstract: For each content-mapped frame of a scene, it is determined whether the content-mapped frame is susceptible to object fragmentation with respect to texture in a homogeneous region based on statistical values derived from the content-mapped image and a source image mapped into the content-mapped image. The homogeneous region is a region of consistent texture in the source image. Based on a count of content-mapped frames susceptible to object fragmentation in the homogeneous region, it is determined whether the scene is susceptible to object fragmentation in the homogeneous region. If so, an upper limit for mapped codewords for a prediction function for predicting codewords of a predicted image from the mapped codewords in the content-mapped image is adjusted. Mapped codewords above the upper limit are clipped to the upper limit.
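The scene-level decision and the clipping step can be sketched as follows. The per-frame susceptibility test is not shown, since the abstract does not give the statistics it uses; the `min_fraction` threshold and both function names are assumptions made for illustration.

```python
def scene_is_susceptible(frame_flags, min_fraction=0.5):
    """Decide at scene level from per-frame susceptibility flags.

    min_fraction is a hypothetical threshold; the abstract says only that
    the decision is based on a count of susceptible frames.
    """
    if not frame_flags:
        return False
    return sum(frame_flags) >= min_fraction * len(frame_flags)


def clip_codewords(codewords, upper_limit):
    """Clip mapped codewords above the upper limit to the upper limit."""
    return [min(c, upper_limit) for c in codewords]


frame_flags = [True, True, False, True]
mapped = [12, 200, 240, 255]
if scene_is_susceptible(frame_flags):
    mapped = clip_codewords(mapped, upper_limit=235)
```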
Abstract: A tone-mapping function that maps input images of a high dynamic range into reference tone-mapped images of a relatively narrow dynamic range is generated. A luma forward reshaping function is derived, based on first bit depths and second bit depths, for forward reshaping luma codewords of the input images into forward reshaped luma codewords of forward reshaped images approximating the reference tone-mapped images. A chroma forward reshaping mapping is derived for predicting chroma codewords of the forward reshaped images. Backward reshaping metadata that is to be used by recipient devices to generate a luma backward reshaping function and a chroma backward reshaping mapping is transmitted with the forward reshaped images to the recipient devices. Techniques for the joint derivation of forward luma and chroma reshaping functions are also presented.
Abstract: Noise levels in pre-reshaped codewords of a pre-reshaped bit depth in pre-reshaped images within a time window of a scene are calculated. Per-bin minimal bit depth values are computed for pre-reshaped codeword bins based on the calculated noise levels in the pre-reshaped codewords. Each per-bin minimal bit depth value corresponds to a minimal bit depth value for a respective pre-reshaped codeword bin. A specific codeword mapping function for a specific pre-reshaped image among the pre-reshaped images is generated based on the pre-reshaped bit depth, the per-bin minimal bit depth values, and a target bit depth smaller than the pre-reshaped bit depth. The specific codeword mapping function is applied to specific pre-reshaped codewords of the specific pre-reshaped image to generate specific target codewords of the target bit depth for a specific output image.
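One plausible way to turn per-bin minimal bit depths into a codeword mapping is sketched below. It is a simplification, not the patented construction: each input codeword is weighted by 2 raised to its bin's minimal bit depth, and the cumulative weight is rescaled to the target range. The function name and the equal-width bin layout are assumptions.

```python
def build_codeword_mapping(per_bin_bits, input_bits, target_bits):
    """Return a lookup table from input codewords to target codewords.

    Bins that need more bit depth receive a steeper slope, i.e. more
    target codewords per input codeword. Assumes equal-width bins.
    """
    n_in = 1 << input_bits
    n_bins = len(per_bin_bits)
    if n_in % n_bins:
        raise ValueError("bin count must divide the number of input codewords")
    bin_width = n_in // n_bins
    weights = [2.0 ** per_bin_bits[c // bin_width] for c in range(n_in)]
    # Normalize so the first codeword maps to 0 and the last to the maximum.
    denom = sum(weights) - weights[-1]
    max_out = (1 << target_bits) - 1
    lut, acc = [], 0.0
    for w in weights:
        lut.append(round(acc / denom * max_out))
        acc += w
    return lut


# 4-bit input, 3-bit target, two bins; the second bin needs more precision.
lut = build_codeword_mapping([2, 4], input_bits=4, target_bits=3)
```

In this toy case the lower bin is compressed into roughly one target codeword while the upper bin keeps most of the output range.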
Abstract: Real-time forward reshaping comprises selecting a statistical sliding window that is indexed to the current frame and includes a look-back frame and a look-ahead frame; determining whether those frames are part of the current scene; determining a noise parameter, a luma transfer function, and a luma forward reshaping function based on the luma transfer function and the noise parameter within the current scene; selecting a central tendency sliding window comprising the current frame and the look-back frame within the current scene; and determining a central tendency luma forward reshaping function. The chroma reshaping comprises analyzing statistics for the extended dynamic range (EDR) weights and EDR upper bounds; mapping these to standard dynamic range (SDR) weights and SDR upper bounds based on the central tendency luma forward reshaping function; determining a chroma content-dependent polynomial and a central tendency chroma forward reshaping polynomial; and generating chroma MMR coefficients.
Abstract: In some embodiments, an encoder device is disclosed to generate single-channel standard dynamic range/high dynamic range content predictors. The device receives a standard dynamic range image content and a representation of a high dynamic range image content. The device determines a first mapping function to map the standard dynamic range image content to the high dynamic range image content. The device generates a single-channel prediction metadata based on the first mapping function, such that a decoder device can subsequently render a predicted high dynamic range image content by applying the metadata to transform the standard dynamic range image content to the predicted high dynamic range image content.
Abstract: In a method to reconstruct a high dynamic range video signal, a decoder receives parameters in the input bitstream to generate a prediction function. Using the prediction function, it generates a first set of nodes for a first prediction lookup table, wherein each node is characterized by an input node value and an output node value. Then, it modifies the output node values of one or more of the first set of nodes to generate a second set of nodes for a second prediction lookup table, and generates output prediction values using the second lookup table. Low-complexity methods to modify the output node value of a current node in the first set of nodes based on computing modified slopes between the current node and nodes surrounding the current node are presented.
Abstract: Methods and systems for adaptive chroma reshaping are discussed. Given an input image, a luma-reshaped image is first generated based on its luma component. For each chroma component of the input image, the range of the pixel values in the luma-reshaped image is divided into bins, and for each bin a maximal scale factor is generated based on the chroma pixel values in the input image corresponding to the pixels of the luma-reshaped image in the bin. A forward reshaping function is generated based on a reference reshaping function and the maximal scale factors, and reshaped chroma pixel values for the chroma component are generated based on the forward reshaping function and the corresponding pixel values in the luma-reshaped image. Implementation options using look-up tables for mobile platforms with limited computational resources are also described.
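The per-bin scale limit can be sketched as below, under loudly stated assumptions: pixel values are normalized to [0, 1], chroma is centered at 0.5, and the maximal scale factor of a bin is the largest gain that keeps every chroma value in that bin inside the legal range. The abstract does not give the actual rule, and `reference_scale` is a hypothetical stand-in for the reference reshaping function.

```python
def max_scale_per_bin(luma, chroma, n_bins):
    """Largest chroma gain per luma bin that avoids clipping (assumed rule)."""
    limits = [float("inf")] * n_bins
    for y, c in zip(luma, chroma):
        b = min(int(y * n_bins), n_bins - 1)
        deviation = abs(c - 0.5)
        if deviation > 0:
            limits[b] = min(limits[b], 0.5 / deviation)
    return limits


def reshape_chroma(luma, chroma, reference_scale, n_bins):
    """Scale chroma about 0.5 by the reference gain, capped per luma bin."""
    limits = max_scale_per_bin(luma, chroma, n_bins)
    reshaped = []
    for y, c in zip(luma, chroma):
        b = min(int(y * n_bins), n_bins - 1)
        scale = min(reference_scale(y), limits[b])
        reshaped.append(0.5 + scale * (c - 0.5))
    return reshaped


luma = [0.1, 0.1, 0.9]       # luma-reshaped pixel values
chroma = [0.75, 0.4, 0.5]    # co-located chroma values from the input image
out = reshape_chroma(luma, chroma, reference_scale=lambda y: 3.0, n_bins=2)
```

On a constrained device, the per-bin cap and the reference gain could be precomputed into one table indexed by luma bin, which is in the spirit of the look-up-table option the abstract mentions.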
Abstract: Input minimal noise levels are computed over input codeword bins based on image content in input images of an input bit depth. The minimal noise levels are adjusted to generate approximated minimal noise levels of a higher bit depth. The approximated minimal noise levels are used to generate per-bin bit depths over the input codeword bins. The input codeword bins are classified into first codeword bins that have relatively high risks of banding artifacts and some other input codeword bins that have relatively low or zero risks of banding artifacts based on the per-bin bit depths. Portions of bit depths from the other input codeword bins are moved to the first input codeword bins to generate modified per-bin bit depths. A forward reshaping function constructed from the modified per-bin bit depths is used to reshape the input images into reshaped images used to generate output images.
Abstract: A standard dynamic range (SDR) image is received. Composer metadata of the first level through the N-th level is generated. Composer metadata of the j-th level is generated based on the composer metadata of the first level through (j−1)-th level. The composer metadata of the first level through the composer metadata of the j-th level is to be used for mapping the SDR image to the j-th target image specifically optimized for the j-th reference target display. The SDR image is encoded with the composer metadata of the first level through the k-th level in an output SDR video signal, where 1 ≤ k ≤ N. A display device renders a display image derived from a composed target image composed from the SDR image based on the composer metadata of the first level through the k-th level in the output SDR video signal.
Abstract: Methods and systems for chroma reshaping are applied to images or video frames. The method comprises receiving at least one image or video frame. The color space of the at least one image or video frame is partitioned in M1×M2×M3 non-overlapping bins. For each bin it is determined whether it is a valid bin, for which the at least one image or video frame has at least one pixel with a color value falling within said bin. For each chroma channel, a required number of codewords is calculated for representing two color values in said valid bin that have consecutive codewords for the respective chroma channel without a noticeable difference. At least one content-aware chroma forward reshaping function is generated based on the calculated required numbers of codewords and applied to the at least one image or video frame.
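The bin partitioning and valid-bin test described above can be sketched in a few lines. The function name, the (Y, Cb, Cr) tuple layout, and the equal-width bins are assumptions; the later steps (required codewords per chroma channel and the reshaping function itself) are not shown because the abstract does not give their formulas.

```python
def valid_bins(pixels, bins=(4, 4, 4), bit_depth=8):
    """Return the set of (i, j, k) bin indices hit by at least one pixel.

    pixels: iterable of (Y, Cb, Cr) integer triples.
    bins: (M1, M2, M3), the number of equal-width bins per channel.
    """
    n_codewords = 1 << bit_depth
    m1, m2, m3 = bins
    found = set()
    for y, cb, cr in pixels:
        found.add((y * m1 // n_codewords,
                   cb * m2 // n_codewords,
                   cr * m3 // n_codewords))
    return found


pixels = [(0, 0, 0), (1, 2, 3), (255, 255, 255), (130, 64, 200)]
occupied = valid_bins(pixels)
```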
Abstract: Coding syntaxes in compliance with same or different VDR specifications may be signaled by upstream coding devices such as VDR encoders to downstream coding devices such as VDR decoders in a common vehicle in the form of RPU data units. VDR coding operations and operational parameters may be specified as sequence level, frame level, or partition level syntax elements in a coding syntax. Syntax elements in a coding syntax may be coded directly in one or more current RPU data units under a current RPU ID, predicted from other partitions/segments/ranges previously sent with the same current RPU ID, or predicted from other frame level or sequence level syntax elements previously sent with a previous RPU ID. A downstream device may perform decoding operations on multi-layered input image data based on received coding syntaxes to construct VDR images.
Abstract: Given HDR and SDR video inputs representing the same content, segment-based methods are described to generate a backward-compatible reshaped SDR video which preserves the artistic intent or “look” of the inputs and satisfies other coding requirements. For each frame in a segment, reshaping functions are generated based on a support frames set determined based on a sliding window of frames that is adjusted based on scene cuts in the segment and which may include frames from both the current segment and neighboring segments. For luma reshaping, a mapping that preserves the cumulative density function of the luminance histogram values in the EDR and SDR inputs is combined with a minimum codeword allocation derived based on the EDR signal and the support frame set. For chroma reshaping, methods for segment-based forward and backward reshaping using multivariate, multi-regression models are also presented.
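The CDF-preserving part of the luma mapping can be illustrated with standard histogram CDF matching: each EDR codeword is mapped to the smallest SDR codeword whose cumulative count reaches the same fraction of pixels. This is a generic sketch with hypothetical function names; it omits the minimum codeword allocation and the sliding-window support frames that the abstract combines with it.

```python
from bisect import bisect_left
from itertools import accumulate


def cumulative_counts(values, n_codewords):
    """Cumulative histogram of integer codewords."""
    hist = [0] * n_codewords
    for v in values:
        hist[v] += 1
    return list(accumulate(hist))


def cdf_matching_lut(hdr_values, sdr_values, hdr_codewords, sdr_codewords):
    """Lookup table mapping each HDR codeword to an SDR codeword."""
    hdr_cum = cumulative_counts(hdr_values, hdr_codewords)
    sdr_cum = cumulative_counts(sdr_values, sdr_codewords)
    hdr_total, sdr_total = len(hdr_values), len(sdr_values)
    # Cross-multiply so the two CDFs are compared exactly, in integers.
    sdr_scaled = [c * hdr_total for c in sdr_cum]
    lut = []
    for h in hdr_cum:
        idx = bisect_left(sdr_scaled, h * sdr_total)
        lut.append(min(idx, sdr_codewords - 1))
    return lut


# Toy frame: four pixels, 4-bit HDR luma and 2-bit SDR luma.
lut = cdf_matching_lut([0, 4, 8, 12], [0, 1, 2, 3], 16, 4)
```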
Abstract: In a method to code and transmit scalable HDR video signals, HDR signals are processed and encoded in the IPT-PQ color space to generate a base layer at reduced spatial resolution and/or dynamic range, and an enhancement layer with a residual signal. A signal reshaping block before the base layer encoder allows for improved coding of HDR signals using a reduced bit depth. A decoder can use a BL decoder and backward reshaping to generate a decoded BL HDR signal at a reduced dynamic range and/or spatial resolution, or it can combine the decoded BL HDR signal and the EL stream to generate a decoded HDR signal at full dynamic range and full resolution.
Abstract: In a method to reconstruct a high dynamic range video signal, a decoder receives parameters in the input bitstream to generate a prediction function. Using the prediction function, it generates a first set of nodes for a first prediction lookup table, wherein each node is characterized by an input node value and an output node value. Then, it modifies the output node values of one or more of the first set of nodes to generate a second set of nodes for a second prediction lookup table, and generates output prediction values using the second lookup table. Low-complexity methods to modify the output node value of a current node in the first set of nodes based on computing modified slopes between the current node and nodes surrounding the current node are presented.
Abstract: Relatively low dynamic range images or image partitions are converted into relatively high dynamic range images or image partitions that comprise reconstructed pixel values having a higher dynamic range than pixel values of the relatively low dynamic range images. Information relating to reconstructed pixel values of the relatively high dynamic range images and pixel values of the relatively low dynamic range images is collected. Prediction parameters are derived from the collected information. A predicted image or image partition is predicted from a relatively low dynamic range image or image partition based on the prediction parameters and comprises predicted pixel values having the higher dynamic range than pixel values of the relatively low dynamic range image or image partition.
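As one concrete instance of deriving prediction parameters from collected pixel statistics, the sketch below fits a first-order predictor by ordinary least squares. The linear model and the function names are assumptions chosen for illustration; the abstract does not state the form of the predictor.

```python
def fit_linear_predictor(low_values, high_values):
    """Least-squares fit of high = intercept + slope * low."""
    n = len(low_values)
    sx = sum(low_values)
    sy = sum(high_values)
    sxx = sum(x * x for x in low_values)
    sxy = sum(x * y for x, y in zip(low_values, high_values))
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("low_values must contain at least two distinct values")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope


def predict(low_values, params):
    """Apply the fitted parameters to low dynamic range pixel values."""
    intercept, slope = params
    return [intercept + slope * x for x in low_values]


# Collected pairs: low dynamic range pixels and reconstructed high-range pixels.
params = fit_linear_predictor([0, 1, 2, 3], [1, 3, 5, 7])
predicted = predict([4, 5], params)
```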
Abstract: The present invention relates generally to images. More particularly, an embodiment of the present invention relates to the pixel group segmented quantization and de-quantization of the residual signal in layered coding of high dynamic range images. By assigning the pixels in the residual image to different pixel groups based on the pixel value of the corresponding pixel in the decoded base layer signal, and by applying pixel group quantizing functions to assigned pixels a more efficient coding can be achieved.
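A minimal sketch of group-segmented residual quantization follows. It assumes equal-width groups over the decoded base layer range and a uniform quantizer per group whose step is fitted to that group's residual range; the abstract specifies neither, and all names here are hypothetical.

```python
def quantize_by_group(residuals, bl_values, n_groups=4, levels=16, bl_bits=8):
    """Quantize each residual with the quantizer of its pixel group.

    A pixel's group is chosen by the co-located decoded base layer value.
    """
    groups = [(b * n_groups) >> bl_bits for b in bl_values]
    params = {}
    for g in set(groups):
        vals = [r for r, gi in zip(residuals, groups) if gi == g]
        lo, hi = min(vals), max(vals)
        step = (hi - lo) / (levels - 1) or 1.0  # avoid a zero step
        params[g] = (lo, step)
    quantized = [round((r - params[g][0]) / params[g][1])
                 for r, g in zip(residuals, groups)]
    return quantized, groups, params


def dequantize_by_group(quantized, groups, params):
    """Invert the per-group quantizers."""
    return [params[g][0] + q * params[g][1] for q, g in zip(quantized, groups)]


residuals = [-2.0, 2.0, -40.0, 40.0, 0.5]
bl_values = [10, 20, 200, 250, 30]   # decoded base layer pixels
q, groups, params = quantize_by_group(residuals, bl_values)
restored = dequantize_by_group(q, groups, params)
```

Because dark-region and bright-region residuals get separate step sizes in this toy case, the small residuals are not forced onto the coarse grid needed for the large ones.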
Filed: January 25, 2016
Date of Patent: January 7, 2020
Assignees: Dolby Laboratories Licensing Corporation, Dolby International AB
Inventors: Klaas Heinrich Schueuer, Uwe Michael Kowalik, Arion Neddens, Philipp Kraetzer, Guan-Ming Su