Optimized Filter Selection for Reference Picture Processing
Reference processing may be used in a video encoder or decoder to derive reference pictures that are better correlated with a source image to be encoded or decoded, which generally yields better coding efficiency. Methods for filter selection for a reference processing unit adapted for use in a video codec system are discussed. Specifically, methods for filter selection based on performing motion estimation and obtaining distortion/cost information by comparing reference pictures, either processed or non-processed, with the source image to be encoded are discussed.
This application claims priority to U.S. Provisional Patent Application No. 61/389,180 filed 1 Oct. 2010. The present application may be related to U.S. Provisional Application No. 61/170,995, filed on Apr. 20, 2009, U.S. Provisional Application No. 61/223,027, filed on Jul. 4, 2009, U.S. Provisional Application No. 61/300,427, filed on Feb. 1, 2010, all of which are incorporated herein by reference in their entirety.
TECHNOLOGY
The present invention relates generally to video processing. More specifically, an embodiment of the present invention relates to optimized reference processing filter selection methods.
BACKGROUND
Multi-layered video codecs provide, for instance, scalability in spatial and temporal resolution, bit-depth, color gamut, and quality. A number of multi-layered video codecs have been standardized by the video coding community. Among the standardized multi-layered video codecs are the Multiview Video Coding (MVC) extension and the Scalable Video Coding (SVC) extension of the MPEG-4 AVC/H.264 standard. An exemplary reference that introduces the H.264 standard can be found in “Advanced video coding for generic audiovisual services”, http://www.itu.int/rec/T-REC-H.264/e, dated March 2010, which is incorporated herein by reference in its entirety.
According to a first aspect of the disclosure, a method for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference picture processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers is provided, the method comprising: a) providing a reference picture and an enhancement layer source picture; b) filtering copies of the reference picture using at least one filter from the plurality of filters to obtain at least one filtered reference picture, wherein each filter is applied to a corresponding copy of the reference picture; c) performing disparity estimation based on the enhancement layer source picture and a full set or subset of the at least one filtered reference picture, wherein the disparity estimation is adapted to generate disparity information; and d) selecting the particular filter based on comparing the disparity information generated in step c), wherein the disparity information is a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters.
According to a second embodiment of the disclosure, a method for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference picture processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers is provided, the method comprising: a) providing a reference picture and an enhancement layer source picture; b) filtering copies of the reference picture using at least one filter from the plurality of filters to obtain at least one filtered reference picture, wherein each filter is applied to a corresponding copy of the reference picture; c) performing disparity estimation based on the enhancement layer source picture and a full set or subset of the at least one filtered reference picture, wherein the disparity estimation is adapted to generate disparity information; d) obtaining distortion information based on the disparity information; and e) selecting the particular filter based on comparing the distortion information generated in step d), wherein the disparity information is a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters.
According to a third embodiment of the disclosure, a method for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference picture processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers is provided, the method comprising: a) providing an enhancement layer source picture; b) performing disparity estimation based on the enhancement layer source picture and motion information from a particular layer, wherein the disparity estimation is adapted to generate disparity information; c) obtaining distortion information based on the enhancement layer source picture and the motion information from the particular layer; and d) selecting the particular filter based on comparing the distortion information acquired in step c), wherein the disparity information is a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters, and wherein the motion information from the particular layer is based on temporal reference pictures of the particular layer.
According to a fourth embodiment of the disclosure, a method for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference picture processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers is provided, the method comprising: a) providing a reference picture and an enhancement layer source picture; b) filtering copies of the reference picture using at least one filter from the plurality of filters to obtain at least one filtered reference picture, wherein each filter is applied to a corresponding copy of the reference picture; c) performing disparity estimation based on the enhancement layer source picture, a full set or subset of the at least one filtered reference picture, and motion information from a particular layer, wherein the disparity estimation is adapted to generate disparity information; d) obtaining distortion information based on the enhancement layer source picture, the full set or subset of the at least one filtered reference picture, and motion information from the particular layer; and e) selecting the particular filter based on comparing the distortion information acquired in step d), wherein the disparity information is a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters, and wherein the motion information is based on temporal reference pictures of the particular layer.
According to a fifth embodiment of the disclosure, a method for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference picture processing unit in a video coding system, the coding system comprising a layer is provided, the method comprising: a) providing a reference picture and a source picture, wherein both the reference picture and the source picture are from the same layer; b) filtering copies of the reference picture using at least one filter from the plurality of filters to obtain at least one filtered reference picture, wherein each filter is applied to a corresponding copy of the reference picture; c) performing disparity estimation based on the source picture and a full set or subset of the at least one filtered reference picture, wherein the disparity estimation is adapted to generate disparity information; d) obtaining distortion information based on the disparity information; and e) selecting the particular filter based on comparing the distortion information generated in step d), wherein the disparity information is a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters.
According to a sixth embodiment of the disclosure, a filter selector adapted for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers is provided, the filter selector comprising: a full set or subset of the plurality of filters for processing a reference picture or a region of the reference picture to obtain one or more processed reference pictures; and a disparity estimator adapted to generate disparity information based on an enhancement layer source picture and at least one processed reference picture from the one or more processed reference pictures, wherein the disparity information is a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters, wherein the particular filter is selectable based on the disparity information.
According to a seventh embodiment of the disclosure, a filter selector adapted for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers is provided, the filter selector comprising: a full set or subset of the plurality of filters for processing a reference picture or a region of the reference picture to obtain one or more processed reference pictures; and a disparity estimator adapted to generate disparity information based on an enhancement layer source picture and at least one processed reference picture from the one or more processed reference pictures, wherein the disparity information is a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters; and a distortion information computation module adapted to generate distortion information based on the enhancement layer source picture and the at least one processed reference picture from the plurality of processed reference pictures, wherein the particular filter is selectable based on the distortion information.
According to an eighth embodiment of the disclosure, a filter selector adapted for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers is provided, the filter selector comprising: a disparity estimator adapted to generate disparity information based on an enhancement layer source picture and motion information from a particular layer, wherein the disparity information is a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters; and a distortion information computation module adapted to generate distortion information based on the enhancement layer source picture and the motion information from the particular layer, wherein the particular filter is selected based on the distortion information.
In multi-layered video codecs such as MVC, several encoding/decoding processes, depending on number of views, may be used in order to encode/decode a base layer and several enhancement layer image sequences, where each enhancement layer usually corresponds to a different view. For instance, several independent and dependent encoding/decoding processes may be used in combination in order to encode/decode different views. The independently coded views are typically called the base layer views while the dependently coded views are called the enhancement layer views.
It should be noted that the video codec system implementation of
Similar to
Additional processing provided by the RPU (230) may, by way of example and not of limitation, involve linear/non-linear filtering, motion transformation, motion compensation, illumination compensation, scaling, inverse and forward tone mapping, color format conversion, and gamma correction. The processing may be applied at a region-level on a reference picture, thereby enabling processing with methods of different characteristics to be used for different portions of the reference picture. Processing parameters can be derived in the encoder (202, 212), taking into account final reconstructed quality of an output video as well as available bandwidth and computing power, and then signaled to the decoder (204, 214). It should be noted that the term “processing”, as used in this disclosure, is equivalent to the term “filtering”. Consequently, processing on a reference picture may be performed by applying filters to the reference picture.
An exemplary reference that introduces enabling filters with different characteristics to be used for different portions of a reference picture is U.S. Provisional Application No. 61/170,995, entitled “Directed Interpolation and Post-Processing”, filed on Apr. 20, 2009. Two exemplary references that introduce methods for adaptively deriving such filters based on content characteristics are U.S. Provisional Application No. 61/170,995, entitled “Directed Interpolation and Post-Processing”, filed on Apr. 20, 2009, and U.S. Provisional Application No. 61/300,427, entitled “Adaptive Interpolation Filters for Multi-layered Video Delivery”, filed on Feb. 1, 2010. Both these references are incorporated herein by reference in their entirety.
An application of the reference processing based architecture, such as shown in
Many embodiments of the present disclosure involve derivation of reference processing filters, generally for use within a reference processing unit of an encoder, that conform to the architecture presented in
In embodiments involving application in stereoscopic video, the reference processing filters can be derived as shown in
In one embodiment, the filter selection may be obtained using a Lagrangian optimization technique. By way of example and not of limitation, a Lagrangian optimization technique shown below in Equation (1) can be used to derive a filter based on cost-distortion criteria:
f^{*} = \arg\min_{f} \left( D_f + \lambda C_f \right) \qquad (1)
where f denotes a filter index identifying the particular filter under consideration, D_f denotes distortion in the reference processed image when compared to the source enhancement layer image, C_f denotes filter cost incurred due to use of filter f, and \lambda denotes the Lagrangian multiplier (lambda parameter) that weights the cost against the distortion.
The distortion D_f can be computed using a variety of techniques including sum of absolute or squared error between pixels, mean square error, PSNR, weighted sum of transformed square errors, sum of transformed absolute errors, SSIM, multiscale SSIM, and other perceptual image/video quality metrics. The cost C_f is generally a function of the number of bits used to signal the filter parameters. However, the cost may also be a function of other factors. For instance, in a power/complexity constrained application, the cost may also consider computational complexity and power requirements of applying the filter. In the case that the cost includes multiple factors including but not limited to number of bits, computational complexity, and power requirements, multiple lambda parameters may be used in order to separately tune the influence of each factor on the overall cost.
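The Lagrangian selection described above can be illustrated with a minimal sketch. The class name FilterCandidate, the function select_filter, the candidate filter names, and all numeric values are hypothetical and chosen only for illustration; the sketch assumes the distortion and the cost factors of each candidate filter have already been measured, and it uses one lambda per cost factor as suggested above.

```python
from dataclasses import dataclass


@dataclass
class FilterCandidate:
    """Measured statistics for one candidate RPU filter (hypothetical container)."""
    name: str
    distortion: float        # distortion against the enhancement layer source
    bits: float              # bits needed to signal the filter parameters
    complexity: float = 0.0  # optional complexity/power figure


def select_filter(candidates, lam_bits, lam_complexity=0.0):
    """Return the candidate minimizing distortion plus lambda-weighted cost terms."""
    def lagrangian(c):
        return c.distortion + lam_bits * c.bits + lam_complexity * c.complexity
    return min(candidates, key=lagrangian)


# Illustrative numbers only.
candidates = [
    FilterCandidate("copy", distortion=120.0, bits=2.0, complexity=0.0),
    FilterCandidate("bilinear", distortion=100.0, bits=8.0, complexity=1.0),
    FilterCandidate("adaptive", distortion=90.0, bits=40.0, complexity=10.0),
]
best = select_filter(candidates, lam_bits=1.0)
```

With these illustrative values, a large bit-cost lambda favors the cheaply signaled filter, while a small lambda lets the lowest-distortion filter win; adding a complexity lambda shifts the choice again.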
In the embodiment shown in
Specifically, the motion compensation (442), motion estimation (444), and rate distortion optimization (446) processes of the enhancement layer encoder are applied based on reference pictures from the enhancement layer reference picture buffer (440). As such, the processes (442, 444, 446) are applied using processed images from the base layer, or previous enhancement layers, as well as the temporal reference pictures from the enhancement layer reference picture buffer (440). In this embodiment, coding efficiency of the RPU filters (not shown) in the enhancement layer does not take into account the processes of motion compensation (442), motion estimation (444), and rate distortion optimization (446).
In this embodiment, the motion estimation (525) and mode decision, which occurs along with the rate distortion optimization (530), of the enhancement layer encoder will use the filtered base layer picture (515) as a potential reference, in addition to other temporal reference pictures, and generate distortion and cost estimates of using a particular RPU filter (510). Factors that determine whether a potential reference is actually used as a reference picture may depend on, for instance, distortion, cost, and computational complexity that result from utilization of the potential reference.
The cost estimates may include, for instance, motion costs such as costs for signaling of motion vectors that refer to each reference picture as well as costs for encoding prediction residuals in addition to filter costs. Filter costs may include cost of signaling filter parameters such as filter types, filter coefficients, and so forth.
In one embodiment, the distortion may be computed using final reconstructed pixel values after motion compensation and coding of enhancement layer source images. After performing the motion estimation (525) and rate-distortion optimization (530) processes for each of the possible filters (510), the rate-distortion optimization (530) outputs rate-distortion cost to a filter selector (550), which selects one or more filters based on the rate-distortion cost. In some embodiments, one filter can be selected for each region of the base layer reference picture (500). Additionally, coding modes for an enhancement layer picture can be chosen, where the coding modes are used by the enhancement layer encoder to code blocks, macroblocks, and/or regions into the enhancement layer picture.
Subsequent to the rate-distortion optimization process (530), the filtered base layer picture (515) may be removed (535) from the enhancement layer reference buffer (520). In certain embodiments, the same filtered base layer picture (515) may be subsequently re-used in a case of multi-pass encoding or re-evaluated using different criteria.
Note that
Additionally, in the embodiment shown in
According to other embodiments of the present disclosure, the filter selection process can perform a single pass encoding and base the filter selection decision on the distortion between the filtered images (500) and the source images (545) and the cost of motion estimation and mode decision. Additionally, motion estimation may be performed on a subsampled image or a smaller region within the image to reduce computational complexity.
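As a rough illustration of performing motion estimation on a subsampled image, the sketch below runs an exhaustive block search with a sum of absolute differences (SAD) criterion, which can be applied either to full-resolution pictures or to decimated copies. The function names, the plain decimation without low-pass filtering, and the choice of SAD are assumptions made for this example; pictures are represented as lists of rows of integer samples.

```python
def subsample(picture, factor=2):
    """Keep every `factor`-th sample in both directions (plain decimation)."""
    return [row[::factor] for row in picture[::factor]]


def block_sad(src, ref, bx, by, dx, dy, size):
    """SAD between the source block at (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(src[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(src, ref, bx, by, size, search_range):
    """Exhaustive search; returns (best_sad, (dx, dy)) over displacements inside the picture."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = block_sad(src, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

Searching a picture decimated by two with half the block size and a smaller search range visits far fewer positions; a displacement found there corresponds approximately to twice that displacement at full resolution.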
It should be noted that
It should be noted that motion estimation (525) and rate distortion optimization (530) is used in this particular embodiment shown in
In the embodiment shown in
Note that in generating the cost associated with the motion estimation process (620), such as signaling of motion vectors, consideration can also be given to the possibility that motion vectors of a particular block may be predicted from the motion vectors of neighboring blocks. In one embodiment, block size used for the motion estimation process (620) can be chosen to be constant. In another embodiment, the block size can be chosen based on image characteristics such as edge features, texture, color, shape, and size of elements in the image under consideration. In yet another embodiment, in addition to the motion estimation process (620), a mode decision process (not shown) may also be performed in order to determine size and type of a block, illumination parameters, transformation parameters, quantization parameters, and so forth for each block.
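One way to account for motion vector prediction from neighboring blocks when estimating motion cost is sketched below. It assumes an H.264-style component-wise median predictor over three neighbors and approximates the signaling cost by the length of a signed Exp-Golomb code for each component of the motion vector difference; both choices, and the function names, are assumptions of this example rather than requirements of the disclosure.

```python
def se_golomb_bits(value):
    """Length in bits of the signed Exp-Golomb code for an integer value."""
    code_num = 2 * abs(value) - (1 if value > 0 else 0)
    # An Exp-Golomb codeword has 2 * floor(log2(code_num + 1)) + 1 bits.
    return 2 * (code_num + 1).bit_length() - 1


def median_predictor(neighbors):
    """Component-wise median of three neighboring motion vectors."""
    xs = sorted(mv[0] for mv in neighbors)
    ys = sorted(mv[1] for mv in neighbors)
    return (xs[1], ys[1])


def motion_vector_bits(mv, neighbors):
    """Approximate bits to signal mv as a difference from its predictor."""
    px, py = median_predictor(neighbors)
    return se_golomb_bits(mv[0] - px) + se_golomb_bits(mv[1] - py)
```

A motion vector equal to its predictor costs two bits under this model, so smooth motion fields lower the motion cost attributed to a candidate reference.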
Also, in one embodiment, the motion estimation may involve the luma and/or chroma components. In that case, the distortion may be computed based on distortion of one or more components or may include a combination of the motion estimated distortion of the one or more components and distortion assuming zero motion for other components (whether it be a luma component or other chroma components). In another embodiment, the motion estimation may be performed using only a subset of the components. The motion vectors derived from the subset of the components may be used to determine the motion estimated distortion of the other components. The distortion obtained from the different components may also be weighted differently for each component when obtaining the combined distortion over all the components.
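The weighted combination of per-component distortions can be sketched as follows. The example assumes that a motion compensated prediction is already available for every component (for instance, obtained by applying luma-derived motion vectors to the chroma planes), uses SAD as the distortion measure, and takes the component names and weights as hypothetical inputs.

```python
def plane_sad(a, b):
    """Sum of absolute differences between two equally sized sample planes."""
    return sum(abs(x - y)
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def combined_distortion(source, prediction, weights):
    """Weighted sum of per-component distortions; keys are component names."""
    return sum(weight * plane_sad(source[name], prediction[name])
               for name, weight in weights.items())


# Illustrative 4:2:0-like toy data: one 2x2 luma plane, two 1x1 chroma planes.
source = {"Y": [[10, 20], [30, 40]], "Cb": [[5]], "Cr": [[7]]}
prediction = {"Y": [[12, 20], [30, 37]], "Cb": [[6]], "Cr": [[7]]}
total = combined_distortion(source, prediction, {"Y": 1.0, "Cb": 0.5, "Cr": 0.5})
```

Dropping a component from the weights dictionary corresponds to computing the distortion on only a subset of the components.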
In order to simplify the motion estimation (720) and mode decision processes, a smaller set of temporal references (730) from the enhancement layer picture buffer (702) than the ones actually available to the encoder may be selected in a temporal region selector (740), which may be designed to ignore less important temporal references (730). In one embodiment, the importance of the temporal references (730) may be determined based on a temporal distance from a current enhancement layer picture (750) to be encoded.
For example, assuming that the current picture (750) is at time t, only available references from time t−1 and t+1 may be used. In another embodiment, the importance of the temporal references (730) may be determined based on a correlation metric that measures the correlation between each of the temporal references (730) in the reference picture buffer (702) and the current picture (750) to be encoded; only the M most correlated temporal references are then used in the RPU filter decision process, where M is an arbitrary number. Note that in the above embodiments, a bi-predictive or multi-hypothesis search may also be performed between the base layer filtered references (712) and the temporal references (730) in order to generate a more accurate prediction of a subsequent enhancement layer picture.
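Both ways of pruning the temporal references can be sketched briefly. The example below is an illustration under stated assumptions: references are identified by hypothetical labels or time stamps, pictures are flattened to lists of samples, and the Pearson correlation coefficient stands in for the unspecified correlation metric.

```python
def select_by_distance(reference_times, current_time, max_distance=1):
    """Keep references whose temporal distance to the current picture is small."""
    return [t for t in reference_times
            if t != current_time and abs(t - current_time) <= max_distance]


def pearson(a, b):
    """Pearson correlation of two equally long sample lists (0.0 if a list is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def select_most_correlated(references, current, m):
    """Return the labels of the m references most correlated with the current picture."""
    ranked = sorted(references, key=lambda label: pearson(references[label], current),
                    reverse=True)
    return ranked[:m]
```

In practice the correlation could be computed on subsampled pictures to keep the selection itself inexpensive.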
If a known relationship or transformation such as, for example, a change in spatial or temporal resolution, rotation, translation, etc., exists between the base and enhancement layer images, then the base layer information (880) such as motion vectors and mode decisions can be similarly transformed prior to using the motion vectors and mode decisions from the base layer to generate the estimates for the enhancement layer.
In another embodiment, the relation or transformation, applied on motion vectors of the base layer (or inter-layer) in order to derive the estimated motion vectors for the enhancement layer, may be determined based on the motion vectors derived in previously coded pictures or regions for the enhancement layer. Differences between the motion vectors of each layer in previously coded pictures can also be used as a guide to determine a confidence level in the motion estimates. Similarly, the distortion estimates obtained by re-using the motion vectors may be weighted by a confidence level of the distortion estimates. Given the temporal distortion and cost estimates, as well as the motion estimation results from filtered base layer references (812) (or inter-layer references), a reference selection (860) and mode decision (890) process can be performed that takes into account both the temporal references (830) and the filtered base layer references (812) (or inter-layer references) without necessarily increasing computational complexity of the filter selection process.
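The re-use of base layer motion vectors and the derivation of a confidence level can be sketched as follows. The sketch assumes that the known relationship between layers is a per-axis scaling plus a translation, and it maps the mean absolute difference between transformed base layer vectors and previously coded enhancement layer vectors to a confidence value in (0, 1]; that particular mapping and the function names are assumptions of this example.

```python
def transform_motion_vector(mv, scale=(1.0, 1.0), offset=(0.0, 0.0)):
    """Map a base layer motion vector into the enhancement layer coordinate system."""
    return (mv[0] * scale[0] + offset[0], mv[1] * scale[1] + offset[1])


def motion_confidence(base_mvs, enh_mvs, scale=(1.0, 1.0), offset=(0.0, 0.0)):
    """Confidence in re-used motion, from co-located vectors of previously coded pictures.

    Returns 1.0 when the transformed base layer vectors match the enhancement
    layer vectors exactly and decreases toward 0.0 as they diverge.
    """
    if not base_mvs:
        return 0.0
    total = 0.0
    for base_mv, enh_mv in zip(base_mvs, enh_mvs):
        predicted = transform_motion_vector(base_mv, scale, offset)
        total += abs(predicted[0] - enh_mv[0]) + abs(predicted[1] - enh_mv[1])
    mean_difference = total / len(base_mvs)
    return 1.0 / (1.0 + mean_difference)
```

For a base layer at half the spatial resolution of the enhancement layer, a scale of (2.0, 2.0) would be the natural choice in this sketch.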
With continued reference to
In another embodiment, a motion search range may also be adapted based on filter type. Computational complexity of motion estimation can also be reduced by providing more accurate motion vector predictors, which may be derived from motion vectors, around which a motion search is performed. The motion vectors can be extracted from the base layer, inter-layers, or temporally from the enhancement layer itself. Spatial (intra) prediction may also be used. For example, with reference back to the embodiment of
In the case of the base layer filtered reference pictures (712, 812), the motion vector predictors can also be derived based on filter type used to generate the reference picture (712, 812). For example, one possible “filter” that can be used in the RPU is that of simply copying a base layer reference picture without any additional filtering. In this case, assuming there is a known phase offset between base and enhancement layer images, a motion vector predictor can be used that accounts for the phase offset. Additional reduction in computational complexity can also be obtained by reusing information from previously coded pictures, where the previously coded pictures could include temporal pictures and pictures from various layers. A number of techniques for reusing information from previously coded pictures for filter selection can be applied in this case.
According to some embodiments as shown in
For example, encoding decisions generally made by the RPU may include determining number and shape of the regions over which a base layer reference picture (870) is divided prior to filtering. Since the cost of signaling the filter parameters increases with the number of regions, region sizes in the RPU are generally assigned to be larger than the block sizes used for motion estimation/compensation.
In one embodiment, by using the map of the filter distortion and cost as well as the temporal distortion and cost, the filter selection process can choose the region size to be equal to the size of a smallest contiguous set of blocks for which the selected filter remains the same.
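One possible reading of the region derivation above is a connected-component grouping of blocks that share the same selected filter. The sketch below assumes a per-block map of selected filter indices and 4-connectivity between blocks; both the representation and the function name are hypothetical.

```python
from collections import deque


def group_blocks_by_filter(filter_map):
    """Group 4-connected blocks sharing a filter index.

    filter_map is a 2D list of per-block filter indices. Returns a list of
    (filter_index, [(row, col), ...]) tuples, one per contiguous region.
    """
    rows, cols = len(filter_map), len(filter_map[0])
    visited = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if visited[r][c]:
                continue
            filter_index = filter_map[r][c]
            blocks = []
            queue = deque([(r, c)])
            visited[r][c] = True
            while queue:
                y, x = queue.popleft()
                blocks.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not visited[ny][nx]
                            and filter_map[ny][nx] == filter_index):
                        visited[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((filter_index, blocks))
    return regions
```

The size of the smallest returned region gives one candidate for the region size; a practical implementation would likely also constrain regions to shapes that can be signaled compactly.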
In another embodiment, different region sizes and shapes may be tested in the filter selector (890) for both filter distortion and filter cost, and the best performing shape and size for a given distortion and/or cost criteria may be chosen. Additionally, edge detection may be performed on the base layer reference picture (770, 870) in order to determine the number and shape of the regions that are likely to be used by the enhancement layer encoder.
In another embodiment, the RPU may determine that use of the temporal references (730, 830) provides a more desirable trade-off between distortion and cost for an entire picture or a slice of the picture. Alternatively, the RPU may determine that use of the base layer filtered reference (770, 870) provides the more desirable trade-off. In either case, the RPU may signal to the enhancement layer encoder to modify ordering of the reference pictures (730, 770, 830, 870) within reference picture buffers (700, 702, 800, 802) such that more important references, given by the references (730, 770, 830, 870) that provide the more desirable distortion/cost trade-off, can be signaled using fewer bits than less important references. In other words, the more important references, which are used more frequently for prediction purposes than the less important references, are generally encoded such that the more important references take fewer bits to signal than the less important references. In another embodiment, if only the temporal references (830) are deemed to be sufficient, then the RPU can be disabled for the current picture to be encoded, saving both computational time and memory.
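The reordering idea can be illustrated with a short sketch in which references are ranked by how often they were chosen for prediction, so that the most used reference receives the lowest (cheapest to signal) index. The usage counts, the reference labels, and the use of frequency as the importance measure are assumptions made for this example.

```python
def reorder_references(usage_counts):
    """Order reference labels so that more frequently used references come first.

    usage_counts maps a reference label to the number of blocks that selected
    it. Ties are broken by label so that the ordering is deterministic.
    """
    return sorted(usage_counts, key=lambda label: (-usage_counts[label], label))


# Illustrative counts only.
usage = {"temporal_t-1": 120, "rpu_filtered": 300, "temporal_t+1": 45}
order = reorder_references(usage)
```

The position of each label in the returned list would then serve as its reference index in the reordered buffer.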
In conclusion, embodiments of the present disclosure provide a set of schemes for accomplishing filter selection of a filter for use in reference processing units in a multi-layered codec. Specifically, the selected filter may be used to provide a filtered previously coded layer picture, which is used as a reference picture for an enhancement layer. It should be noted, however, that similar principles can be applied in the case of a single layered codec.
For instance, the single layered codec could use temporal references that may be used for prediction in the single layer. Prior to the prediction, the temporal references may be processed utilizing global or regional motion compensation, prefiltering, and so forth.
The methods and systems described in the present disclosure may be implemented in hardware, software, firmware, or combination thereof. Features described as blocks, modules, or components may be implemented together (e.g., in a logic device such as an integrated logic device) or separately (e.g., as separate connected logic devices). The software portion of the methods of the present disclosure may comprise a computer-readable medium which comprises instructions that, when executed, perform, at least in part, the described methods. The computer-readable medium may comprise, for example, a random access memory (RAM) and/or a read-only memory (ROM). The instructions may be executed by a processor (e.g., a digital signal processor (DSP), an application specific integrated circuit (ASIC), or a field programmable logic array (FPGA)).
As described herein, an embodiment of the present invention may thus relate to one or more of the example embodiments that are enumerated in Table 1, below. Accordingly, the invention may be embodied in any of the forms described herein, including, but not limited to, the following Enumerated Example Embodiments (EEEs), which describe structure, features, and functionality of some portions of the present invention.
Furthermore, all patents and publications mentioned in the specification may be indicative of the levels of skill of those skilled in the art to which the disclosure pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been incorporated by reference in its entirety individually.
The examples set forth above are provided to give those of ordinary skill in the art a complete disclosure and description of how to make and use the embodiments of the reference processing filter selection methods of the disclosure, and are not intended to limit the scope of what the inventors regard as their disclosure. Modifications of the above-described modes for carrying out the disclosure may be used by persons of skill in the video art, and are intended to be within the scope of the following claims.
It is to be understood that the disclosure is not limited to particular methods or systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the content clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure pertains.
A number of embodiments of the disclosure have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the present disclosure. Accordingly, other embodiments are within the scope of the following claims.
Claims
1-31. (canceled)
32. A method for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference picture processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers, the method comprising:
- a) providing a reference picture and an enhancement layer source picture;
- b) filtering copies of the reference picture using at least one filter from the plurality of filters to obtain at least one filtered reference picture, wherein each filter is applied to a corresponding copy of the reference picture;
- c) performing disparity estimation based on the enhancement layer source picture and a full set or subset of the at least one filtered reference picture, wherein the disparity estimation is adapted to generate disparity information, the disparity information being further based on temporal reference pictures from an enhancement layer picture buffer;
- d) obtaining distortion information based on the disparity information;
- e) selecting the particular filter based on comparing the distortion information generated in the distortion information obtaining step d); and
- f) for each filtered reference picture on which disparity estimation has been performed in accordance with step c), selecting between said reference picture and a corresponding temporal reference picture from the enhancement layer picture buffer,
wherein the disparity information comprises a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters.
33. The method as recited in claim 32, wherein the reference picture is encoded after the filtering step.
34. The method as recited in claim 32, wherein the reference picture comprises a base layer reference picture or an inter-layer reference picture, wherein an inter-layer comprises a layer from among the one or more enhancement layers.
35. The method as recited in claim 32, wherein the reference picture comprises a spatial reference picture or a temporal reference picture from the enhancement layer.
36. The method as recited in claim 32, wherein the step of providing further comprises processing the enhancement layer source picture, wherein the processing removes noise in the enhancement layer source picture.
37. The method as recited in claim 32, wherein the step of providing further comprises processing the enhancement layer source picture, wherein the processing involves applying at least one of filtering, motion transformation, motion compensation, illumination compensation, scaling, inverse and forward tone mapping, color format conversion, and gamma correction.
38. The method as recited in claim 32, wherein at least one of the filtered reference pictures comprises the reference picture.
39. The method as recited in claim 32, wherein the step of performing disparity estimation further comprises obtaining cost information for the at least one filter from among the plurality of filters used in the step of filtering, and wherein the step of selecting is further based on the cost information; and wherein the cost information comprises a function of the disparity information, number of bits to be used in signaling filter parameters of each filter, number of bits to be used in signaling the motion vectors corresponding to each filtered reference picture, number of bits to be used in signaling the prediction distortion corresponding to each filtered reference picture, computational complexity in applying each filter, and power consumption of each filter.
40. The method as recited in claim 32, further comprising, between the step of performing and the step of selecting, a step of performing disparity compensation on the enhancement layer source picture to obtain a final reconstructed picture, wherein the disparity compensation is based on the step of disparity estimation.
41. The method as recited in claim 32, wherein each of the steps is performed on regions of the reference picture.
42. The method as recited in claim 41, wherein the reference picture is decomposed using a plurality of region sizes and region shapes to obtain a plurality of reconstructed reference pictures, and wherein each of the steps is performed on the plurality of reconstructed reference pictures.
43. The method as recited in claim 41, wherein the region sizes and the region shapes are determined based on performing edge detection on the reference picture.
44. The method as recited in claim 32, wherein the disparity estimation comprises block-based motion estimation.
45. The method as recited in claim 44, wherein motion vectors corresponding to a particular block are adapted to be predicted by motion vectors of blocks neighboring the particular block.
46. The method as recited in claim 44, wherein block size is based on image characteristics of the reference picture, wherein the image characteristics comprise a function of at least one of a luma component, a chroma component, and edge characteristics of the reference picture and texture, color, shape, and size of elements in the reference picture.
47. The method as recited in claim 44, wherein the step of performing disparity estimation or the step of obtaining distortion information also determines at least one of block size and block shape.
48. The method as recited in claim 32, wherein the disparity estimation comprises integer pixel motion estimation.
49. The method as recited in claim 32, wherein the disparity estimation comprises sub-pixel accurate motion estimation.
50. The method as recited in claim 32, wherein the disparity estimation is further based on one or more luma and chroma components.
51. The method as recited in claim 50, wherein the disparity estimation is further based on a subset of luma and chroma components, and wherein distortion of remaining luma and chroma components is computed based on the disparity information obtained from the subset of luma and chroma components.
52. The method as recited in claim 51, further comprising a plurality of weighting factors, wherein one weighting factor is applied to each luma component and each chroma component.
53. The method as recited in claim 32, wherein the step of providing comprises providing a plurality of reference pictures and the step of filtering is performed on a full set or subset of the plurality of reference pictures to obtain at least one filtered reference picture, and wherein each filter is applied to each reference picture of the full set or subset of the plurality of reference pictures.
54. A system for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference picture processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers, the system comprising:
- a) means for providing a reference picture and an enhancement layer source picture;
- b) means for filtering copies of the reference picture using at least one filter from the plurality of filters to obtain at least one filtered reference picture, wherein each filter is applied to a corresponding copy of the reference picture;
- c) means for performing disparity estimation based on the enhancement layer source picture and a full set or subset of the at least one filtered reference picture, wherein the disparity estimation is adapted to generate disparity information, the disparity information being further based on temporal reference pictures from an enhancement layer picture buffer;
- d) means for obtaining distortion information based on the disparity information;
- e) means for selecting the particular filter based on comparing the distortion information generated with the distortion information obtaining means d); and
- f) means for selecting, for each filtered reference picture on which disparity estimation has been performed in accordance with step c), between said reference picture and a corresponding temporal reference picture from the enhancement layer picture buffer, wherein the disparity information comprises a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters.
55. A computer readable storage medium, comprising instructions that are tangibly encoded therewith, which when executed with a processor, cause the processor to cause, control, program or configure, at least in part, a process for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference picture processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers, the process comprising:
- a) providing a reference picture and an enhancement layer source picture;
- b) filtering copies of the reference picture using at least one filter from the plurality of filters to obtain at least one filtered reference picture, wherein each filter is applied to a corresponding copy of the reference picture;
- c) performing disparity estimation based on the enhancement layer source picture and a full set or subset of the at least one filtered reference picture, wherein the disparity estimation is adapted to generate disparity information, the disparity information being further based on temporal reference pictures from an enhancement layer picture buffer;
- d) obtaining distortion information based on the disparity information;
- e) selecting the particular filter based on comparing the distortion information generated in the distortion information obtaining step d); and
- f) for each filtered reference picture on which disparity estimation has been performed in accordance with step c), selecting between said reference picture and a corresponding temporal reference picture from the enhancement layer picture buffer,
wherein the disparity information comprises a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters.
56. A method for selecting a particular filter from among a plurality of filters, the particular filter adapted for use in a reference picture processing unit in a multi-layered video coding system, the multi-layered video coding system comprising a base layer and one or more enhancement layers, the method comprising:
- a) providing a reference picture and an enhancement layer source picture;
- b) filtering copies of the reference picture using at least one filter from the plurality of filters to obtain at least one filtered reference picture, wherein each filter is applied to a corresponding copy of the reference picture;
- c) performing disparity estimation based on the enhancement layer source picture, a full set or subset of the at least one filtered reference picture, and motion information from a particular layer, wherein the disparity estimation is adapted to generate disparity information, the disparity information being further based on temporal reference pictures from an enhancement layer picture buffer;
- d) obtaining distortion information based on the enhancement layer source picture, the full set or subset of the at least one filtered reference picture, and motion information from the particular layer;
- e) selecting the particular filter based on comparing the distortion information acquired in the distortion information obtaining step d); and
- f) for each filtered reference picture on which disparity estimation has been performed in accordance with step c), selecting between said reference picture and a corresponding temporal reference picture from the enhancement layer picture buffer,
wherein the disparity information comprises a function of at least one of motion vectors, prediction distortion, intra prediction information, illumination parameters, luma components, chroma components, and tone mapping parameters, and wherein the motion information is based on temporal reference pictures of the particular layer.
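By way of illustration only, the filtering, disparity estimation, distortion and selection steps a) through e) recited above, together with a cost term of the kind recited in claim 39, might be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of the disclosure: pictures are modeled as small 2-D lists of luma samples, distortion is a sum of absolute differences (SAD), disparity estimation is an exhaustive integer-pixel block search, and the cost is distortion plus a Lagrangian-style weight `lam` times a per-filter signaling bit count. All names (`identity`, `box_blur_h`, `estimate_disparity`, `select_filter`, `filter_bits`, `lam`) are hypothetical. Step f), the choice between a filtered reference and a temporal reference from the enhancement layer picture buffer, is omitted.

```python
from typing import Callable, Dict, List, Tuple

Picture = List[List[int]]  # assumption: one luma plane as rows of integer samples


def identity(pic: Picture) -> Picture:
    """Pass-through 'filter': returns a copy of the unprocessed reference picture."""
    return [row[:] for row in pic]


def box_blur_h(pic: Picture) -> Picture:
    """Hypothetical 3-tap horizontal averaging filter with edge clamping."""
    w = len(pic[0])
    return [[(row[max(x - 1, 0)] + row[x] + row[min(x + 1, w - 1)]) // 3
             for x in range(w)] for row in pic]


def block_sad(src: Picture, ref: Picture, bx: int, by: int,
              dx: int, dy: int, bs: int) -> int:
    """SAD between a source block and the reference block displaced by (dx, dy)."""
    return sum(abs(src[y][x] - ref[y + dy][x + dx])
               for y in range(by, by + bs) for x in range(bx, bx + bs))


def estimate_disparity(src: Picture, ref: Picture, bs: int = 4,
                       search: int = 1) -> Tuple[List[Tuple[int, int]], int]:
    """Exhaustive integer-pixel block search; returns (motion vectors, total SAD)."""
    h, w = len(src), len(src[0])
    vectors: List[Tuple[int, int]] = []
    distortion = 0
    for by in range(0, h - bs + 1, bs):
        for bx in range(0, w - bs + 1, bs):
            best_sad, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates whose reference block leaves the picture.
                    if not (0 <= by + dy and by + dy + bs <= h
                            and 0 <= bx + dx and bx + dx + bs <= w):
                        continue
                    sad = block_sad(src, ref, bx, by, dx, dy, bs)
                    if best_sad is None or sad < best_sad:
                        best_sad, best_mv = sad, (dx, dy)
            vectors.append(best_mv)
            distortion += best_sad  # (0, 0) is always valid, so best_sad is set
    return vectors, distortion


def select_filter(src: Picture, ref: Picture,
                  filters: Dict[str, Callable[[Picture], Picture]],
                  filter_bits: Dict[str, int],
                  lam: float = 0.0) -> Tuple[str, Dict[str, float]]:
    """Filter a copy of ref with each candidate, estimate disparity against src,
    and return the name of the lowest-cost filter along with all costs."""
    costs: Dict[str, float] = {}
    for name, apply_filter in filters.items():
        _, distortion = estimate_disparity(src, apply_filter(ref))
        costs[name] = distortion + lam * filter_bits[name]
    return min(costs, key=costs.get), costs


if __name__ == "__main__":
    ref = [[90 if x % 2 == 0 else 0 for x in range(8)] for _ in range(8)]
    src = box_blur_h(ref)  # toy source that the blur filter predicts exactly
    candidates = {"identity": identity, "blur": box_blur_h}
    bits = {"identity": 0, "blur": 8}  # assumed signaling overhead per filter
    print(select_filter(src, ref, candidates, bits, lam=0.0))
```

In this sketch, raising `lam` shifts the selection toward filters that are cheaper to signal, even when they predict the source less accurately; with `lam` set to zero the selection reduces to a pure distortion comparison.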
Type: Application
Filed: Sep 19, 2011
Publication Date: Aug 1, 2013
Patent Grant number: 9270871
Applicant: DOLBY LABORATORIES LICENSING CORPORATION (San Francisco, CA)
Inventors: Peshala V. Pahalawatta (Glendale, CA), Yuwen He (San Diego, CA), Alexandros Tourapis (Milpitas, CA)
Application Number: 13/877,140
International Classification: H04N 5/21 (20060101)