IMAGE ENCODING/DECODING METHOD AND DEVICE
The present invention relates to an image encoding/decoding method and device. In an image decoding method and device according to an embodiment of the present invention, a reconstructed pixel region within an image to which a current block to be decoded belongs is selected; a motion vector of the reconstructed pixel region is derived on the basis of the reconstructed pixel region and a reference image of the current block; and the derived motion vector is selected as a motion vector of the current block.
The present invention relates to an image signal encoding/decoding method and device. More particularly, the present invention relates to an image encoding/decoding method using inter prediction and an image encoding/decoding device using inter prediction.
BACKGROUND ART
Recently, demand for multimedia data such as video has rapidly increased on the Internet. However, channel bandwidth has not grown fast enough to keep up with the rapidly increasing amount of multimedia data. Considering this situation, the Video Coding Experts Group (VCEG) of ITU-T and the Moving Picture Experts Group (MPEG) of ISO/IEC, which are international standardization organizations, established the High Efficiency Video Coding (HEVC) version 1, a video compression standard, in February 2014.
HEVC uses a variety of technologies such as intra prediction, inter prediction, transform, quantization, entropy encoding, and in-loop filtering. In inter prediction of HEVC, new technologies such as block merging and advanced motion vector prediction (AMVP) have been applied so that efficient inter prediction is possible. However, when multiple motions are present in a block, the block is partitioned into small parts, so that overhead may increase rapidly and encoding efficiency may be lowered.
DISCLOSURE
Technical Problem
Accordingly, the present invention has been made keeping in mind the above problems, and the present invention is intended to enhance efficiency of inter prediction by providing improved inter prediction.
Also, the present invention is intended to provide a motion vector derivation method by an image decoding device, where an image encoding device does not need to transmit motion vector information to the image decoding device.
Also, the present invention is intended to provide a motion vector derivation method of a control point by an image decoding device, wherein in affine inter prediction, an image encoding device does not need to transmit a motion vector of the control point to the image decoding device.
Also, the present invention is intended to provide inter prediction capable of efficient encoding or decoding when multiple motions are present in one block.
Also, the present invention is intended to reduce blocking artifacts that may occur when one block is partitioned into multiple regions and encoding or decoding is performed using different types of inter prediction.
Also, the present invention is intended to enhance efficiency of inter prediction by partitioning a current block to be encoded or decoded, on the basis of a partitioning structure of reconstructed neighboring blocks.
Also, the present invention is intended to enhance efficiency of inter prediction by partitioning, on the basis of a partitioning structure of reconstructed neighboring blocks, a pre-reconstructed neighboring image region which is used to encode or decode a current block.
Also, the present invention is intended to enhance efficiency of image encoding or decoding by performing encoding or decoding using a current block or a neighboring image partitioned as described above.
Technical Solution
In an image decoding method and device according to an embodiment of the present invention, a reconstructed pixel region within an image to which a current block to be decoded belongs is selected; on the basis of the reconstructed pixel region and a reference image of the current block, a motion vector of the reconstructed pixel region is derived; and the derived motion vector is selected as a motion vector of the current block.
The reconstructed pixel region may include at least one among a region adjacent to an upper side of the current block and a region adjacent to a left side of the current block.
The motion vector of the reconstructed pixel region may be derived on the basis of a position of a region corresponding to the reconstructed pixel region, wherein the region corresponding to the reconstructed pixel region is determined within the reference image.
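As a rough illustration of the derivation described above, the following Python sketch searches the reference image for the region that best matches the reconstructed pixel region and returns the displacement as the motion vector. The function names, the sum-of-absolute-differences (SAD) cost, the integer-pixel full search, and the search_range parameter are assumptions made for illustration; they are not specified in this description.

```python
def crop(img, x, y, w, h):
    """Return the w-by-h region of img whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in img[y:y + h]]


def sad(a, b):
    """Sum of absolute differences between two equally sized regions."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def derive_motion_vector(cur, ref, tx, ty, tw, th, search_range=3):
    """Hypothetical decoder-side search (illustrative only).

    cur: current picture, reconstructed at least inside the template region
    ref: reference picture (list of rows)
    (tx, ty, tw, th): reconstructed pixel region next to the current block
    Returns (dx, dy, cost) of the best integer-pixel match.
    """
    template = crop(cur, tx, ty, tw, th)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = tx + dx, ty + dy
            if x < 0 or y < 0 or x + tw > width or y + th > height:
                continue  # candidate falls outside the reference picture
            cost = sad(template, crop(ref, x, y, tw, th))
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

Because the search uses only pixels that are already reconstructed, a decoder can repeat it and arrive at the same vector without receiving motion vector information.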
In an image encoding method and device according to an embodiment of the present invention, a reconstructed pixel region within an image to which a current block to be encoded belongs is selected; on the basis of the reconstructed pixel region and a reference image of the current block, a motion vector of the reconstructed pixel region is derived; and the derived motion vector is selected as a motion vector of the current block.
Also, in the image encoding method and device according to the embodiment of the present invention, decoder-side motion vector derivation indication information may be generated and encoded.
The decoder-side motion vector derivation indication information may indicate whether or not the derived motion vector of the reconstructed pixel region is selected as the motion vector of the current block.
In an image decoding method and device according to another embodiment of the present invention, at least one reconstructed pixel region is selected within an image to which a current block to be decoded using affine inter prediction belongs; on the basis of the at least one reconstructed pixel region and a reference image of the current block, a motion vector of the at least one reconstructed pixel region is derived; and the derived motion vector of the at least one reconstructed pixel region is selected as a motion vector of at least one control point of the current block.
The at least one reconstructed pixel region may be a region adjacent to the at least one control point of the current block.
Further, the at least one control point may be positioned at an upper left side, an upper right side, or a lower left side of the current block.
Further, the motion vector of the control point positioned at a lower right side of the current block may be decoded on the basis of motion information included in a bitstream.
Further, in the image decoding method and device according to the embodiment of the present invention, decoder-side control point motion vector derivation indication information may be decoded.
In the image decoding method and device according to the embodiment of the present invention, the motion vector of the at least one reconstructed pixel region may be derived on the basis of the decoder-side control point motion vector derivation indication information.
In an image encoding method and device according to still another embodiment of the present invention, at least one reconstructed pixel region is selected within an image to which a current block to be encoded using affine inter prediction belongs; on the basis of the at least one reconstructed pixel region and a reference image of the current block, a motion vector of the at least one reconstructed pixel region is derived; and the derived motion vector of the at least one reconstructed pixel region is selected as a motion vector of at least one control point of the current block.
In an image decoding method and device according to yet still another embodiment of the present invention, a current block to be decoded is partitioned into multiple regions including a first region and a second region; and a prediction block of the first region and a prediction block of the second region are obtained, wherein the prediction block of the first region and the prediction block of the second region are obtained by different inter prediction methods.
The first region may be a region adjacent to a reconstructed image region within an image to which the current block belongs, and the second region may be a region that is not in contact with the reconstructed image region within the image to which the current block belongs.
In the image decoding method and device according to the embodiment of the present invention, on the basis of the reconstructed image region within the image to which the current block belongs, and of a reference image of the current block, a motion vector of the first region may be estimated.
In the image decoding method and device according to the embodiment of the present invention, a region positioned at a boundary as a region within the prediction block of the first region or a region positioned at a boundary as a region within the prediction block of the second region may be partitioned into multiple sub-blocks; motion information of a neighboring sub-block of a first sub-block, which is one of the multiple sub-blocks, may be used to generate a prediction block of the first sub-block; and the first sub-block and the prediction block of the first sub-block may be subjected to a weighted sum, so that a prediction block of the first sub-block to which the weighted sum is applied may be obtained.
In an image encoding method and device according to yet still another embodiment of the present invention, a current block to be encoded is partitioned into multiple regions including a first region and a second region; a prediction block of the first region and a prediction block of the second region are obtained, wherein the prediction block of the first region and the prediction block of the second region are obtained by different inter prediction methods.
The first region may be a region adjacent to a pre-encoded reconstructed image region within an image to which the current block belongs, and the second region may be a region that is not in contact with the pre-encoded reconstructed image region within the image to which the current block belongs.
In the image encoding method and device according to the embodiment of the present invention, on the basis of the pre-encoded reconstructed image region within the image to which the current block belongs, and of a reference image of the current block, a motion vector of the first region may be estimated.
In the image encoding method and device according to the embodiment of the present invention, a region positioned at a boundary as a region within the prediction block of the first region or a region positioned at a boundary as a region within the prediction block of the second region may be partitioned into multiple sub-blocks; motion information of a neighboring sub-block of a first sub-block, which is one of the multiple sub-blocks, may be used to generate a prediction block of the first sub-block; and the first sub-block and the prediction block of the first sub-block may be subjected to a weighted sum, so that a prediction block of the first sub-block to which the weighted sum is applied may be obtained.
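The weighted sum applied to a boundary sub-block can be sketched as follows. This is only a minimal illustration: the function name and the 3:1 integer weights are assumptions, since no particular weights are given in this description.

```python
def blend_subblock(own_pred, neighbor_pred, w_own=3, w_neighbor=1):
    """Weighted sum of two predictions of the same sub-block.

    own_pred: sub-block predicted with its own motion information
    neighbor_pred: same sub-block predicted with a neighbor's motion information
    The weights are hypothetical; the result is rounded to the nearest integer.
    """
    total = w_own + w_neighbor
    return [
        [(w_own * a + w_neighbor * b + total // 2) // total
         for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(own_pred, neighbor_pred)
    ]
```

Mixing in the prediction obtained with the neighboring motion information smooths the transition across the boundary between regions predicted by different inter prediction methods.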
In an image decoding method and device according to yet still another embodiment of the present invention, on the basis of blocks around a current block to be decoded, the current block is partitioned into multiple sub-blocks, and the multiple sub-blocks of the current block are decoded.
In the image decoding method and device according to the embodiment of the present invention, on the basis of a partitioning structure of neighboring blocks of the current block, the current block may be partitioned into the multiple sub-blocks.
In the image decoding method and device according to the embodiment of the present invention, on the basis of at least one among the number of the neighboring blocks, a size of the neighboring blocks, a shape of the neighboring blocks, and a boundary between the neighboring blocks, the current block may be partitioned into the multiple sub-blocks.
In the image decoding method and device according to the embodiment of the present invention, a pre-reconstructed pixel region, which is a region neighboring the current block, may be partitioned on a per-sub-block basis, and at least one of the multiple sub-blocks of the current block may be decoded using at least one sub-block included in the reconstructed pixel region.
In the image decoding method and device according to the embodiment of the present invention, on the basis of a partitioning structure of neighboring blocks of the current block, the reconstructed pixel region may be partitioned on a per-sub-block basis.
In the image decoding method and device according to the embodiment of the present invention, on the basis of at least one among the number of the neighboring blocks, a size of the neighboring blocks, a shape of the neighboring blocks, and a boundary between the neighboring blocks, the reconstructed pixel region may be partitioned on a per-sub-block basis.
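One possible reading of partitioning based on boundaries between neighboring blocks is to extend those boundaries into the current block. The sketch below assumes that rule; the function names and the representation of neighbors as lists of widths (upper side) and heights (left side) are hypothetical and chosen only for illustration.

```python
def boundary_positions(sizes, total):
    """Cumulative neighbor boundaries that fall strictly inside (0, total)."""
    edges, pos = [0], 0
    for size in sizes:
        pos += size
        if 0 < pos < total:
            edges.append(pos)
    edges.append(total)
    return edges


def split_by_neighbors(width, height, top_widths, left_heights):
    """Split a width-by-height block along extended neighbor boundaries.

    top_widths: widths of the reconstructed blocks above, left to right
    left_heights: heights of the reconstructed blocks to the left, top to bottom
    Returns sub-blocks as (x, y, w, h) tuples in raster order.
    """
    xs = boundary_positions(top_widths, width)
    ys = boundary_positions(left_heights, height)
    return [(x0, y0, x1 - x0, y1 - y0)
            for y0, y1 in zip(ys, ys[1:])
            for x0, x1 in zip(xs, xs[1:])]
```

Since the neighboring blocks are already reconstructed, both encoder and decoder could apply such a rule without additional partitioning information being signaled.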
In an image encoding method and device according to yet still another embodiment of the present invention, on the basis of blocks around a current block to be encoded, the current block may be partitioned into multiple sub-blocks, and the multiple sub-blocks of the current block may be encoded.
In the image encoding method and device according to the embodiment of the present invention, on the basis of a partitioning structure of neighboring blocks of the current block, the current block may be partitioned into the multiple sub-blocks.
In the image encoding method and device according to the embodiment of the present invention, on the basis of at least one among the number of the neighboring blocks, a size of the neighboring blocks, a shape of the neighboring blocks, and a boundary between the neighboring blocks, the current block may be partitioned into the multiple sub-blocks.
In the image encoding method and device according to the embodiment of the present invention, a pre-reconstructed pixel region, which is a region neighboring the current block, may be partitioned on a per-sub-block basis, and at least one of the multiple sub-blocks of the current block may be encoded using at least one sub-block included in the reconstructed pixel region.
In the image encoding method and device according to the embodiment of the present invention, on the basis of a partitioning structure of neighboring blocks of the current block, the reconstructed pixel region may be partitioned on a per-sub-block basis.
In the image encoding method and device according to the embodiment of the present invention, on the basis of at least one among the number of the neighboring blocks, a size of the neighboring blocks, a shape of the neighboring blocks, and a boundary between the neighboring blocks, the reconstructed pixel region may be partitioned on a per-sub-block basis.
Advantageous Effects
According to the present invention, the amount of encoding information generated as a result of encoding a video may be reduced and thus encoding efficiency may be enhanced. Also, by adaptively decoding an encoded image, reconstruction efficiency of an image may be enhanced and the quality of the reproduced image may be improved.
Also, in inter prediction according to the present invention, the image encoding device does not need to transmit motion vector information to the image decoding device, so that the amount of encoding information may be reduced and thus encoding efficiency may be enhanced.
Also, according to the present invention, blocking artifacts that may occur when one block is partitioned into multiple regions and encoding or decoding is performed using different types of inter prediction may be reduced.
The present invention may be modified in various ways and implemented by various embodiments, so that specific embodiments are shown in the drawings and will be described in detail. However, the present invention is not limited thereto, and the exemplary embodiments can be construed as including all modifications, equivalents, or substitutes within the technical concept and technical scope of the present invention. Similar reference numerals refer to similar elements in the drawings.
Terms “first”, “second”, etc. can be used to describe various elements, but the elements are not to be construed as being limited to the terms. The terms are only used to differentiate one element from other elements. For example, the “first” element may be named the “second” element without departing from the scope of the present invention, and similarly the “second” element may also be named the “first” element. The term “and/or” includes a combination of a plurality of items or any one of a plurality of terms.
It will be understood that when an element is referred to as being “coupled” or “connected” to another element, it can be directly coupled or connected to the other element or intervening elements may be present therebetween. In contrast, it will be understood that when an element is referred to as being “directly coupled” or “directly connected” to another element, there are no intervening elements present.
The terms used in the present specification are merely used to describe particular embodiments, and are not intended to limit the present invention. An expression used in the singular encompasses the expression of the plural, unless it has a clearly different meaning in the context. In the present specification, it will be understood that terms such as “including”, “having”, etc. are intended to indicate the existence of the features, numbers, steps, actions, elements, parts, or combinations thereof disclosed in the specification, and are not intended to preclude the possibility that one or more other features, numbers, steps, actions, elements, parts, or combinations thereof may exist or may be added.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. Hereinafter, the same elements in the drawings are denoted by the same reference numerals, and a repeated description of the same elements will be omitted.
Referring to
The constituents shown in
Also, some constituents may not be indispensable constituents performing essential functions of the present invention but may be selective constituents that only improve performance. The present invention may be implemented by including only the indispensable constituents for implementing the essence of the present invention, excluding the constituents used only to improve performance. A structure including only the indispensable constituents, excluding the selective constituents used only to improve performance, is also included in the scope of the present invention.
The image partitioning module 101 may partition an input image into one or more blocks. Here, the input image may have various shapes and sizes, such as a picture, a slice, a tile, a segment, and the like. A block may mean a coding unit (CU), a prediction unit (PU), or a transform unit (TU). The partitioning may be performed on the basis of at least one among a quadtree and a binary tree. Quadtree partitioning is a method of partitioning a parent block into four child blocks of which the width and the height are half of those of the parent block. Binary tree partitioning is a method of partitioning a parent block into two child blocks of which either the width or the height is half of that of the parent block. Through the above-described partitioning based on binary tree, a block may be in a square shape as well as a non-square shape.
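The two split types described above can be sketched in a few lines. Blocks are represented here as (x, y, width, height) tuples; this representation and the function names are assumptions made for illustration.

```python
def quad_split(x, y, w, h):
    """Quadtree split: four children with half the parent's width and height."""
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]


def binary_split(x, y, w, h, vertical):
    """Binary tree split: two children with half the width (vertical split)
    or half the height (horizontal split) of the parent."""
    if vertical:
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
```

For example, applying binary_split to a 32x32 child of a quadtree split yields two 16x32 or 32x16 blocks, which is how non-square blocks arise.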
Hereinafter, in the embodiment of the present invention, the coding unit may mean a unit of performing encoding or a unit of performing decoding.
The prediction modules 102 and 103 may include the intra prediction module 102 performing intra prediction and the inter prediction module 103 performing inter prediction. Whether to perform inter prediction or intra prediction on the prediction unit may be determined, and detailed information (for example, an intra prediction mode, a motion vector, a reference picture, and the like) depending on each prediction method may be determined. Here, a processing unit on which prediction is performed may be different from a processing unit in which the prediction method and the detailed content are determined. For example, the prediction method, the prediction mode, and the like may be determined on a per-prediction unit basis, and prediction may be performed on a per-transform unit basis.
A residual value (residual block) between the generated prediction block and an original block may be input to the transform module 105. Further, prediction mode information used for prediction, motion vector information, and the like may be encoded with the residual value by the entropy encoding module 107 and then may be transmitted to a decoder. When a particular encoding mode is used, the original block may be encoded as it is and transmitted to the decoder without generating a prediction block through the prediction modules 102 and 103.
The intra prediction module 102 may generate a prediction block on the basis of information on a reference pixel around the current block, which is information on a pixel within a current picture. When the prediction mode of the neighboring block of the current block on which intra prediction is to be performed is inter prediction, a reference pixel included in the neighboring block to which inter prediction has been applied is replaced by a reference pixel within another neighboring block to which intra prediction has been applied. That is, when the reference pixel is unavailable, at least one reference pixel among available reference pixels is used instead of information on the unavailable reference pixel.
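The substitution of unavailable reference pixels can be sketched as below. The nearest-previous-available rule, the use of None to mark unavailable pixels, and the mid-gray default for the case where no pixel is available are assumptions; the description above only states that an available reference pixel is used instead.

```python
def substitute_reference_pixels(ref, default=128):
    """Fill unavailable reference pixels (None) from available ones.

    ref: 1-D list of reference pixel values, None where unavailable
    default: hypothetical fallback when no reference pixel is available
    """
    if all(p is None for p in ref):
        return [default] * len(ref)
    out = list(ref)
    first = next(i for i, p in enumerate(out) if p is not None)
    for i in range(first):
        out[i] = out[first]  # leading gap copies the first available pixel
    for i in range(first + 1, len(out)):
        if out[i] is None:
            out[i] = out[i - 1]  # later gaps copy the previous pixel
    return out
```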
Prediction modes in intra prediction may include a directional prediction mode using the information on the reference pixel depending on a prediction direction and a non-directional prediction mode not using directivity information in performing prediction. A mode for predicting luma information may be different from a mode for predicting chroma information, and in order to predict the chroma information, intra prediction mode information used to predict the luma information or predicted luma signal information may be utilized.
The intra prediction module 102 may include an adaptive intra smoothing (AIS) filter, a reference pixel interpolation module, and a DC filter. The AIS filter is a filter performing filtering on a reference pixel of the current block, and may adaptively determine whether to apply the filter depending on a prediction mode of a current prediction unit. When the prediction mode of the current block is a mode in which AIS filtering is not performed, the AIS filter is not applied.
When the intra prediction mode of the prediction unit is a prediction mode in which intra prediction is performed on the basis of a pixel value obtained by interpolating the reference pixel, the reference pixel interpolation module of the intra prediction module 102 interpolates the reference pixel to generate a reference pixel at a position on a per-fraction basis. When the prediction mode of the current prediction unit is a prediction mode in which the prediction block is generated without interpolating the reference pixel, the reference pixel is not interpolated. The DC filter generates the prediction block through filtering when the prediction mode of the current block is a DC mode.
The inter prediction module 103 generates the prediction block using a pre-reconstructed reference image stored in the memory 112 and motion information. The motion information may contain, for example, a motion vector, a reference picture index, a list 1 prediction flag, a list 0 prediction flag, and the like.
In the image encoding device, there are two typical methods of generating motion information.
The first method is a method in which motion information (a motion vector, a reference image index, an inter prediction direction, and the like) is generated using a motion estimation process.
The second method of generating the motion information is a method in which motion information of neighboring blocks of the current image block to be encoded is used.
One piece of motion information of the spatial candidate blocks A to E and the temporal candidate block COL, which neighbor the current block, may be selected as the motion information of the current block. Here, an index may be defined that indicates which block has the motion information used as the motion information of the current block. This index information also belongs to the motion information. In the image encoding device, using the above methods, the motion information may be generated and the prediction block may be generated through motion compensation.
A residual block may be generated that includes residual value information which is a difference value between the prediction unit generated by the prediction module 102, 103 and the original block of the prediction unit. The generated residual block may be input to the transform module 105 for transform.
The inter prediction module 103 may derive the prediction block on the basis of information on at least one picture among the previous picture and the subsequent picture of the current picture. Further, the prediction block of the current block may be derived on the basis of information on a partial region with encoding completed within the current picture. The inter prediction module 103 according to an embodiment of the present invention may include a reference picture interpolation module, a motion prediction module, and a motion compensation module.
The reference picture interpolation module may receive reference picture information from the memory and may generate information on a pixel equal to or smaller than an integer pixel in the reference picture. In the case of a luma pixel, a DCT-based 8-tap interpolation filter having different filter coefficients may be used to generate information on a pixel equal to or smaller than an integer pixel on a per-¼ pixel basis. In the case of a chroma signal, a DCT-based 4-tap interpolation filter having different filter coefficients may be used to generate information on a pixel equal to or smaller than an integer pixel on a per-⅛ pixel basis.
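As a hedged illustration of 8-tap interpolation, the sketch below computes a half-pixel sample along one row. The tap values are the half-sample luma coefficients commonly cited for HEVC; they are not listed in this description, and the edge clamping, rounding, and function name are likewise assumptions.

```python
# Assumed half-sample luma taps (sum to 64); not taken from this description.
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def half_pel(row, i, bit_depth=8):
    """Interpolate the sample halfway between row[i] and row[i + 1].

    Positions outside the row are clamped to the nearest edge sample.
    """
    n = len(row)
    acc = 0
    for k, coeff in enumerate(HALF_PEL_TAPS):
        idx = min(max(i - 3 + k, 0), n - 1)
        acc += coeff * row[idx]
    value = (acc + 32) >> 6  # divide by 64 with rounding
    return min(max(value, 0), (1 << bit_depth) - 1)
```

A two-dimensional fractional position would typically be obtained by applying such a filter horizontally and then vertically; quarter-pixel positions use different coefficient sets.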
The motion prediction module may perform motion prediction on the basis of the reference picture interpolated by the reference picture interpolation module. As methods of calculating the motion vector, various methods, such as a full search-based block matching algorithm (FBMA), a three step search (TSS) algorithm, a new three-step search (NTS) algorithm, and the like, may be used. The motion vector may have a motion vector value on a per-½ or ¼ pixel basis on the basis of the interpolated pixel. The motion prediction module may predict the prediction block of the current block by using different motion prediction methods. As motion prediction methods, various methods, such as a skip method, a merge method, an advanced motion vector prediction (AMVP) method, and the like, may be used.
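Of the search methods named above, the three step search can be sketched compactly. The version below follows the classic pattern (nine points, step halved each round) and takes the matching cost as a caller-supplied function; the function name, the start step of 4, and the tie-breaking order are assumptions. Unlike a full search, it can settle on a local minimum when the cost surface is not smooth.

```python
def three_step_search(cost, start_step=4):
    """Coarse-to-fine motion search.

    cost: function (dx, dy) -> matching cost, e.g. a SAD against the reference
    Returns the (dx, dy) reached after the last refinement step.
    """
    cx, cy = 0, 0
    best_cost = cost(cx, cy)
    step = start_step
    while step >= 1:
        best_x, best_y = cx, cy
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue  # center was already evaluated
                c = cost(cx + ox, cy + oy)
                if c < best_cost:
                    best_cost, best_x, best_y = c, cx + ox, cy + oy
        cx, cy = best_x, best_y
        step //= 2
    return cx, cy
```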
The subtractor 104 performs subtraction on the block to be currently encoded and on the prediction block generated by the intra prediction module 102 or the inter prediction module 103 so as to generate the residual block of the current block.
The transform module 105 may transform the residual block containing residual data, using a transform method, such as DCT, DST, Karhunen-Loeve transform (KLT), and the like. Here, the transform method may be determined on the basis of the intra prediction mode of the prediction unit that is used to generate the residual block. For example, depending on the intra prediction mode, DCT may be used in the horizontal direction, and DST may be used in the vertical direction.
The quantization module 106 may quantize values transformed into a frequency domain by the transform module 105. Quantization coefficients may vary according to a block or importance of an image. The value calculated by the quantization module 106 may be provided to the dequantization module 108 and the entropy encoding module 107.
The transform module 105 and/or the quantization module 106 may be selectively included in the image encoding device 100. That is, the image encoding device 100 may perform at least one of transform and quantization on residual data of the residual block, or may encode the residual block by skipping both transform and quantization. Even though the image encoding device 100 does not perform either transform or quantization or does not perform both transform and quantization, the block that is input to the entropy encoding module 107 is generally referred to as a transform block. The entropy encoding module 107 entropy encodes the input data. Entropy encoding may use various encoding methods, for example, exponential Golomb coding, context-adaptive variable length coding (CAVLC), and context-adaptive binary arithmetic coding (CABAC).
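Of the entropy coding methods listed above, exponential Golomb coding is simple enough to sketch directly. The following shows the order-0 unsigned code operating on bit strings; the function names and the string representation of bits are choices made for illustration.

```python
def exp_golomb_encode(value):
    """Order-0 unsigned exponential Golomb codeword as a string of bits."""
    if value < 0:
        raise ValueError("value must be non-negative")
    binary = bin(value + 1)[2:]
    return "0" * (len(binary) - 1) + binary


def exp_golomb_decode(bits, pos=0):
    """Decode one codeword starting at pos; return (value, next_pos)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end
```

Small values receive short codewords (0 maps to "1", 1 to "010"), which suits syntax elements whose small values are the most frequent.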
The entropy encoding module 107 may encode a variety of information, such as residual value coefficient information of a coding unit, block type information, prediction mode information, partitioning unit information, prediction unit information, transmission unit information, motion vector information, reference frame information, block interpolation information, filtering information, and the like, from the prediction module 102, 103. In the entropy encoding module 107, the coefficients of the transform block may be encoded on a per-partial block basis within the transform block, on the basis of various types of flags indicating a non-zero coefficient, a coefficient of which the absolute value is greater than one or two, the sign of the coefficient, and the like. A coefficient that is not encoded only with the flags may be encoded through the absolute value of the difference between the coefficient encoded through the flag and the actual coefficient of the transform block.
The dequantization module 108 dequantizes the values quantized by the quantization module 106, and the inverse transform module 109 inversely transforms the values transformed by the transform module 105. The residual value generated by the dequantization module 108 and the inverse transform module 109 may be combined with the prediction unit predicted through a motion estimation module included in the prediction module 102, 103, the motion compensation module, and the intra prediction module 102 such that a reconstructed block is generated.
The adder 110 adds the prediction block generated by the prediction module 102, 103 and the residual block generated by the inverse transform module 109 so as to generate a reconstructed block.
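The path from residual to reconstructed block can be sketched on a single row of samples. This minimal example assumes the transform is skipped and uses a uniform quantizer with a hypothetical step size; the function names are illustrative and no quantizer design is specified in this description.

```python
def residual(orig, pred):
    """Subtractor: difference between original and predicted samples."""
    return [o - p for o, p in zip(orig, pred)]


def quantize(values, step):
    """Uniform quantization with rounding to the nearest level."""
    return [(1 if v >= 0 else -1) * ((abs(v) + step // 2) // step)
            for v in values]


def dequantize(levels, step):
    """Dequantization: scale levels back to residual values."""
    return [level * step for level in levels]


def reconstruct(pred, res, bit_depth=8):
    """Adder: prediction plus reconstructed residual, clipped to sample range."""
    max_val = (1 << bit_depth) - 1
    return [min(max(p + r, 0), max_val) for p, r in zip(pred, res)]
```

Running the encoder-side values through dequantize and reconstruct gives the same samples a decoder would obtain, which is why the encoder keeps its own dequantization and inverse transform path for later prediction.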
The filter module 111 may include at least one of a deblocking filter, an offset correction module, and an adaptive loop filter (ALF).
The deblocking filter may remove block distortion that occurs due to boundaries between the blocks in the reconstructed picture. In order to determine whether to perform deblocking, whether to apply the deblocking filter to the current block may be determined on the basis of the pixels included in several rows and columns in the block. When the deblocking filter is applied to the block, a strong filter or a weak filter is applied depending on required deblocking filtering strength. Further, in applying the deblocking filter, when performing horizontal direction filtering and vertical direction filtering, horizontal direction filtering and vertical direction filtering are configured to be processed in parallel.
The offset correction module may correct an offset from the original image on a per-pixel basis with respect to the image subjected to deblocking. In order to perform offset correction on a particular picture, it is possible to use a method of separating pixels of the image into the predetermined number of regions, determining a region to be subjected to offset, and applying the offset to the determined region, or a method of applying an offset considering edge information of each pixel.
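The first of the two offset methods described above may be illustrated by the following minimal sketch. It assumes that the "regions" are intensity bands of equal width and that the offsets are given per band; the function name `band_offset`, the default of 32 bands, and the 8-bit depth are illustrative assumptions and are not taken from the present description.

```python
def band_offset(pixels, offsets, num_bands=32, bit_depth=8):
    """Apply a per-band offset to each pixel and clip to the valid range.

    pixels  : list of integer sample values (deblocked reconstruction)
    offsets : dict mapping band index -> offset; bands not listed get 0
    The band layout (equal-width intensity bands) is an assumption.
    """
    max_val = (1 << bit_depth) - 1
    band_width = (max_val + 1) // num_bands
    corrected = []
    for p in pixels:
        band = p // band_width                 # region the pixel falls into
        value = p + offsets.get(band, 0)       # offset only for selected bands
        corrected.append(min(max(value, 0), max_val))
    return corrected
```

Only the bands listed in `offsets` are modified, which corresponds to determining the region to be subjected to offset before applying it.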
Adaptive loop filtering (ALF) may be performed on the basis of the value obtained by comparing the filtered reconstructed image and the original image. The pixels included in the image may be divided into predetermined groups, a filter to be applied to each of the groups may be determined, and filtering may be individually performed on each group. Information on whether to apply ALF of a luma signal may be transmitted for each coding unit (CU), and the shape and the filter coefficient of the ALF filter to be applied may vary depending on each block. Also, regardless of the characteristic of the application target block, the ALF filter in the same form (fixed form) may be applied.
The memory 112 may store the reconstructed block or picture calculated through the filter module 111, and the stored reconstructed block or picture may be provided to the prediction module 102, 103 when performing inter prediction.
Next, an image decoding device according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Referring to
When an image bitstream generated by the image encoding device 100 is input to the image decoding device 400, the input bitstream is decoded according to a reverse process of the process performed in the image encoding device 100.
The entropy decoding module 401 may perform entropy decoding according to the reverse procedure of the entropy encoding performed by the entropy encoding module 107 of the image encoding device 100. For example, corresponding to the methods performed by the image encoder, various methods, such as exponential Golomb coding, context-adaptive variable length coding (CAVLC), and context-adaptive binary arithmetic coding (CABAC), may be applied. In the entropy decoding module 401, the coefficients of the transform block may be decoded on a per-partial-block basis within the transform block on the basis of various types of flags indicating a non-zero coefficient, a coefficient whose absolute value is greater than one or two, the sign of the coefficient, and the like. A coefficient that is not represented by the flags alone may be decoded through the sum of the coefficient represented through the flags and a signaled coefficient.
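The flag-based representation of a coefficient may be sketched as follows. This is a simplified illustration under stated assumptions, not the actual syntax of any codec: the symbol names `sig`, `gt1`, `gt2`, `remaining`, and `sign` are hypothetical, and context modeling and binarization are omitted. The decoder reconstructs the absolute value as the sum of the part represented by the flags and the separately signaled remainder.

```python
def encode_level(coeff):
    """Split one quantized coefficient into flags plus a remainder."""
    a = abs(coeff)
    symbols = {"sig": int(a > 0)}              # non-zero flag
    if a > 0:
        symbols["gt1"] = int(a > 1)            # absolute value greater than one
        if a > 1:
            symbols["gt2"] = int(a > 2)        # absolute value greater than two
            if a > 2:
                symbols["remaining"] = a - 3   # part not covered by the flags
        symbols["sign"] = int(coeff < 0)
    return symbols


def decode_level(symbols):
    """Rebuild the coefficient: flag-represented part + signaled remainder."""
    base = symbols["sig"] + symbols.get("gt1", 0) + symbols.get("gt2", 0)
    a = base + symbols.get("remaining", 0)
    return -a if symbols.get("sign", 0) else a
```

For example, a coefficient of 5 is represented by three flags set to one and a remainder of 2.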
The entropy decoding module 401 may decode information related to intra prediction and inter prediction performed in the encoder. The dequantization module 402 performs dequantization on the quantized transform block to generate the transform block. This operates substantially in the same manner as the dequantization module 108 in
The inverse transform module 403 performs inverse transform on the transform block to generate the residual block. Here, the transform method may be determined on the basis of the prediction method (inter or intra prediction), the size and/or the shape of the block, information on the intra prediction mode, and the like. This operates substantially in the same manner as the inverse transform module 109 in
The adder 404 adds the prediction block generated by the intra prediction module 407 or the inter prediction module 408 and the residual block generated by the inverse transform module 403 so as to generate a reconstructed block. This operates substantially in the same manner as the adder 110 in
The filter module 405 reduces various types of noise occurring in the reconstructed blocks.
The filter module 405 may include a deblocking filter, an offset correction module, and an ALF.
From the image encoding device 100, it is possible to receive information on whether or not the deblocking filter is applied to the block or picture and information on whether the strong filter is applied or the weak filter is applied when the deblocking filter is applied. The deblocking filter of the image decoding device 400 may receive information related to the deblocking filter from the image encoding device 100, and the image decoding device 400 may perform deblocking filtering on the block.
The offset correction module may perform offset correction on the reconstructed image on the basis of the type of offset correction, offset value information, and the like applied to the image in performing encoding.
The ALF may be applied to the coding unit on the basis of information on whether to apply the ALF, ALF coefficient information, and the like received from the image encoding device 100. The ALF information may be provided by being included in a particular parameter set. The filter module 405 operates substantially in the same manner as the filter module 111 in
The memory 406 stores the reconstructed block generated by the adder 404. This operates substantially in the same manner as the memory 112 in
The prediction module 407, 408 may generate a prediction block on the basis of information related to prediction block generation provided from the entropy decoding module 401 and of information on the previously decoded block or picture provided from the memory 406.
The prediction module 407, 408 may include the intra prediction module 407 and the inter prediction module 408. Although not shown, the prediction module 407, 408 may further include a prediction unit determination module. The prediction unit determination module may receive a variety of information input from the entropy decoding module 401, such as prediction unit information, prediction mode information of an intra prediction method, information related to motion prediction of an inter prediction method, and the like, may separate a prediction unit in a current coding unit, and may determine whether inter prediction or intra prediction is performed on the prediction unit. By using information required in inter prediction of the current prediction unit provided from the image encoding device 100, the inter prediction module 408 may perform inter prediction on the current prediction unit on the basis of information included in at least one picture among the previous picture and the subsequent picture of the current picture including the current prediction unit. Alternatively, inter prediction may be performed on the basis of information on some pre-reconstructed regions within the current picture including the current prediction unit.
In order to perform inter prediction, on the basis of the coding unit, it may be determined which mode among a skip mode, a merge mode, and an AMVP mode is used for the motion prediction method of the prediction unit included in the coding unit.
The intra prediction module 407 generates the prediction block using the pre-reconstructed pixels positioned near the block to be currently decoded.
The intra prediction module 407 may include an adaptive intra smoothing (AIS) filter, a reference pixel interpolation module, and a DC filter. The AIS filter is a filter performing filtering on the reference pixel of the current block, and may adaptively determine whether to apply the filter depending on the prediction mode of the current prediction unit. The prediction mode of the prediction unit provided from the image encoding device 100 and the AIS filter information may be used to perform AIS filtering on the reference pixel of the current block. When the prediction mode of the current block is a mode in which AIS filtering is not performed, the AIS filter is not applied.
When the prediction mode of the prediction unit is a prediction mode in which intra prediction is performed on the basis of a pixel value obtained by interpolating the reference pixel, the reference pixel interpolation module of the intra prediction module 407 interpolates the reference pixel to generate a reference pixel at a position on a per-fraction basis. The generated reference pixel on a per-fraction basis may be used as a prediction pixel of a pixel within the current block. When the prediction mode of the current prediction unit is a prediction mode in which a prediction block is generated without interpolating the reference pixel, the reference pixel is not interpolated. The DC filter may generate a prediction block through filtering when the prediction mode of the current block is a DC mode.
The intra prediction module 407 operates substantially in the same manner as the intra prediction module 102 in
The inter prediction module 408 generates an inter prediction block using a reference picture stored in the memory 406 and motion information. The inter prediction module 408 operates substantially in the same manner as the inter prediction module 103 in
Hereinafter, various embodiments of the present invention will be described in detail with reference to the drawings.
First Exemplary Embodiment
Before encoding or decoding of the current block, the reconstructed pixel region C 52 neighbors the current block 51, and thus the image encoding device 100 and the image decoding device 400 may use the same reconstructed pixel region C 52. Therefore, without encoding the motion information of the current block 51 by the image encoding device 100, the reconstructed pixel region C 52 is used such that the image encoding device 100 and the image decoding device 400 may generate the motion information of the current block 51 and the prediction block in the same manner.
Inter prediction according to the embodiment may be performed by the inter prediction module 103 of the image encoding device 100 or by the inter prediction module 408 of the image decoding device 400. Reference images used in inter prediction are stored in the memory 112 of the image encoding device 100 or in the memory 406 of the image decoding device 400. The inter prediction module 103 or the inter prediction module 408 may generate the prediction block of the current block 51 with reference to the reference image stored in the memory 112 or in the memory 406.
Referring to
The image encoding device 100 or the image decoding device 400 selects the motion vector 57 of the reconstructed pixel region C 52, determined as described above, as the motion vector of the current block 51 at step S65. Using this motion vector 57, the prediction block 58 of the current block 51 may be generated.
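The derivation of a motion vector from the reconstructed pixel region may be sketched as a template search. The following is a minimal illustration under several assumptions that are not stated in the present description: the region is an L-shaped band of thickness `t` above and to the left of the block, the matching cost is the sum of absolute differences (SAD), the search is an exhaustive integer-pixel search within `±search`, and images are lists of rows. All function and parameter names are hypothetical.

```python
def template_positions(bx, by, w, h, t):
    """Pixel coordinates of an L-shaped reconstructed region around a block."""
    top = [(x, y) for y in range(by - t, by) for x in range(bx - t, bx + w)]
    left = [(x, y) for y in range(by, by + h) for x in range(bx - t, bx)]
    return top + left


def derive_mv(cur, ref, bx, by, w, h, t=2, search=4):
    """Find the displacement whose region in `ref` best matches the template.

    Returns ((dx, dy), cost). Runs identically in encoder and decoder because
    it reads only already reconstructed pixels of `cur` and the reference.
    """
    height, width = len(ref), len(ref[0])
    tmpl = template_positions(bx, by, w, h, t)
    best_mv, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose displaced template leaves the reference.
            if any(not (0 <= x + dx < width and 0 <= y + dy < height)
                   for x, y in tmpl):
                continue
            cost = sum(abs(cur[y][x] - ref[y + dy][x + dx]) for x, y in tmpl)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def predict_block(ref, bx, by, w, h, mv):
    """Motion-compensated prediction of the current block with the derived MV."""
    dx, dy = mv
    return [ref[by + dy + j][bx + dx:bx + dx + w] for j in range(h)]
```

Because the search touches only the neighboring reconstructed pixels and never the current block itself, both devices arrive at the same vector without any motion information being transmitted. The `search` parameter plays the role of a search range that bounds decoder-side complexity.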
In the meantime, the reconstructed pixel region C 52 may be in various shapes and/or sizes.
Also, it is possible that the reconstructed pixel regions at the upper side and the left side of the current block are used as the reconstructed pixel region C 52 or that the two regions are combined into a single piece to be used as the reconstructed pixel region C 52. Also, it is possible that the reconstructed pixel region C 52 is used by being subjected to subsampling. In this method, only the decoded information around the current block is used to derive the motion information, and thus it is not necessary to transmit the motion information from the encoding device 100 to the decoding device 400.
According to the embodiment of the present invention, the decoding device 400 also performs motion estimation, so that if motion estimation is performed on the entire reference image, the complexity may increase greatly. Therefore, by transmitting the search range on a per-block basis or in the parent header, or by fixing the search range to be the same in the encoding device 100 and in the decoding device 400, the computational complexity of the decoding device 400 may be reduced.
Referring to
Afterward, cost_A is compared with cost_B to determine which method is optimum to use, at step S83. When cost_A is lower, it is set to perform inter prediction using the conventional method at step S84. Otherwise, it is set to perform inter prediction using the reconstructed pixel region at step S85.
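The comparison at step S83 may be sketched as follows. The cost model shown here, distortion plus a multiplier times the number of bits, is an assumed rate-distortion form given for illustration; the present description only requires that cost_A (conventional method) and cost_B (reconstructed pixel region) be comparable. The names `rd_cost` and `choose_inter_mode` are hypothetical.

```python
def rd_cost(distortion, bits, lam):
    """Assumed rate-distortion cost: distortion + lambda * bits."""
    return distortion + lam * bits


def choose_inter_mode(cost_a, cost_b):
    """Return the method selected by comparing the two costs.

    cost_a: cost of the conventional inter prediction
    cost_b: cost of inter prediction using the reconstructed pixel region
    The conventional method is chosen only when cost_a is strictly lower.
    """
    return "conventional" if cost_a < cost_b else "reconstructed_region"
```

Under this assumed model, the method using the reconstructed pixel region spends no bits on motion information, so it may win even when its distortion is somewhat higher.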
Referring to
Alternatively, information indicating whether or not inter prediction using the reconstructed pixel region according to the embodiment of the present invention is used may be generated in the parent header first and then may be encoded. That is, when the information indicating whether or not inter prediction using the reconstructed pixel region is used indicates true, the DMVD indication information is encoded. When the information indicating whether or not inter prediction using the reconstructed pixel region is used indicates false, the DMVD indication information is not present within the bitstream and in this case, the current block is predicted using the conventional inter prediction.
In the meantime, regarding the parent header, the parent header including the information that indicates whether or not inter prediction using the reconstructed pixel region is used may be transmitted by being included in a block header, a slice header, a tile header, a picture header, or a sequence header.
The decoding device 400 decodes the DMVD indication information at step S101, decodes the motion information at step S102, and ends the algorithm.
In the case where the information indicating whether or not inter prediction using the reconstructed pixel region is used is present in the parent header of the bitstream, when the information indicating whether or not inter prediction using the reconstructed pixel region is used indicates true, the DMVD indication information is present in the bitstream. When the information indicating whether or not inter prediction using the reconstructed pixel region is used indicates false, the DMVD indication information is not present within the bitstream and in this case, the current block is predicted using the conventional inter prediction.
Regarding the parent header, the parent header including the information that indicates whether or not inter prediction using the reconstructed pixel region is used may be transmitted by being included in a block header, a slice header, a tile header, a picture header, or a sequence header.
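The conditional presence of the DMVD indication information may be sketched as follows. This is a minimal illustration assuming that the indication is a 1-bit flag in which the value one means that the reconstructed pixel region is used; the function name `parse_dmvd_flag` and the list-of-bits input are hypothetical simplifications of a real bitstream reader.

```python
def parse_dmvd_flag(bits, header_enabled):
    """Return (use_dmvd, bits_consumed) for the current block.

    bits           : list of 0/1 values starting at the current read position
    header_enabled : value of the enabling information in the parent header
    When the parent header disables the tool, no flag is read and the block
    falls back to the conventional inter prediction.
    """
    if not header_enabled:
        return False, 0
    return bits[0] == 1, 1
```

When the parent header indicates false, no bit is consumed per block, which is the overhead saving that motivates signaling at the higher level.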
Second Exemplary Embodiment
Hereinafter, the second exemplary embodiment of the present invention will be described with reference to the drawings.
In the second exemplary embodiment, the inter prediction using the reconstructed pixel region according to the first exemplary embodiment described above is applied to inter prediction using affine transformation. Specifically, in order to derive a motion vector of a control point used for inter prediction using affine transformation, a motion vector derivation method using the reconstructed pixel region is applied. Hereinafter, for convenience of description, according to the second exemplary embodiment of the present invention, inter prediction using affine transformation is simply referred to as affine inter prediction.
In affine inter prediction, motion vectors at four corners of the current block to be encoded or decoded are obtained, and then the motion vectors are used to generate a prediction block. Here, the four corners of the current block may correspond to the control points.
Referring to
This affine inter prediction enables prediction of a block or image region subjected to rotation, zoom-in/zoom-out, translation, reflection, or shear deformation.
Equation 1 below is a general expression of affine transformation.
Equation 1 is an equation representing transform of two-dimensional coordinates, wherein (x, y) denotes original coordinates, (x′, y′) denotes destination coordinates, and a, b, c, d, e, and f denote transform parameters.
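For reference, a six-parameter two-dimensional affine map is commonly written in the following matrix form. The placement of the parameters a to f within the expression is an assumption made for illustration and may differ from the arrangement used in Equation 1.

```latex
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} e \\ f \end{pmatrix}
```

Under this assumed arrangement, a, b, c, and d account for rotation, scaling, reflection, and shear, while e and f account for translation.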
In order to apply this affine transformation to a video codec, transform parameters need to be transmitted to the image decoding device, which results in an enormous increase in overhead. For this reason, in the conventional video codec, affine transformation is simply applied using N reconstructed neighboring control points.
Equation 2 below represents a method of deriving a motion vector of an arbitrary sub-block within the current block by using two control points at the top left and the top right of the current block.
In Equation 2, (x, y) denotes the position of the arbitrary sub-block within the current block, W denotes the horizontal length of the current block, (MVx, MVy) denotes the motion vector of the sub-block, (MV0x, MV0y) denotes the motion vector of the top left control point, and (MV1x, MV1y) denotes the motion vector of the top right control point.
Next, Equation 3 below represents a method of deriving a motion vector of an arbitrary sub-block within the current block by using three control points at the top left, the top right, and the bottom left of the current block.
In Equation 3, (x, y) denotes the position of the arbitrary sub-block, W and H denote the horizontal length and the vertical length of the current block, respectively, (MVx, MVy) denotes the motion vector of the sub-block within the current block, (MV0x, MV0y) denotes the motion vector of the top left control point, (MV1x, MV1y) denotes the motion vector of the top right control point, and (MV2x, MV2y) denotes the motion vector of the bottom left control point.
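The derivation of a sub-block motion vector from two or three control points may be sketched as follows. The formulas used are the widely known four-parameter and six-parameter affine motion models, expressed with the symbols defined above; they are given as an assumption, since the exact forms of Equations 2 and 3 may differ. The function names are hypothetical, and motion vectors are (x, y) tuples.

```python
def affine_mv_2cp(x, y, w, mv0, mv1):
    """Four-parameter model: top left (mv0) and top right (mv1) control points."""
    a = (mv1[0] - mv0[0]) / w
    b = (mv1[1] - mv0[1]) / w
    mvx = a * x - b * y + mv0[0]
    mvy = b * x + a * y + mv0[1]
    return (mvx, mvy)


def affine_mv_3cp(x, y, w, h, mv0, mv1, mv2):
    """Six-parameter model: adds the bottom left (mv2) control point."""
    mvx = (mv1[0] - mv0[0]) / w * x + (mv2[0] - mv0[0]) / h * y + mv0[0]
    mvy = (mv1[1] - mv0[1]) / w * x + (mv2[1] - mv0[1]) / h * y + mv0[1]
    return (mvx, mvy)
```

In both sketches, substituting the position of a control point returns the motion vector of that control point, for example (0, 0) yields mv0 and (W, 0) yields mv1.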
In the second exemplary embodiment of the present invention, in order to derive the motion vector of the control point used for affine inter prediction, the motion vector derivation method using the reconstructed pixel region according to the first exemplary embodiment is applied. Therefore, the image encoding device 100 does not need to transmit motion vector information of multiple control points to the image decoding device 400.
Referring to
In the embodiment, motion vectors of three control points at the top left, the top right, and the bottom left are derived using the reconstructed pixel regions a 12a-3, b 12a-4, and c 12a-5 as shown in
Referring to
By using the motion vectors of the four control points derived as described above, a motion vector of an arbitrary sub-block within the current block may be derived as shown in Equation 4 below.
In Equation 4, (x, y) denotes the position of the arbitrary sub-block within the current block, W and H denote the horizontal length and the vertical length of the current block, respectively, (MVx, MVy) denotes the motion vector of the sub-block within the current block, (MV0x, MV0y) denotes the motion vector of the top left control point, (MV1x, MV1y) denotes the motion vector of the top right control point, (MV2x, MV2y) denotes the motion vector of the bottom left control point, and (MV3x, MV3y) denotes the motion vector of the bottom right control point.
In the meantime, the reconstructed pixel regions a 12a-3, b 12a-4, and c 12a-5 may be in various sizes and/or shapes as described above with reference to
As described above, when motion vectors are derived from four control points, these vectors are used to derive the motion vector of the current block 12a-2 or the motion vector of an arbitrary sub-block within the current block 12a-2, and this derived motion vector may be used to derive the prediction block of the current block 12a-2 or the prediction block of an arbitrary sub-block within the current block 12a-2. Specifically, referring to Equation 4 above, the position of the current block 12a-2 is coordinates (0, 0), so that the motion vector of the current block 12a-2 is the motion vector (MV0x, MV0y) of the top left control point. Therefore, the prediction block of the current block 12a-2 may be obtained using the motion vector of the top left control point. When the current block is an 8×8 block and is partitioned into four 4×4 sub-blocks, the motion vector of the sub-block at the position (3, 0) within the current block is obtained by substituting a value of three for the variable x in Equation 4 above, a value of zero for the variable y, and a value of eight for both variables W and H.
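The substitution described above may be illustrated with the following sketch. It assumes that the four control-point motion vectors are combined by bilinear interpolation, which is consistent with the statement that position (0, 0) yields the motion vector of the top left control point, but the exact form of Equation 4 may differ. The function name `affine_mv_4cp` and the sample vectors are hypothetical.

```python
def affine_mv_4cp(x, y, w, h, mv0, mv1, mv2, mv3):
    """Bilinear blend of four control-point MVs (assumed form).

    mv0: top left, mv1: top right, mv2: bottom left, mv3: bottom right.
    """
    u, v = x / w, y / h                        # normalized position in block

    def blend(c0, c1, c2, c3):
        return ((1 - u) * (1 - v) * c0 + u * (1 - v) * c1
                + (1 - u) * v * c2 + u * v * c3)

    return (blend(mv0[0], mv1[0], mv2[0], mv3[0]),
            blend(mv0[1], mv1[1], mv2[1], mv3[1]))


# Sub-block at (3, 0) in an 8x8 block: x = 3, y = 0, W = H = 8.
example = affine_mv_4cp(3, 0, 8, 8, (0, 0), (8, 16), (4, 4), (2, 2))
```

With y equal to zero, only the two upper control points contribute in this assumed form, so the result lies three eighths of the way from the top left vector to the top right vector.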
Next, with reference to
Inter prediction according to the embodiment may be performed by the inter prediction module 103 of the image encoding device 100 or the inter prediction module 408 of the image decoding device 400. Reference images used in inter prediction are stored in the memory 112 of the image encoding device 100 or in the memory 406 of the image decoding device 400. The inter prediction module 103 or the inter prediction module 408 may generate the prediction block of the current block 51 with reference to a reference image stored in the memory 112 or the memory 406.
Referring to
Next, on the basis of the at least one reconstructed pixel region selected at step S131 and the reference image of the current block, a motion vector of at least one reconstructed pixel region is derived at step S133. The image encoding device 100 or the image decoding device 400 selects each motion vector of the reconstructed pixel region C 52, determined as described above, as a motion vector of at least one control point of the current block at step S135. At least one motion vector selected as described above may be used to generate the prediction block of the current block.
Referring to
Afterward, cost_A is compared with cost_B to determine which method is optimum to use, at step S143. When cost_A is lower, it is set to perform inter prediction using the conventional method at step S144. Otherwise, it is set to perform affine inter prediction at step S145 according to the second exemplary embodiment of the present invention.
Referring to
In the meantime, according to the second exemplary embodiment of the present invention, the information indicating whether or not affine inter prediction is used may be generated in the parent header first and then may be encoded. That is, according to the second exemplary embodiment of the present invention, when the information indicating whether or not affine inter prediction is used indicates true, the DCMVD indication information is encoded. According to the second exemplary embodiment of the present invention, when the information indicating whether or not affine inter prediction is used indicates false, the DCMVD indication information is not present within the bitstream, and in this case, the current block is predicted using the conventional inter prediction.
In the meantime, regarding the parent header, the parent header including the information indicating whether or not affine inter prediction according to the present invention is used may be transmitted by being included in a block header, a slice header, a tile header, a picture header, or a sequence header.
The decoding device 400 decodes the DCMVD indication information at step S161, decodes the motion information at step S162, and ends the algorithm.
In the case where the information indicating whether or not affine inter prediction according to the second exemplary embodiment of the present invention is used is present in the parent header of the bitstream, when the information indicating whether or not affine inter prediction is used indicates true, the DCMVD indication information is present in the bitstream. According to the second exemplary embodiment of the present invention, when the information indicating whether or not affine inter prediction is used indicates false, the DCMVD indication information is not present within the bitstream, and in this case, the current block is predicted using the conventional inter prediction.
According to the second exemplary embodiment of the present invention, regarding the parent header, the parent header including the information indicating whether or not affine inter prediction is used may be transmitted by being included in a block header, a slice header, a tile header, a picture header, or a sequence header.
To derive motion vectors of three control points at the top left, the top right, and the bottom left of the current block 12a-2, three reconstructed pixel regions a 12a-3, b 12a-4, and c 12a-5 are selected. However, without being limited thereto, to derive a motion vector of one or two control points among the three control points, one or two reconstructed pixel regions may be selected.
The image decoding device 400 may determine, on the basis of the DCMVD indication information, which inter prediction is to be performed. When the DCMVD indication information indicates use of affine inter prediction according to the present invention at step S171, the motion vectors of the control points at the top left, the top right, and the bottom left of the current block are estimated and selected using the respective reconstructed pixel regions at step S172.
Afterward, the motion vector obtained by decoding the transmitted motion information in the bitstream is set to be the motion vector of the control point at the bottom right at step S173. Using affine transformation in which the motion vectors of the four control points derived through steps S172 and S173 are used, an inter prediction block of the current block is generated at step S174. When affine inter prediction is not used, the prediction block of the current block is generated at step S175 according to the conventional inter prediction in which the motion information is decoded and the decoded motion information is used.
Third Exemplary Embodiment
Due to correlation between pixels, the pixels within the reconstructed pixel region C 503 are likely to be similar to the pixels included in the region A 500-a, but are unlikely to be similar to the pixels included in the region B 500-b. Therefore, in inter prediction on the region A 500-a, motion estimation and motion compensation using the reconstructed pixel region C 503 are performed to find accurate motion while preventing an increase in overhead. In the meantime, as the inter prediction method for the region B 500-b, the conventional inter prediction may be applied.
Inter prediction according to the embodiment may be performed by the inter prediction module 103 of the image encoding device 100 or by the inter prediction module 408 of the image decoding device 400. The reference images used in inter prediction are stored in the memory 112 of the image encoding device 100 or the memory 406 of the image decoding device 400. The inter prediction module 103 or the inter prediction module 408 may generate, with reference to the reference image stored in the memory 112 or the memory 406, the prediction block of the region A 500-a and the prediction block of the region B 500-b within the current block.
First, as shown in
Next, using different inter prediction methods, a prediction block of the first region and a prediction block of the second region are obtained at step S53. Here, the inter prediction method for the region A 500-a may be, as described above, the method in which motion estimation and motion compensation using the reconstructed pixel region C 503 are performed. As the inter prediction method for the region B 500-b, the conventional inter prediction may be applied.
As in the embodiment, a method in which the current block is partitioned into multiple regions and the prediction blocks of the respective regions are derived using different inter prediction methods is referred to as a mixed inter prediction.
Referring to
That is, the motion vector 605 estimated using the reconstructed pixel region C 503 is selected as the motion vector of the region A 500-a of the current block. Using the motion vector 605, the prediction block of the region A 500-a is generated.
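The assembly of a mixed inter prediction block may be sketched as follows. The sketch assumes a vertical split in which the region A occupies the leftmost `a_width` columns of the current block and the region B the remaining columns; the actual partition shape is not fixed by this assumption. `mv_a` stands for the vector estimated from the reconstructed pixel region and `mv_b` for a conventionally signaled vector. All names are hypothetical, and images are lists of rows.

```python
def fetch(ref, x0, y0, w, h, mv):
    """Copy a w x h patch from the reference, displaced by mv = (dx, dy)."""
    dx, dy = mv
    return [ref[y0 + dy + j][x0 + dx:x0 + dx + w] for j in range(h)]


def mixed_prediction(ref, bx, by, w, h, a_width, mv_a, mv_b):
    """Predict region A with mv_a and region B with mv_b, then join them."""
    part_a = fetch(ref, bx, by, a_width, h, mv_a)
    part_b = fetch(ref, bx + a_width, by, w - a_width, h, mv_b)
    return [row_a + row_b for row_a, row_b in zip(part_a, part_b)]
```

Only `mv_b` would need to be conveyed in the bitstream in this arrangement, since `mv_a` can be re-derived by the decoding device.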
In the meantime, as shown in
According to the embodiment of the present invention, the decoding device 400 also performs motion estimation, so that if motion estimation is performed on the entire reference image, the complexity may increase greatly. Therefore, by transmitting the search range on a per-block basis or in the parent header, or by fixing the search range to be the same in the encoding device 100 and in the decoding device 400, the computational complexity of the decoding device 400 may be reduced.
In the meantime, when estimating and encoding the motion vector of the region B 500-b shown in
Alternatively, it is possible that the motion vector 605 estimated as the motion vector of the region A 500-a is used to predict the motion vector of the region B 500-b and the residual vector is encoded.
Alternatively, it is possible that the motion vector of the decoded block within the reconstructed pixel region C 503 and the estimated motion vector 605 of the region A 500-a are used to constitute a motion vector prediction set, the motion vector of the region B 500-b is predicted, and the residual vector is encoded.
Alternatively, it is possible that among the blocks adjacent to the current block, motion information is taken from a preset position to perform block merging. Here, block merging means that neighboring motion information is applied as-is to a block to be encoded. Here, it is also possible that after setting several preset positions, an index indicating at which position block merging is performed is used.
Further, the size of the region B 500-b may be encoded by the encoding device 100 and transmitted to the decoding device 400 on a per-block basis or through the parent header, or the same preset value or ratio may be used in the encoding device 100 and the decoding device 400.
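The alternative in which a motion vector prediction set is formed and a residual vector is encoded may be sketched as follows. The selection rule used here, choosing the candidate with the smallest sum of absolute component differences, and the explicit signaling of a candidate index are assumptions made for illustration; the function names are hypothetical.

```python
def encode_mv_b(mv_b, candidates):
    """Pick a predictor from the set and return (index, residual vector)."""
    costs = [abs(mv_b[0] - c[0]) + abs(mv_b[1] - c[1]) for c in candidates]
    idx = costs.index(min(costs))              # first best candidate
    pred = candidates[idx]
    return idx, (mv_b[0] - pred[0], mv_b[1] - pred[1])


def decode_mv_b(idx, residual, candidates):
    """Rebuild the region B motion vector from predictor plus residual."""
    pred = candidates[idx]
    return (pred[0] + residual[0], pred[1] + residual[1])
```

Here `candidates` would hold, for example, the motion vector of a decoded block within the reconstructed pixel region and the motion vector estimated for the region A, in an order known to both devices.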
Referring to
Afterward, cost_A is compared with cost_B to determine which method is optimum to use, at step S803. When cost_A is lower, it is set to perform the inter prediction using the conventional method at step S804. Otherwise, it is set to perform the mixed inter prediction at step S805.
Information indicating which type of inter prediction has been used for the block to be currently encoded is encoded at step S901. This information may be, for example, a 1-bit flag or one of several indexes. Afterward, the motion information is encoded at step S902, and the algorithm ends.
Alternatively, the information indicating whether or not the mixed inter prediction according to the embodiment of the present invention is used may be generated in the parent header first and then may be encoded. That is, when in the parent header, the information indicating whether or not the mixed inter prediction is used indicates true, the information indicating which type of inter prediction has been used for the block to be currently encoded is encoded. When the information indicating whether or not the mixed inter prediction is used indicates false, the information indicating which type of inter prediction has been used is not present within the bitstream, and in this case, the current block is not partitioned into multiple regions and the current block is predicted using the conventional inter prediction.
In the meantime, regarding the parent header, the parent header including the information indicating whether or not the mixed inter prediction is used may be transmitted by being included in a block header, a slice header, a tile header, a picture header, or a sequence header.
The decoding device 400 decodes the information indicating which type of inter prediction has been used for the block to be currently encoded, at step S1001, decodes the motion information at step S1002, and ends the algorithm.
In the case where the information indicating whether or not the mixed inter prediction is used is present in the parent header of the bitstream, when the information indicating whether or not the mixed inter prediction is used indicates true, the information indicating which type of inter prediction has been used for the block to be currently encoded is present in the bitstream. When the information indicating whether or not the mixed inter prediction is used indicates false, the information indicating which type of inter prediction has been used is not present within the bitstream, and in this case, the current block is not partitioned into multiple regions and the current block is predicted using the conventional inter prediction.
The parent header including the information indicating whether or not the mixed inter prediction is used may be a block header, a slice header, a tile header, a picture header, or a sequence header.
First, it is determined at step S1101 whether or not the information indicating which type of inter prediction has been used indicates use of the mixed inter prediction. When the mixed inter prediction is used for the current block to be decoded, the current block is partitioned into multiple regions at step S1102. For example, the current block may be partitioned into the region A 500-a and the region B 500-b as shown in
Here, the size of each region resulting from the partitioning may be signaled from the encoding device 100 to the decoding device 400 on a per-block basis or through the parent header, or may be set to a preset value.
Afterward, according to the method shown in
Next, regarding a second region, for example, the region B 500-b, the decoded motion vector is used to generate a prediction block at step S1104, and the algorithm ends.
When the information indicating which type of inter prediction has been used indicates that the mixed inter prediction is not used or when the information, included in the parent header, indicating whether or not the mixed inter prediction is used indicates false, the conventional inter prediction is applied as the prediction method of the current block 500. That is, the decoded motion information is used to generate the prediction block of the current block 500 at step S1105, and the algorithm ends. The size of the prediction block is the same as the size of the current block 500 to be decoded.
Fourth Exemplary Embodiment
Hereinafter, the fourth exemplary embodiment of the present invention will be described with reference to the drawings. The fourth exemplary embodiment relates to a method to reduce blocking artifacts that may occur at the boundary of the block when the mixed inter prediction according to the third exemplary embodiment is performed.
To summarize the fourth exemplary embodiment of the present invention, first, the regions positioned at the boundaries of the prediction block are partitioned into sub-blocks of a predetermined size. Afterward, the motion information of a sub-block neighboring the sub-block of the prediction block is applied to the sub-block of the prediction block so that a new prediction block is generated. Afterward, a weighted sum of the sub-block of the prediction block and the new prediction block is obtained so that the final sub-block of the prediction block is generated. This is referred to as overlapped block motion compensation (OBMC).
Referring to
For convenience of description, it is assumed that the size of each sub-block shown in
For convenience of description, the horizontal and vertical lengths of each sub-block are assumed to be four, but other values may be encoded on a per-block basis or through the parent header and then signaled to the decoding device 400. Accordingly, the encoding device 100 and the decoding device 400 may set the size of the sub-block to be the same. Alternatively, the encoding device 100 and the decoding device 400 may use sub-blocks of the same preset size.
Referring to
c=W1×a+(1−W1)×b [Equation 5]
In addition to the prediction pixel c, the remaining 15 pixels may be computed in a manner similar to the above. P2 to P8 in
Referring to
In the case of the sub-block P1 in
Also in the case of the sub-blocks P16 to P22 present in the prediction block 2 shown in
In the meantime, not only the pixel values of the sub-blocks P16 to P22 but also the pixel values of the neighboring sub-blocks C1 to C4 and D1 to D4 may be replaced by new values through the weighted sum calculation. For example, in the case of the sub-block C2, the motion information of the sub-block P17 is applied to the sub-block C2 to generate a prediction sub-block, and then the pixel values within the prediction sub-block and the pixel values of the sub-block C2 are subjected to the weighted sum so that the weighted pixel values of the sub-block C2 are generated.
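The weighted sum of Equation 5 may be illustrated with a minimal Python sketch. It assumes that a and b are two co-located prediction pixels, one from the sub-block of the prediction block and one from the sub-block generated with the neighboring motion information, and that each pixel position may carry its own weight. The function names, the 4x4 sample values, and the weight table are hypothetical and are not taken from the present description.

```python
def blend_pixel(a, b, w1):
    """Equation 5: c = W1*a + (1 - W1)*b for one pair of co-located pixels."""
    return w1 * a + (1 - w1) * b


def blend_subblock(block_a, block_b, weights):
    """Blend two equally sized sub-blocks pixel by pixel.

    weights[y][x] plays the role of W1 for the pixel at column x, row y.
    The result is rounded to the nearest integer pixel value.
    """
    return [
        [round(blend_pixel(a, b, w)) for a, b, w in zip(row_a, row_b, row_w)]
        for row_a, row_b, row_w in zip(block_a, block_b, weights)
    ]


# Hypothetical 4x4 data: every pixel of block_a is 100, every pixel of block_b is 60.
block_a = [[100] * 4 for _ in range(4)]
block_b = [[60] * 4 for _ in range(4)]
# Hypothetical weights that change from row to row.
weights = [[0.75] * 4, [0.5] * 4, [0.25] * 4, [0.0] * 4]

blended = blend_subblock(block_a, block_b, weights)
for row in blended:
    print(row)
```

With these sample values, a weight of 0.75 yields 0.75×100+0.25×60=90, and a weight of 0 leaves the pixel of block_b unchanged.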
The variable BEST_COST storing the optimum cost is initialized to the maximum value, COMBINE_MODE storing whether or not the mixed inter prediction is used is initialized to false, and WEIGHTED_SUM storing whether or not the weighted sum is used between sub-blocks is initialized to false at step S1501. Afterward, inter prediction using the conventional method is performed, and then cost_A is computed at step S1502. The mixed inter prediction is performed, and then cost_B is computed at step S1503. After comparing the two costs at step S1504, when the value of cost_A is lower, COMBINE_MODE is set to false to indicate that the mixed inter prediction is not used and BEST_COST stores cost_A at step S1505.
When the value of cost_B is lower, COMBINE_MODE is set to true to indicate that the mixed inter prediction is used and BEST_COST stores cost_B at step S1506. Afterward, the weighted sum is applied between the sub-blocks and cost_C is computed at step S1507. After comparing BEST_COST with cost_C at step S1508, when BEST_COST is lower than cost_C, the variable WEIGHTED_SUM is set to false to indicate that the weighted sum is not applied between the sub-blocks at step S1509. Otherwise, the variable WEIGHTED_SUM is set to true to indicate that the weighted sum is applied between the sub-blocks at step S1510 and the algorithm ends.
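The decision flow of steps S1501 to S1510 may be sketched as follows. This is only an illustration under stated assumptions: the three costs are taken as already computed numbers, a tie between cost_A and cost_B is resolved in favor of the mixed inter prediction (the description does not state the tie rule), and the weighted-sum comparison is assumed to follow either branch. The function name is hypothetical.

```python
import math


def decide_mixed_and_weighted(cost_a, cost_b, cost_c):
    """Return (COMBINE_MODE, WEIGHTED_SUM) for precomputed costs."""
    # S1501: initialise the variables.
    best_cost = math.inf
    combine_mode = False
    weighted_sum = False

    # S1502 / S1503: cost_a (conventional) and cost_b (mixed) are given.
    # S1504: compare the two costs.
    if cost_a < cost_b:
        combine_mode = False      # S1505
        best_cost = cost_a
    else:
        combine_mode = True       # S1506 (a tie lands here by assumption)
        best_cost = cost_b

    # S1507: cost_c is the cost with the weighted sum between sub-blocks.
    # S1508: compare BEST_COST with cost_c.
    if best_cost < cost_c:
        weighted_sum = False      # S1509
    else:
        weighted_sum = True       # S1510
    return combine_mode, weighted_sum


print(decide_mixed_and_weighted(10.0, 8.0, 7.0))
print(decide_mixed_and_weighted(5.0, 8.0, 9.0))
```

In the first call the mixed inter prediction and the weighted sum are both selected; in the second call neither is selected.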
When the information indicating whether or not the mixed inter prediction is used is present in the parent header of the bitstream, and when the information indicating whether or not the mixed inter prediction is used indicates true, the information indicating whether or not the weighted sum is applied between the sub-blocks is encoded and then included in the bitstream. However, when the information, included in the parent header, indicating whether or not the mixed inter prediction is used indicates false, the information indicating whether or not the weighted sum is applied between the sub-blocks is not present within the bitstream.
When the information indicating whether or not the mixed inter prediction is used is present in the parent header of the bitstream, and when the information indicating whether or not the mixed inter prediction is used indicates true, the information indicating whether or not the weighted sum is applied between the sub-blocks is encoded and then included in the bitstream.
However, when the information, included in the parent header, indicating whether or not the mixed inter prediction is used indicates false, the information indicating whether or not the weighted sum is applied between the sub-blocks is not present within the bitstream. In this case, it may be inferred that the information indicating whether or not the weighted sum is applied between the sub-blocks indicates that the weighted sum is not applied between the sub-blocks.
Fifth Exemplary Embodiment
Before encoding or decoding of the current block, the reconstructed pixel region C 251 neighbors the current block 252, and thus the image encoding device 100 and the image decoding device 400 may use the same reconstructed pixel region C 251. Therefore, without encoding the motion information of the current block 252 by the image encoding device 100, the reconstructed pixel region C 251 is used such that the image encoding device 100 and the image decoding device 400 may generate the motion information of the current block 252 and the prediction block in the same manner.
The sub-blocks A to D may be in an arbitrary size. MV_A to MV_D shown in
The size of each sub-block may be encoded on a per-block basis or through the parent header and may be transmitted to the decoding device 400. Alternatively, it is possible that the encoding device 100 and the decoding device 400 use the same preset size value of the sub-block.
In the meantime, as shown in
Here, for convenience of description, the description is given assuming that the reconstructed pixel region C 251 as shown in
As the reconstructed pixel regions for the sub-block A 281, the sub-blocks a 285 and c 287 may be used. As the reconstructed pixel regions for the sub-block B 282, the sub-blocks b 286 and c 287 may be used. As the reconstructed pixel regions for the sub-block C 283, the sub-blocks a 285 and d 288 may be used. As the reconstructed pixel regions for the sub-block D 284, the sub-blocks b 286 and d 288 may be used.
According to the embodiment of the present invention, the reconstructed neighboring pixel region used for prediction of the current block may be partitioned on the basis of a partitioning structure of reconstructed neighboring blocks. In other words, on the basis of at least one among the number of the reconstructed neighboring blocks, the sizes of the reconstructed neighboring blocks, the shapes of the reconstructed neighboring blocks, and the boundaries between the reconstructed neighboring blocks, the reconstructed pixel region may be partitioned.
Referring to
Specifically, the number of reconstructed neighboring blocks may be considered in partitioning of the reconstructed pixel region. Referring to
Alternatively, the sizes of the reconstructed neighboring blocks may be considered in partitioning of the reconstructed pixel region. For example, the height of the sub-block c of the reconstructed pixel region at the left side of the current block 2100 is the same as that of the reconstructed block 3 2103. The height of the sub-block d is the same as that of the reconstructed block 4 2104. The height of the sub-block e corresponds to a value obtained by subtracting the height of the sub-block c and the height of the sub-block d from the height of the current block 2100.
Alternatively, the boundaries between the reconstructed neighboring blocks may be considered in partitioning of the reconstructed pixel region. Considering the boundary between the reconstructed block 1 2101 and the reconstructed block 2 2102 at the upper side of the current block 2100, the reconstructed pixel region at the upper side of the current block 2100 is partitioned into two sub-blocks, the sub-blocks a and b. Considering the boundary between the reconstructed block 3 2103 and the reconstructed block 4 2104 and the boundary between the reconstructed block 4 2104 and the reconstructed block 5 2105 at the left side of the current block 2100, the reconstructed pixel region at the left side of the current block 2100 is partitioned into three sub-blocks, the sub-blocks c to e.
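The size-based and boundary-based partitioning described above may be sketched as follows for one side of the current block. The sketch assumes that the reconstructed neighboring blocks are listed in order, starting aligned with the corner of the current block, and that a neighboring block extending beyond the current block is clipped so that the last sub-block takes the remaining length. The function name and all sizes are hypothetical sample values.

```python
def partition_side(current_len, neighbor_lens):
    """Split one side of the reconstructed pixel region at neighbor boundaries.

    current_len   : height (left side) or width (upper side) of the current block
    neighbor_lens : lengths of the reconstructed neighboring blocks along that side
    Returns the lengths of the resulting sub-blocks of the reconstructed pixel region.
    """
    parts = []
    remaining = current_len
    for length in neighbor_lens:
        if remaining <= 0:
            break
        take = min(length, remaining)   # clip at the end of the current block
        parts.append(take)
        remaining -= take
    if remaining > 0:
        parts.append(remaining)         # uncovered remainder forms one more sub-block
    return parts


# Left side: current block height 16, reconstructed blocks 3, 4, 5 with
# hypothetical heights 4, 8, 16 give sub-blocks c, d, e.
print(partition_side(16, [4, 8, 16]))
# Upper side: current block width 16, reconstructed blocks 1 and 2 with
# hypothetical widths 8 and 8 give sub-blocks a and b.
print(partition_side(16, [8, 8]))
```

For the left side the height of the sub-block e comes out as 16−4−8=4, which matches the subtraction rule stated for the sub-block e.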
In the meantime, there may be various conditions with respect to which region of the sub-blocks a to e is used to perform motion estimation. For example, it is possible that motion estimation is performed using only one reconstructed pixel region having the largest area, or it is possible that m reconstructed pixel regions from the top and n reconstructed pixel regions from the left side are selected according to the priority and used for motion estimation. Alternatively, it is also possible that a filter such as a low-pass filter is applied between the sub-blocks a to e to relieve the dramatic difference in pixel values and then one reconstructed pixel region 251 as shown in
The method of partitioning the current block shown in
The current block shown in
Alternatively, it is possible that priority is set depending on the sizes of the sub-blocks and the reconstructed pixel regions. For example, in the case of the sub-block A shown in
Referring to
Next, multiple sub-blocks within the current block are encoded or decoded at step S2203. According to the embodiment of the present invention, as described above, each of the sub-blocks A to F of the current block shown in
The method shown in
Referring to
Next, using at least one sub-block included in the reconstructed pixel region, at least one among the multiple sub-blocks within the current block is encoded or decoded at step S2213. For example, as described above referring to
The method shown in
First, two variables used in this method, the DMVD indication information and SUB_BLOCK, will be described. The decoder-side motion vector derivation (DMVD) indication information is information indicating whether the inter prediction using the conventional method is performed or the above-described inter prediction using the reconstructed pixel region according to the present invention is performed. When the DMVD indication information indicates false, it indicates that the inter prediction using the conventional method is performed. When the DMVD indication information indicates true, it indicates that the inter prediction using the reconstructed pixel region according to the present invention is performed.
The variable SUB_BLOCK indicates whether or not the current block is partitioned into sub-blocks. When the value of SUB_BLOCK indicates false, it indicates that the current block is not partitioned into sub-blocks. Conversely, when the value of SUB_BLOCK indicates true, it indicates that the current block is partitioned into sub-blocks.
Referring to
Afterward, SUB_BLOCK is set to true and inter prediction is performed, and then cost_2 is computed at step S2302. Next, the DMVD indication information is set to true and SUB_BLOCK is set to false, and then inter prediction is performed and cost_3 is computed at step S2303. Last, the DMVD indication information and SUB_BLOCK are set to true, and then inter prediction is performed and cost_4 is computed at step S2304. The calculated cost_1 to cost_4 are compared with each other, and then the optimum inter prediction method is determined. The DMVD indication information and the SUB_BLOCK information related to the determined optimum inter prediction method are stored, and then the algorithm ends.
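The comparison of cost_1 to cost_4 may be sketched as an exhaustive search over the two flags. The sketch assumes that cost_1 corresponds to both the DMVD indication information and SUB_BLOCK being false, which is inferred from the settings given for cost_2 to cost_4. The cost function, the cost table, and the function name are hypothetical; an actual encoder would compute each cost by performing the corresponding inter prediction.

```python
def select_inter_mode(cost_fn):
    """Evaluate the four (DMVD, SUB_BLOCK) settings and keep the cheapest one."""
    candidates = [
        (False, False),  # cost_1 (setting assumed, see lead-in)
        (False, True),   # cost_2, step S2302
        (True, False),   # cost_3, step S2303
        (True, True),    # cost_4, step S2304
    ]
    costs = {flags: cost_fn(*flags) for flags in candidates}
    # On a tie, min() keeps the earlier candidate in the list.
    best_flags = min(candidates, key=costs.__getitem__)
    return best_flags, costs


# Hypothetical cost table standing in for real rate-distortion measurements.
hypothetical_costs = {
    (False, False): 120.0,
    (False, True): 95.0,
    (True, False): 101.5,
    (True, True): 88.25,
}

best_flags, costs = select_inter_mode(
    lambda dmvd, sub_block: hypothetical_costs[(dmvd, sub_block)]
)
print("DMVD =", best_flags[0], "SUB_BLOCK =", best_flags[1])
```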
In
After step S2401, SUB_BLOCK, the information indicating whether or not the current block is partitioned into sub-blocks, is encoded at step S2402. Whether or not the current block is partitioned into the sub-blocks is determined at step S2403, and when the partitioning into the sub-blocks is not performed, the value of the variable BLOCK_NUM is changed into one at step S2404.
Afterward, the DMVD indication information indicating whether or not the inter prediction using the reconstructed pixel region has been used is encoded at step S2405. Whether or not the inter prediction using the reconstructed pixel region has been used is determined at step S2406, and when the inter prediction using the reconstructed pixel region has not been used, the motion information is encoded at step S2407. Conversely, when the inter prediction using the reconstructed pixel region has been used, the value of BLOCK_INDEX is increased at step S2408 and is compared with the variable BLOCK_NUM at step S2409. When the value of BLOCK_INDEX is the same as the value of BLOCK_NUM, no sub-block remains to be encoded in the current block, so that the algorithm ends. When the two values differ, the process proceeds to the next sub-block to be encoded within the current block and repeats from step S2406.
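The encoding flow of steps S2402 to S2409 may be sketched as a function that emits a list of syntax symbols. The sketch assumes that BLOCK_INDEX starts at zero and BLOCK_NUM starts at the number of sub-blocks, and that step S2408 is also reached after the motion information is encoded at step S2407. The symbol names, the function name, and the sample motion values are hypothetical, and no actual entropy coding is performed.

```python
def encode_block_syntax(sub_block, dmvd, motion_infos, total_sub_blocks):
    """Emit (name, value) symbols in the order of steps S2402 to S2409."""
    symbols = []
    block_index = 0                 # assumed initial value
    block_num = total_sub_blocks    # assumed initial value

    symbols.append(("SUB_BLOCK", sub_block))        # S2402
    if not sub_block:                               # S2403
        block_num = 1                               # S2404
    symbols.append(("DMVD", dmvd))                  # S2405

    while True:
        if not dmvd:                                # S2406
            # S2407: motion information is sent only without DMVD.
            symbols.append(("MOTION", motion_infos[block_index]))
        block_index += 1                            # S2408
        if block_index == block_num:                # S2409
            break
    return symbols


# Four sub-blocks, conventional inter prediction: four motion symbols follow.
print(encode_block_syntax(True, False, ["mv0", "mv1", "mv2", "mv3"], 4))
# No partitioning, DMVD used: only the two flags are emitted.
print(encode_block_syntax(False, True, [], 4))
```

The second call shows the intended saving: when the motion vector is derived from the reconstructed pixel region, no motion information enters the symbol list.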
After step S2501, SUB_BLOCK, the information indicating whether or not the current block is partitioned into sub-blocks, is decoded at step S2502. Whether or not the current block is partitioned into the sub-blocks is determined at step S2503, and when the partitioning into the sub-blocks is not performed, the value of the variable BLOCK_NUM is changed into one at step S2504.
Afterward, the DMVD indication information indicating whether or not the inter prediction using the reconstructed pixel region has been used is decoded at step S2505. Whether or not the inter prediction using the reconstructed pixel region has been used is determined at step S2506, and when the inter prediction using the reconstructed pixel region has not been used, the motion information is decoded at step S2507. Conversely, when the inter prediction using the reconstructed pixel region has been used, the value of BLOCK_INDEX is increased at step S2508 and is compared with the variable BLOCK_NUM at step S2509. When the value of BLOCK_INDEX is the same as the value of BLOCK_NUM, no sub-block remains to be decoded in the current block, so that the algorithm ends. When the two values differ, the process proceeds to the next sub-block to be decoded within the current block and repeats from step S2506.
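The decoding flow may be sketched as a parser that reads symbols in the order in which the encoder is described as writing them. The sketch assumes that BLOCK_INDEX starts at zero, that BLOCK_NUM starts at the number of sub-blocks, and that the index is also increased after the motion information is decoded at step S2507. The flat symbol list, the function name, and the use of None as a placeholder for a motion vector derived at the decoder are hypothetical.

```python
def decode_block_syntax(symbols, total_sub_blocks):
    """Parse SUB_BLOCK, DMVD and per-sub-block motion information."""
    reader = iter(symbols)
    block_index = 0                 # assumed initial value
    block_num = total_sub_blocks    # assumed initial value

    sub_block = next(reader)        # S2502
    if not sub_block:               # partitioning check
        block_num = 1               # single block when not partitioned
    dmvd = next(reader)             # S2505

    motion = []
    while True:
        if not dmvd:                # S2506
            motion.append(next(reader))   # S2507: explicit motion information
        else:
            motion.append(None)     # to be derived from the reconstructed pixel region
        block_index += 1            # S2508
        if block_index == block_num:      # S2509
            break
    return sub_block, dmvd, motion


# Two sub-blocks with explicitly signaled motion information.
print(decode_block_syntax([True, False, "mv0", "mv1"], 2))
# Unpartitioned block whose motion vector is derived at the decoder.
print(decode_block_syntax([False, True], 4))
```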
Sixth Exemplary Embodiment
Hereinafter, the sixth exemplary embodiment of the present invention will be described with reference to the drawings.
As shown in
Here, the sub-blocks F, G, H, and I are spaced apart from the reconstructed pixel region rather than being in contact therewith, so that inter prediction using the reconstructed pixel region may be inaccurate. Therefore, in the case of the sub-blocks F, G, H, and I, the conventional inter prediction is performed, and only in the case of the sub-blocks A to E, inter prediction using the reconstructed pixel region may be used.
When inter prediction using reconstructed pixel region is performed on the sub-blocks A to E, inter prediction is performed using the reconstructed pixel region adjacent to each sub-block. For example, inter prediction may be performed using reconstructed pixel region b for the sub-block B, using reconstructed pixel region c for the sub-block C, using reconstructed pixel region e for the sub-block D, and using reconstructed pixel region f for the sub-block E. In the case of the sub-block A, according to preset priority, inter prediction may be performed using either the reconstructed pixel region a or d, or using the reconstructed pixel regions a and d.
Alternatively, when inter prediction using the reconstructed pixel region is performed on the sub-blocks A to E, an index indicating which reconstructed pixel region is used for each sub-block may be encoded. For example, among the reconstructed pixel regions a to f, the reconstructed pixel region b may be used to perform inter prediction on the sub-block A. In the case of the sub-block E, inter prediction may be performed using the reconstructed pixel region c. In this case, the priority is determined and indexes are assigned according to the horizontal or vertical size of each of the reconstructed pixel regions a to f, the number and positions of pixels in each region, and the like.
In the case of the sub-blocks F to I, encoding or decoding may be possible by performing the conventional inter prediction. Alternatively, as shown in
Further, the following describes, as an example, a case in which the index indicating which sub-reconstructed region is used for inter prediction using the reconstructed pixel region is encoded. In this example, the sub-block F is encoded or decoded by performing the conventional inter prediction, and it is assumed that, among the sub-blocks within the current block, the sub-block F is encoded or decoded last.
Referring to
After step S2801, SUB_BLOCK, the information indicating whether or not the current block is partitioned into sub-blocks, is encoded at step S2802. Whether or not the current block is partitioned into the sub-blocks is determined at step S2803, and when the partitioning into the sub-blocks is not performed, the value of the variable BLOCK_NUM is changed into one at step S2804.
Step S2805, at which the value of BLOCK_INDEX is compared with the value of BLOCK_NUM−1, is the step of determining whether the conventional inter prediction is used for the block or the inter prediction using the reconstructed pixel region is used for the block. When the two values are the same, it is the last block, namely, the sub-block subjected to the conventional inter prediction, so that the motion information is encoded at step S2806. Otherwise, it is the sub-block subjected to the inter prediction using the reconstructed pixel region, so that the index indicating which sub-reconstructed region is used is encoded at step S2807. Alternatively, it is possible that this step is skipped and the same reconstructed region determined in the encoding device and the decoding device is used.
Afterward, the index of the sub-block is increased at step S2808, and BLOCK_NUM is compared with BLOCK_INDEX at step S2809 to determine whether or not encoding of all the sub-blocks present in the current block is completed. If not, the process returns to step S2805 and continues.
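The encoding flow of steps S2802 to S2809 may be sketched as follows. The sketch assumes that BLOCK_INDEX starts at zero and BLOCK_NUM starts at the number of sub-blocks, and it follows the example in which only the last sub-block uses the conventional inter prediction. The symbol names, the function name, and the sample values are hypothetical, and the optional variant in which step S2807 is skipped is not modeled.

```python
def encode_mixed_syntax(sub_block, region_indexes, last_motion, total_sub_blocks):
    """Emit (name, value) symbols in the order of steps S2802 to S2809."""
    symbols = []
    block_index = 0                 # assumed initial value
    block_num = total_sub_blocks    # assumed initial value

    symbols.append(("SUB_BLOCK", sub_block))        # S2802
    if not sub_block:                               # S2803
        block_num = 1                               # S2804

    while True:
        if block_index == block_num - 1:            # S2805: last sub-block
            # S2806: conventional inter prediction, motion information is sent.
            symbols.append(("MOTION", last_motion))
        else:
            # S2807: index of the sub-reconstructed region used for this sub-block.
            symbols.append(("REGION_INDEX", region_indexes[block_index]))
        block_index += 1                            # S2808
        if block_index == block_num:                # S2809
            break
    return symbols


# Four sub-blocks: three region indexes, then motion information for the last one.
print(encode_mixed_syntax(True, [1, 2, 0], "mvF", 4))
# No partitioning: the single block is the last block, so only motion information is sent.
print(encode_mixed_syntax(False, [], "mv", 4))
```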
After step S2901, SUB_BLOCK, the information indicating whether or not the current block is partitioned into sub-blocks, is decoded at step S2902. Whether or not the current block is partitioned into the sub-blocks is determined at step S2903, and when the partitioning into the sub-blocks is not performed, the value of the variable BLOCK_NUM is changed into one at step S2904.
Step S2905, at which the value of BLOCK_INDEX is compared with the value of BLOCK_NUM−1, is the step of determining whether the conventional inter prediction is used for the block or the inter prediction using the reconstructed pixel region is used for the block. When the two values are the same, it is the last block, namely, the sub-block subjected to the conventional inter prediction, so that the motion information is decoded at step S2906. Otherwise, it is the sub-block subjected to the inter prediction using the reconstructed pixel region, so that the index indicating which sub-reconstructed region is used is decoded at step S2907. Alternatively, it is possible that this step is skipped and the same reconstructed region determined in the encoding device and the decoding device is used. Afterward, the index of the sub-block is increased at step S2908, and BLOCK_NUM is compared with BLOCK_INDEX at step S2909 to determine whether or not decoding of all the sub-blocks present in the current block is completed. If not, the process returns to step S2905 and continues.
Although the exemplary methods described in the present invention are represented as a series of operations for clarity of description, the order of the steps is not limited thereto. When necessary, the steps may be performed simultaneously or in a different order. In order to realize the method according to the present invention, other steps may be added to the illustrative steps, some steps may be excluded from the illustrative steps, or some steps may be excluded while additional steps may be included.
The various embodiments of the present invention are not intended to list all possible combinations, but to illustrate representative aspects of the present invention. The matters described in the various embodiments may be applied independently or in a combination of two or more.
Further, the various embodiments of the present invention may be implemented by hardware, firmware, software, or combinations thereof. In the case of implementation by hardware, implementation is possible by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), general processors, controllers, micro controllers, microprocessors, or the like.
The scope of the present invention includes software or machine-executable instructions (for example, an operating system, an application, firmware, a program, or the like) that cause operation according to the methods of the various embodiments to be performed on a device or a computer, and includes a non-transitory computer-readable medium storing such software or instructions to execute on a device or a computer.
INDUSTRIAL APPLICABILITY
The present invention is applicable to a field of encoding or decoding an image signal.
Claims
1. An image decoding method comprising:
- selecting a reconstructed pixel region within an image to which a current block to be decoded belongs;
- deriving, on the basis of the reconstructed pixel region and a reference image of the current block, a motion vector of the reconstructed pixel region; and
- selecting the derived motion vector as a motion vector of the current block.
2. The image decoding method of claim 1, wherein the deriving of the motion vector of the reconstructed pixel region includes:
- determining a region corresponding to the reconstructed pixel region, within the reference image; and
- deriving, on the basis of a position of the determined region corresponding to the reconstructed pixel region, the motion vector of the reconstructed pixel region.
3. The image decoding method of claim 1, further comprising:
- decoding decoder-side motion vector derivation indication information,
- wherein on the basis of the decoder-side motion vector derivation indication information, the motion vector of the reconstructed pixel region is derived.
4. An image encoding method comprising:
- selecting a reconstructed pixel region within an image to which a current block to be encoded belongs;
- deriving, on the basis of the reconstructed pixel region and a reference image of the current block, a motion vector of the reconstructed pixel region; and
- selecting the derived motion vector as a motion vector of the current block.
5. The image encoding method of claim 4, wherein the deriving of the motion vector of the reconstructed pixel region includes:
- determining a region corresponding to the reconstructed pixel region, within the reference image; and
- deriving, on the basis of a position of the determined region corresponding to the reconstructed pixel region, the motion vector of the reconstructed pixel region.
6. The image encoding method of claim 5, further comprising:
- encoding decoder-side motion vector derivation indication information,
- wherein the decoder-side motion vector derivation indication information indicates whether or not the derived motion vector of the reconstructed pixel region is selected as the motion vector of the current block.
7. An image decoding method comprising:
- selecting at least one reconstructed pixel region within an image to which a current block to be decoded using affine inter prediction belongs;
- deriving, on the basis of the at least one reconstructed pixel region and a reference image of the current block, a motion vector of the at least one reconstructed pixel region; and
- selecting the derived motion vector of the at least one reconstructed pixel region as a motion vector of at least one control point of the current block.
8. The image decoding method of claim 7, wherein the at least one reconstructed pixel region is a region adjacent to the at least one control point of the current block.
9. The image decoding method of claim 7, wherein the deriving of the motion vector of the at least one reconstructed pixel region includes:
- determining a region corresponding to the at least one reconstructed pixel region, within the reference image; and
- deriving, on the basis of a position of the determined region corresponding to the at least one reconstructed pixel region, the motion vector of the at least one reconstructed pixel region.
10. An image decoding method comprising:
- partitioning a current block to be decoded into multiple regions including a first region and a second region;
- obtaining a prediction block of the first region; and
- obtaining a prediction block of the second region,
- wherein the prediction block of the first region and the prediction block of the second region are obtained by different inter prediction methods.
11. The image decoding method of claim 10, wherein the first region is a region adjacent to a reconstructed image region within an image to which the current block belongs, and the second region is a region that is not in contact with the reconstructed image region within the image to which the current block belongs.
12. The image decoding method of claim 10, further comprising:
- decoding information that indicates which type of inter prediction is used,
- wherein when the information indicates to derive the prediction block of the first region and the prediction block of the second region by using the different inter prediction methods, the prediction block of the first region and the prediction block of the second region are derived using the different inter prediction methods.
13. An image encoding method comprising:
- partitioning a current block to be encoded into multiple regions including a first region and a second region;
- obtaining a prediction block of the first region; and
- obtaining a prediction block of the second region,
- wherein the prediction block of the first region and the prediction block of the second region are obtained by different inter prediction methods.
14. The image encoding method of claim 13, wherein the first region is a region adjacent to a pre-reconstructed image region within an image to which the current block belongs, and the second region is a region that is not in contact with the pre-reconstructed image region within the image to which the current block belongs.
15. The image encoding method of claim 13, further comprising:
- encoding information that indicates which type of inter prediction is used,
- wherein the information is information indicating whether or not the prediction block of the first region and the prediction block of the second region are derived using the different inter prediction methods.
16. An image decoding method comprising:
- partitioning, on the basis of blocks around a current block to be decoded, the current block into multiple sub-blocks; and
- decoding the multiple sub-blocks of the current block.
17. The image decoding method of claim 16, wherein the partitioning of the current block into the multiple sub-blocks is performed on the basis of a partitioning structure of neighboring blocks of the current block.
18. The image decoding method of claim 16, wherein the partitioning of the current block into the multiple sub-blocks is performed on the basis of at least one among the number of neighboring blocks, a size of the neighboring blocks, a shape of the neighboring blocks, and a boundary between the neighboring blocks.
19. The image decoding method of claim 16, further comprising:
- partitioning a pre-reconstructed pixel region, which is a region neighboring the current block, on a per-sub-block basis,
- wherein at the decoding of the multiple sub-blocks of the current block, at least one of the multiple sub-blocks of the current block is decoded using at least one sub-block included in the reconstructed pixel region.
20. The image decoding method of claim 16, further comprising:
- decoding information that indicates whether or not partitioning into sub-blocks is performed,
- wherein the partitioning of the current block into the multiple sub-blocks is performed on the basis of the information that indicates whether or not partitioning into the sub-block is performed.
Type: Application
Filed: Jan 16, 2018
Publication Date: Nov 28, 2019
Applicant: INDUSTRY ACADEMY COOPERATION FOUNDATION OF SEJONG UNIVERSITY (Seoul)
Inventors: Joo Hee MOON (Seoul), Sung Won LIM (Seoul), Dong Jae WON (Goyang-si)
Application Number: 16/478,259