METHOD, MEDIUM, AND APPARATUS ENCODING AND/OR DECODING AN IMAGE USING THE SAME CODING MODE ACROSS COMPONENTS
A method, medium, and apparatus encoding and/or decoding an image in order to increase encoding and decoding efficiency by performing binary-arithmetic coding/decoding on a binary value of a syntax element using a probability model having the same syntax element probability value for respective context index information of each of at least two image components.
This application is a continuation of U.S. application Ser. No. 11/598,681, filed on Nov. 14, 2006, which claims the benefit of U.S. Provisional Patent Application No. 60/735,814, filed on Nov. 14, 2005, the benefit of Korean Patent Application No. 10-2006-0049079, filed on May 30, 2006 and the benefit of Korean Patent Application No. 10-2006-0110225, filed on May 30, 2006, in the Korean Intellectual Property Office, the disclosures of which are incorporated herein in their entirety by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
An embodiment of the present invention relates to a method, medium, and apparatus encoding and/or decoding an image.
2. Description of the Related Art
Generally, when an image is captured, the image is captured in an RGB format. However, when the captured image is compressed, the image is typically transformed to an image of a YUV or YCbCr format. In this case, Y is a luminance component, such as a black and white image, and U (or Cb) and V (or Cr) are chrominance components of the corresponding image. Information is typically evenly distributed to R, G, and B in an RGB image, whereas in a YUV (or YCbCr) image, a majority of the information flows into the Y component while a small amount of information is distributed to the U (or Cb) and V (or Cr) components. Thus, when compression of an image is performed, compression efficiency of a YUV (or YCbCr) image is greater than that of an RGB image, as two of the components include less information. In order to further increase the compression efficiency, a YUV (or YCbCr) 4:2:0 image is used, where the U (or Cb) and V (or Cr) components are sampled ¼ as many times as the luminance component Y.
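As an illustration of the ¼ sampling ratio described above, the following minimal Python sketch counts the samples per frame for a 4:4:4 image and a 4:2:0 image. The function name and the frame dimensions are hypothetical and are chosen only for this example.

```python
def samples_per_frame(width, height, chroma_format):
    """Return (luma_samples, samples_per_chroma_component) for one frame."""
    luma = width * height
    if chroma_format == "4:4:4":
        chroma = luma                          # chroma kept at full resolution
    elif chroma_format == "4:2:0":
        chroma = (width // 2) * (height // 2)  # halved horizontally and vertically
    else:
        raise ValueError("unsupported chroma format: " + chroma_format)
    return luma, chroma


luma, chroma = samples_per_frame(16, 16, "4:2:0")
print(luma, chroma)  # 256 64, i.e. each chroma component has 1/4 of the luma samples
```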
In this YUV (or YCbCr) 4:2:0 image, since a statistical characteristic of the Y component is different from a statistical characteristic of the U (or Cb) or V (or Cr) component, when conventional image compression is performed, the Y component and the U (or Cb) and V (or Cr) components are processed using different encoding techniques. For example, according to the recently standardized MPEG-4 AVC/H.264 standard technology of a Joint Video Team of ISO/IEC MPEG and ITU-T VCEG (“Text of ISO/IEC FDIS 14496-10: Information Technology—Coding of Audio-Visual Objects—Part 10: Advanced Video Coding”, ISO/IEC JTC 1/SC 29/WG 11, N5555, March, 2003) (hereinafter referred to as MPEG-4 AVC/H.264), when a Y component of a video signal is encoded as an intra-image, i.e., based on information within the image, spatial prediction is performed using 9 prediction techniques according to directions predicted based on 4×4 blocks. In addition, spatial prediction is performed using 4 prediction techniques according to directions predicted based on 16×16 blocks. However, for the U (or Cb) and V (or Cr) components of the video signal, since their images are relatively simple compared to the Y component, spatial prediction independent of the Y component is performed using 4 prediction techniques based on their respective directions predicted based on 8×8 blocks.
When encoding as an inter-image is performed, i.e., based on information from other images, motion compensation of the Y component is finely performed by expanding predicted images using a 6-tap filter, whereas motion compensation of the U (or Cb) and V (or Cr) components is performed by expanding predicted images using a bi-linear filter. In this way, according to such conventional systems, an image is compressed using different techniques between the luminance and chrominance components since the statistical characteristic of the Y component is different from the statistical characteristic of the U (or Cb) or V (or Cr) component.
In addition, even when a residue image, e.g., obtained through temporal-spatial prediction, is entropy encoded using a binary arithmetic coder, the residue image is compressed using a method in which different probability models are used for the respective components. However, the sampling of U (or Cb) and V (or Cr) of a YUV (or YCbCr) 4:2:0 image by ¼ of the sampling of the Y component is not suitable for high image quality applications due to generated color distortions. Thus, a method of effectively encoding a YUV (or YCbCr) 4:4:4 image, where such a U (or Cb) and V (or Cr) sampling process is unnecessary, has been found to be desirable. Accordingly, by directly encoding an RGB image, color distortions occurring in such a YUV (or YCbCr) transforming process can be avoided.
However, if an image in which the image components have the same resolution, such as a YUV (or YCbCr) 4:4:4 image or an RGB image, is directly encoded by applying MPEG-4 AVC/H.264, a conventional YUV (or YCbCr) 4:2:0 image compression method, encoding efficiency decreases. This is caused by the application of a method suitable for the U (or Cb) and V (or Cr) components of a YUV (or YCbCr) 4:2:0 image to a YUV (or YCbCr) 4:4:4 image or an RGB image without any change. Accordingly, embodiments of the present invention overcome these drawbacks.
SUMMARY OF INVENTION
An embodiment of the present invention provides an apparatus, medium, and method increasing encoding efficiency while retaining high image quality by performing spatial prediction and temporal prediction according to a statistical characteristic of an image when a YUV (or YCbCr) 4:4:4 image is encoded or an RGB image is encoded in an RGB domain without transforming the RGB image to the YUV (or YCbCr) domain.
Additional aspects and/or advantages of the invention will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the invention.
According to an aspect of the present invention, there is provided a method of generating a spatially predicted image, the method including generating a predicted image of a current image, including at least two image components, from pixels of a restored image spatially adjacent to a predetermined-sized block of the current image by applying a same predicted direction to each of the image components of the current image.
According to another aspect of the present invention, there is provided a medium including computer readable code to control at least one processing element to implement a method of generating a spatially predicted image, the method including generating a predicted image of a current image, including at least two image components, from pixels of a restored image spatially adjacent to a predetermined-sized block of the current image by applying a same predicted direction to each of the image components of the current image.
According to another aspect of the present invention, there is provided a method of generating a temporally predicted image, the method including generating a predicted image of a current image, including at least two image components, from motion estimation between a restored image and the current image by applying a same motion vector and a same motion interpolation method on a same block basis to each of the image components of the current image.
According to another aspect of the present invention, there is provided a medium including computer readable code to control at least one processing element to implement a method of generating a temporally predicted image, the method including generating a predicted image of a current image, including at least two image components, from motion estimation between a restored image and the current image by applying a same motion vector and a same motion interpolation method on a same block basis to each of the image components of the current image.
According to another aspect of the present invention, there is provided a method of generating a predicted image, the method including generating a spatially predicted image of a current image, including at least two image components, by applying a same predicted direction to each of the image components of the current image, generating a temporally predicted image of the current image by applying a same motion vector and a same motion interpolation method on a same block basis to each of the image components of the current image, selecting an encoding mode for the current image using the generated spatially predicted image and the generated temporally predicted image, and generating a predicted image of the current image by applying the selected encoding mode to each of the image components of the current image.
According to another aspect of the present invention, there is provided a medium including computer readable code to control at least one processing element to implement a method of generating a predicted image, the method including generating a spatially predicted image of a current image, including at least two image components, by applying a same predicted direction to each of the image components of the current image, generating a temporally predicted image of the current image by applying a same motion vector and a same motion interpolation method on a same block basis to each of the image components of the current image, selecting an encoding mode for the current image using the generated spatially predicted image and the generated temporally predicted image, and generating a predicted image of the current image by applying the selected encoding mode to each of the image components of the current image.
According to another aspect of the present invention, there is provided an apparatus for generating a predicted image, the apparatus including a spatial prediction image generator to generate a spatially predicted image of a current image, including at least two image components, by applying a same predicted direction to each of the image components of the current image, a temporal prediction image generator to generate a temporally predicted image of the current image by applying a same motion vector and a same motion interpolation method on a same block basis to each of the image components of the current image, an encoding mode selector to select an encoding mode using the generated spatially predicted image and the generated temporally predicted image, and a single mode prediction image generator to generate a predicted image of the current image by applying the selected encoding mode to each of the image components of the current image.
According to another aspect of the present invention, there is provided a method of encoding an image, the method including generating a predicted image of a current image, including at least two image components, by applying a same encoding mode to each of the image components of the current image, generating a respective residue corresponding to a difference between the current image and the generated predicted image for each image component of the current image, and generating a bitstream by encoding the generated respective residues.
According to another aspect of the present invention, there is provided a medium including computer readable code to control at least one processing element to implement a method of encoding an image, the method including generating a predicted image of a current image, including at least two image components, by applying a same encoding mode to each of the image components of the current image, generating a respective residue corresponding to a difference between the current image and the generated predicted image for each image component of the current image, and generating a bitstream by encoding the generated respective residues.
According to another aspect of the present invention, there is provided an apparatus for encoding an image, the apparatus including a prediction image generator to generate a predicted image of a current image, including at least two image components, by applying a same encoding mode to each of the image components of the current image, a residue generator to generate a respective residue corresponding to a difference between the current image and the generated predicted image for each image component of the current image, and an encoder to generate a bitstream by encoding the generated respective residues.
According to another aspect of the present invention, there is provided a method of decoding an image, the method including restoring respective residues for image components of a current image, which includes at least two image components, with the respective residues corresponding to a difference between the current image and a predicted image, and restoring the current image by adding the predicted image, generated by applying a same encoding mode to the restored respective residues, to the restored respective residues.
According to another aspect of the present invention, there is provided a medium including computer readable code to control at least one processing element to implement a method of decoding an image, the method including restoring respective residues for image components of a current image, which includes at least two image components, with the respective residues corresponding to a difference between the current image and a predicted image, and restoring the current image by adding the predicted image, generated by applying a same encoding mode to restored respective residues, to the restored respective residues.
According to another aspect of the present invention, there is provided an apparatus for decoding an image, the apparatus including a data restoration unit to restore respective residues for image components of a current image, which includes at least two image components, with the respective residues corresponding to a difference between the current image and a predicted image, and a prediction compensator to restore the current image by adding the predicted image, generated by applying a same encoding mode to restored respective residues, to the restored respective residues.
According to another aspect of the present invention, there is provided a context-based binary arithmetic coding method including binarizing respective syntax elements used to encode a respective residue, which correspond to at least two image components and a difference between a current image and a predicted image, selecting respective context index information of the respective syntax elements for each of the image components of the current image, and binary-arithmetic coding the respective syntax elements based on a same probability model having a same syntax element probability value for the selected respective context index information for each of the image components of the current image.
According to another aspect of the present invention, there is provided a medium including computer readable code to control at least one processing element to implement a context-based binary arithmetic coding method including binarizing respective syntax elements used to encode a respective residue, which correspond to at least two image components and a difference between a current image and a predicted image, selecting respective context index information of the respective syntax elements for each of the image components of the current image, and binary-arithmetic coding the respective syntax elements using a same probability model having a same syntax element probability value for the selected respective context index information for each of the image components of the current image.
According to another aspect of the present invention, there is provided a context-based binary arithmetic coding apparatus including a binarization unit to binarize respective syntax elements used to encode a respective residue, which correspond to at least two image components and a difference between a current image and a predicted image, a context index selector to select respective context index information of the respective syntax elements for each of the image components of the current image, and a binary arithmetic coder to binary-arithmetic code the respective syntax elements using a same probability model having a same syntax element probability value for the selected respective context index information of each of the image components of the current image.
According to another aspect of the present invention, there is provided a context-based binary arithmetic decoding method including selecting respective context index information of respective syntax elements used to encode a respective residue, which correspond to at least two image components and a difference between a current image and a predicted image, restoring respective binary values of the respective syntax elements by performing binary-arithmetic decoding on the respective binary values of the respective syntax elements using a same probability model having a same syntax element probability value for the selected respective context index information for each of the image components of the current image, and restoring the respective syntax elements by inverse-binarizing the restored respective binary values of the respective syntax elements.
According to another aspect of the present invention, there is provided a medium including computer readable code to control at least one processing element to implement a context-based binary arithmetic decoding method including selecting respective context index information of respective syntax elements used to encode a respective residue, which correspond to at least two image components and a difference between a current image and a predicted image, restoring respective binary values of the respective syntax elements by binary-arithmetic decoding the respective binary values of the respective syntax elements using a same probability model having a same syntax element probability value for the selected respective context index information of each of the image components of the current image, and restoring the respective syntax elements by inverse-binarizing the respective restored binary values of the respective syntax elements.
According to another aspect of the present invention, there is provided a context-based binary arithmetic decoding apparatus including a context index selector to select respective context index information of respective syntax elements used to encode a respective residue, which correspond to at least two image components and a difference between a current image and a predicted image, a binary arithmetic decoder to restore respective binary values of the respective syntax elements by performing binary-arithmetic decoding on the respective binary values of the respective syntax elements using a same probability model having a same syntax element probability value for the selected respective context index information of each of the image components of the current image, and an inverse binarization unit to restore the respective syntax elements by inverse-binarizing the restored respective binary values of the respective syntax elements.
These and/or other aspects and advantages of the invention will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. Embodiments are described below to explain the present invention by referring to the figures.
Herein, embodiments described below are related to encoding/decoding of a current image having at least two image components by applying the same encoding mode to each of the image components. In particular, the current image can be either an RGB image or a YUV (or YCbCr) 4:4:4 image, for example. However, although a YUV image is discussed below as a current image and in the attached drawings, it should be understood by those of ordinary skill in the art that the current image can be any of a YCbCr or RGB image, or an image of any other format.
Accordingly,
Referring to
In a spatial prediction mode, i.e., the intra-prediction mode, the spatial prediction image generator 100 generates a spatially predicted image of a current image, which includes at least two image components, from pixels of the restored image spatially adjacent to the predetermined-sized pixel block of the current image by applying the same predicted direction to each of the image components of the current image. For example, if the current image is an RGB image, the spatial prediction image generator 100 generates a spatially predicted image by applying the same predicted direction to each of an R component, a G component, and a B component. In particular, the spatial prediction image generator 100 generates a spatially predicted image for each of a plurality of encoding modes in the intra-prediction mode.
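The idea of applying the same predicted direction to every image component can be sketched as follows. This is a minimal illustration only: it covers three simplified prediction modes (vertical, horizontal, and DC) on 4×4 blocks, and the helper names and data layout are hypothetical rather than taken from the embodiment.

```python
def predict_block(top, left, mode, size=4):
    """Predict a size x size block from restored neighbouring pixels.

    top  : restored pixels directly above the block
    left : restored pixels directly to the left of the block
    mode : "vertical", "horizontal", or "dc" (a simplified subset of modes)
    """
    if mode == "vertical":
        return [list(top[:size]) for _ in range(size)]
    if mode == "horizontal":
        return [[left[r]] * size for r in range(size)]
    if mode == "dc":
        dc = (sum(top[:size]) + sum(left[:size]) + size) // (2 * size)
        return [[dc] * size for _ in range(size)]
    raise ValueError("unknown mode: " + mode)


def predict_all_components(neighbours, mode):
    """Apply one and the same predicted direction to every image component.

    neighbours maps a component name to its (top, left) restored pixels.
    """
    return {name: predict_block(top, left, mode)
            for name, (top, left) in neighbours.items()}


neighbours = {
    "R": ([10, 20, 30, 40], [5, 5, 5, 5]),
    "G": ([11, 21, 31, 41], [6, 6, 6, 6]),
    "B": ([12, 22, 32, 42], [7, 7, 7, 7]),
}
predicted = predict_all_components(neighbours, "vertical")
print(predicted["R"][0])  # [10, 20, 30, 40]
```

Because a single mode is chosen for all components, only one prediction direction needs to be signalled per block instead of one per component.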
In this regard,
Referring to
Referring again to
Referring to
In particular, according to the current embodiment, the temporal prediction image generator 120 can use the 6-tap filter for all of an R component, a G component, and a B component in order to apply the same interpolation method to each of the R component, the G component, and the B component. Alternatively, the temporal prediction image generator 120 can use the bi-linear filter for all of the R component, the G component, and the B component. Furthermore, in an embodiment, each block can be encoded using an optimal method for that block and transmitted to an image decoding apparatus.
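A minimal sketch of using the same interpolation filter for every component is shown below. It assumes the half-sample 6-tap coefficients (1, −5, 20, 20, −5, 1) with division by 32 used for luminance in MPEG-4 AVC/H.264, works on a single row of 8-bit samples, and clamps indices at the row edges; the function names and the one-dimensional layout are simplifications made for this example.

```python
SIX_TAP = (1, -5, 20, 20, -5, 1)  # half-sample luma filter taps of MPEG-4 AVC/H.264


def half_pel_six_tap(row, x):
    """Half-sample value between row[x] and row[x + 1] using the 6-tap filter."""
    acc = 0
    for k, tap in enumerate(SIX_TAP):
        idx = min(max(x - 2 + k, 0), len(row) - 1)  # clamp at the row edges
        acc += tap * row[idx]
    return min(max((acc + 16) >> 5, 0), 255)        # round, divide by 32, clip to 8 bits


def half_pel_bilinear(row, x):
    """Half-sample value between row[x] and row[x + 1] using a bi-linear filter."""
    nxt = min(x + 1, len(row) - 1)
    return (row[x] + row[nxt] + 1) >> 1


def interpolate_components(rows, x, interpolation_filter):
    """Apply one and the same interpolation filter to every image component."""
    return {name: interpolation_filter(row, x) for name, row in rows.items()}


rows = {
    "R": [0, 0, 0, 255, 255, 255],
    "G": [100, 100, 100, 100, 100, 100],
    "B": [50, 50, 50, 50, 50, 50],
}
print(interpolate_components(rows, 2, half_pel_six_tap))
print(interpolate_components(rows, 2, half_pel_bilinear))
```

Passing the filter as a parameter makes the choice explicit: whichever filter is selected, it is used for the R, G, and B components alike.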
Referring back to
In this embodiment, the spatial bit amount/image quality distortion calculator 142 may calculate the bit amount and image quality distortion of the spatially predicted image, e.g., generated by the spatial prediction image generator 100. Similarly, the temporal bit amount/image quality distortion calculator 144 may calculate the bit amount and image quality distortion of the temporally predicted image, e.g., generated by the temporal prediction image generator 120. In particular, here, the spatial bit amount/image quality distortion calculator 142 calculates the bit amount and image quality distortion of the spatially predicted image generated by the spatial prediction image generator 100 in each of the encoding modes of the intra-prediction method, and the temporal bit amount/image quality distortion calculator 144 calculates the bit amount and image quality distortion of the temporally predicted image generated by the temporal prediction image generator 120 in each of the encoding modes of the inter-prediction method.
In more detail, the spatial bit amount/image quality distortion calculator 142 and the temporal bit amount/image quality distortion calculator 144 may calculate the bit amount of the spatially or temporally predicted image using a bitstream output from an entropy encoder, such as the entropy encoder 330 of an image encoding apparatus illustrated in
Here, D denotes a numeric value representing a degree of image quality distortion, p denotes a pixel value of a current image, q denotes a pixel value of a previous image, and i denotes an index of pixels in a current block.
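Equation 1 itself is not reproduced in the text above. One distortion measure that is consistent with the symbol definitions is the sum of squared differences between corresponding pixels of a block; the sketch below assumes that form, and the function name is hypothetical.

```python
def distortion_ssd(current_block, reference_block):
    """Sum over pixel index i of (p_i - q_i) ** 2 (assumed form of D)."""
    if len(current_block) != len(reference_block):
        raise ValueError("blocks must contain the same number of pixels")
    return sum((p - q) ** 2 for p, q in zip(current_block, reference_block))


print(distortion_ssd([10, 12, 14], [11, 12, 10]))  # 17
```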
The performance comparator 146 may sum the bit amount and the image quality distortion calculated by the spatial bit amount/image quality distortion calculator 142, for example, and sum the bit amount and the image quality distortion calculated by the temporal bit amount/image quality distortion calculator 144. The performance comparator 146 may further compare the summed values to each other and select the encoding mode corresponding to the smallest value, i.e., the encoding mode having the highest encoding efficiency, as a single encoding mode, as one example of determining the appropriate encoding mode. In more detail, in this embodiment, the performance comparator 146 sums the bit amount and the image quality distortion after multiplying the bit amount by a fixed constant, in order to reconcile the unit of the bit amount with the unit of the image quality distortion, as represented by Equation 2 below.
L=D+λR Equation 2
Here, R denotes a bit rate, and λ denotes the fixed constant. That is, the performance comparator 146 may calculate L for a plurality of encoding modes and select an encoding mode corresponding to the smallest value of the calculated L's as the single encoding mode.
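The selection based on Equation 2 can be sketched as follows. The mode names, distortion values, bit amounts, and the value of λ are hypothetical and serve only to show how the smallest L determines the single encoding mode.

```python
def select_single_mode(candidates, lam):
    """Return (mode, cost) minimizing L = D + lam * R.

    candidates maps a mode name to (D, R): image quality distortion and bit amount.
    """
    costs = {mode: d + lam * r for mode, (d, r) in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


candidates = {
    "intra_vertical": (120, 40),  # L = 160.0
    "intra_dc": (150, 30),        # L = 180.0
    "inter_16x16": (90, 55),      # L = 145.0
}
print(select_single_mode(candidates, 1.0))  # ('inter_16x16', 145.0)
```

Raising λ penalizes the bit amount more heavily, so modes that cost fewer bits become preferable even at a higher distortion.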
The single mode prediction image generator 160, thus, may generate a predicted image of the current image by evenly applying the single encoding mode selected by the encoding mode selector 140 to each of the image components of the current image.
Although, in this embodiment, the encoding mode selector 140 selects the encoding mode having the highest encoding efficiency, the encoding mode may alternatively be selected based on other factors, alone or in combination with encoding efficiency, depending on the situation.
Referring to
In operation 200, a spatially predicted image of a current image may be generated, with the current image including at least two image components, from pixels of a restored image spatially adjacent to a predetermined-sized pixel block of the current image by applying the same predicted direction to each of the image components of the current image.
In operation 220, a temporally predicted image of the current image may be generated from motion estimation of each of the image components between a restored image and the current image by applying the same motion vector and the same motion interpolation method on the same block basis for each of the image components.
In operation 240, the bit amount and image quality distortion of the spatially predicted image, e.g., generated in operation 200, and the bit amount and image quality distortion of the temporally predicted image, e.g., generated in operation 220, may be calculated.
In operation 260, the bit amount and the image quality distortion calculated for the spatially predicted image, generated in operation 240, may be summed, the bit amount and the image quality distortion calculated for the temporally predicted image, generated in operation 240, may be summed, the summed values may be compared to each other, and an encoding mode corresponding to the smallest value may be selected as a single encoding mode to be used.
In operation 280, a predicted image of the current image may be generated by applying the single encoding mode, e.g., selected in operation 260, to each of the image components of the current image.
Referring to
The prediction image generator 300 may generate a predicted image of a current image, which includes at least two image components, by applying the same encoding mode to each of the image components. The prediction image generator 300 may use the predicted image generation apparatus illustrated in
The residue generator 310 may generate a residue corresponding to a difference between the current image and the predicted image generated by the prediction image generator 300. For example, if an input image is a YUV (or YCbCr) 4:4:4 image, when a spatial prediction mode is selected, the prediction image generator 300 applies the same prediction mode to all of a Y component, a U (or Cb) component, and a V (or Cr) component. When a temporal prediction mode is selected, the prediction image generator 300 applies the same motion vector on the same block basis to all of the Y component, the U (or Cb) component, and the V (or Cr) component, and when the predicted image is expanded, the prediction image generator 300 performs interpolation using the same filter for all of the Y component, the U (or Cb) component, and the V (or Cr) component. By performing the spatial or temporal prediction encoding according to the encoding mode selected as described above, for example, the residue generator 310 can generate a residue of each of the Y component, the U (or Cb) component, and the V (or Cr) component.
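The residue generation described above amounts to a per-component, per-pixel subtraction. A minimal sketch, with a hypothetical function name and data layout (each component stored as a two-dimensional list), is:

```python
def component_residues(current, predicted):
    """Residue = current - predicted, computed separately for every component."""
    return {
        name: [[c - p for c, p in zip(cur_row, pred_row)]
               for cur_row, pred_row in zip(current[name], predicted[name])]
        for name in current
    }


current = {"Y": [[10, 12], [14, 16]], "U": [[5, 5], [5, 5]], "V": [[7, 8], [9, 10]]}
predicted = {"Y": [[9, 12], [15, 16]], "U": [[5, 4], [5, 6]], "V": [[7, 7], [7, 7]]}
print(component_residues(current, predicted)["Y"])  # [[1, 0], [-1, 0]]
```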
The transformation/quantization unit 320 may transform and quantize the residue generated by the residue generator 310 on a predetermined-sized block basis. In more detail, the transformation/quantization unit 320 may perform the transformation using an orthogonal transform encoding method. Commonly used orthogonal transform encoding methods include a Fast Fourier Transform (FFT) method, a Discrete Cosine Transform (DCT) method, a Karhunen Loeve Transform (KLT) method, a Hadamard transform method, and a slant transform method, for example.
In MPEG-4 AVC/H.264, each of a luminance component Y 1040 and chrominance components U (or Cb) 1050 and V (or Cr) 1060 may be transformed on a 4×4 block basis, as illustrated in
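As one concrete instance of an orthogonal transform, the sketch below applies a 4×4 Hadamard transform to a residue block and inverts it exactly. The Hadamard transform is chosen here only because it needs integer arithmetic alone; it is not asserted to be the transform of the embodiment, and the helper names are hypothetical. Quantization is omitted.

```python
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]  # symmetric, with mutually orthogonal rows: H4 * H4 = 4 * I


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def hadamard_forward(block):
    """Forward 2-D transform: H4 * X * H4^T."""
    return matmul(matmul(H4, block), transpose(H4))


def hadamard_inverse(coeffs):
    """Inverse 2-D transform: (H4^T * Y * H4) / 16, exact in integers."""
    raw = matmul(matmul(transpose(H4), coeffs), H4)
    return [[v // 16 for v in row] for row in raw]


residue = [[2, 2, 2, 2] for _ in range(4)]
coeffs = hadamard_forward(residue)
print(coeffs[0][0])                         # 32: all energy in the DC coefficient
print(hadamard_inverse(coeffs) == residue)  # True
```

A flat residue block concentrates into a single coefficient, which is the property that makes the subsequent quantization and entropy encoding efficient.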
Referring back to
In order to efficiently generate a bitstream in the entropy encoder 330 illustrated in
Referring back to
Referring to
In operation 400, a predicted image of a current image, which includes at least two image components, may be generated by applying the same encoding mode to each of the image components.
In operation 420, a residue corresponding to the difference between the current image and the predicted image generated in operation 400 may be generated.
In operation 440, the residue, generated in operation 420, may be transformed and quantized on a predetermined-sized block basis.
In operation 460, a bitstream may be generated by entropy encoding the data transformed and quantized in operation 440.
Referring to
The entropy decoder 500 entropy-decodes a bitstream, and the dequantization/detransformation unit 520 restores a residue of each of the image components, corresponding to a difference between a predicted image and a current image, which includes at least two image components, by dequantizing and detransforming the result entropy-decoded by the entropy decoder 500 on a predetermined-sized block basis. The prediction compensator 540 may, thus, restore the current image by adding a predicted image, generated by applying the same encoding mode to the residue of each of the image components which is restored by the dequantization/detransformation unit 520, to the restored residue.
Referring to
If the residue of each of the image components, e.g., restored by the dequantization/detransformation unit 520, has been encoded in the intra-prediction method, the spatial prediction compensator 700 may restore the current image by adding a predicted image, generated from pixels of a restored image spatially adjacent to a predetermined-sized block of the current image by applying the same predicted direction to each of the image components, to the residue of each of such image components restored by the dequantization/detransformation unit 520.
If the residue of each of the image components, e.g., restored by the dequantization/detransformation unit 520, has been encoded in the inter-prediction method, the temporal prediction compensator 750 may restore the current image by adding a predicted image, generated from motion estimation between a restored image and the current image by applying the same motion vector and the same motion interpolation method on the same block basis to each of the image components, to the residue of each of such image components restored by the dequantization/detransformation unit 520.
In general, the entropy encoding may be used to generate a bitstream by compressing a residue corresponding to the result transformed and quantized by the transformation/quantization unit 320 illustrated in
Referring to
In operation 600, a received bitstream is entropy-decoded, and then, in operation 620, the image decoding apparatus restores a residue of each of the image components, corresponding to a difference between a predicted image and a current image, which includes at least two image components, by dequantizing and detransforming the result entropy-decoded in operation 600 on a predetermined-sized block basis.
In operation 640, the current image may be restored by adding a predicted image, generated by applying the same encoding mode to the residue of each of the image components restored in operation 620, to the restored residue.
In general, in order to generate a bitstream by compressing data in an entropy encoding process, the data to be compressed is processed by partitioning the data into predetermined meaningful units. The predetermined meaningful units are called syntax elements. A unit of syntax element for arithmetic coding/decoding the transformed and quantized residues, generated by referring to
Here,
Referring to
In operation 1210, the image encoding apparatus may encode/decode syntax element coded_block_flag, indicating whether the residue restored by the transformation/quantization unit 320, for example, is 0 based on a transform block size. In general, a residue is transformed based on a transform block size of 4×4 or 8×8. In H.264/AVC, coded_block_flag is encoded based on the transform block size of 4×4, and whether all coefficients of a quantized residue of a transform block of 8×8 are 0 is indicated using coded_block_pattern, which is another syntax element. Thus, when transformation based on the transform block size of 8×8 is used, coded_block_pattern and coded_block_flag overlap each other, and thus coded_block_flag is not separately encoded.
In operation 1220, whether coded_block_flag is 1 is determined. If it is determined that coded_block_flag is 1, the process proceeds to operation 1230, and if coded_block_flag is not 1, the process may end. The fact that coded_block_flag is 1 indicates that non-zero coefficients exist in a 4×4 block corresponding to coded_block_flag.
In operation 1230, major map information is encoded/decoded, with the major map information indicating location information of the non-zero coefficients in the 4×4 block corresponding to coded_block_flag.
In operation 1240, level information of the non-zero coefficients is encoded/decoded.
Referring to
In operation 1234, whether significant_coeff_flag is 1 is determined. If it is determined that significant_coeff_flag is 1, the process proceeds to operation 1236, and if significant_coeff_flag is not 1, the process may end. The fact that significant_coeff_flag=1 indicates that the transformed and quantized residue in the 4×4 block corresponding to coded_block_flag is not 0.
In operation 1236, syntax element last_significant_coeff_flag may be encoded/decoded, with last_significant_coeff_flag indicating whether a non-zero transformed and quantized residue value in the 4×4 block corresponding to coded_block_flag is the last non-zero value when the data is one-dimensionally scanned in the 4×4 block, as illustrated in
In operation 1238, whether last_significant_coeff_flag is 1 is determined. If last_significant_coeff_flag is 1, the process may end, and if last_significant_coeff_flag is not 1, the process may return to operation 1232. The fact that last_significant_coeff_flag is 1 indicates that the data is the last non-zero data when the data is one-dimensionally scanned in the 4×4 block.
Referring to
In operation 1244, syntax element coeff_abs_level_minus1 is encoded/decoded, with coeff_abs_level_minus1 being a level value of non-zero data in the 4×4 block corresponding to coded_block_flag.
In operation 1246, a sign value of the level value of non-zero data in the 4×4 block corresponding to coded_block_flag may be encoded/decoded.
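As only an illustrative sketch, the sequence of syntax elements described in operations 1210 through 1246 can be listed for one scanned block as follows. The function name `residual_syntax_elements`, the `sign` label, and the choice to emit levels in scan order are assumptions made for this example; a significance flag is emitted for every scanned position up to the last non-zero coefficient, which is one possible reading of the loop through operations 1232 to 1238.

```python
def residual_syntax_elements(coeffs):
    """Return (name, value) pairs for one scanned block of quantized residues."""
    elements = []
    nonzero_positions = [i for i, c in enumerate(coeffs) if c != 0]

    # coded_block_flag: 1 if the block holds any non-zero coefficient.
    coded_block_flag = 1 if nonzero_positions else 0
    elements.append(("coded_block_flag", coded_block_flag))
    if not coded_block_flag:
        return elements

    # Map information: where the non-zero coefficients are located.
    last = nonzero_positions[-1]
    for i in range(last + 1):
        significant = 1 if coeffs[i] != 0 else 0
        elements.append(("significant_coeff_flag", significant))
        if significant:
            is_last = 1 if i == last else 0
            elements.append(("last_significant_coeff_flag", is_last))

    # Level information: magnitude minus one, then the sign.
    for i in nonzero_positions:
        elements.append(("coeff_abs_level_minus1", abs(coeffs[i]) - 1))
        elements.append(("sign", 1 if coeffs[i] < 0 else 0))
    return elements
```

For a block whose coefficients are all 0, only coded_block_flag is produced, which matches the early exit after operation 1220.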
Such a context-based binary arithmetic coding apparatus includes a binarization unit 1300, a context index selector 1310, a probability model storage unit 1330, and a binary arithmetic coder 1320. In one embodiment, the syntax elements include basic units for compressing image information in the entropy encoder 330 illustrated in
When a syntax element is not a binary value comprised of 0 or 1, the binarization unit 1300 binarizes the syntax element. In particular, the binarization unit 1300 can increase encoding efficiency by granting a long-length binary value to a low probability symbol and a short-length binary value to a high probability symbol, as in Variable Length Coding (VLC). In MPEG-4 AVC/H.264, for this binarization method, unary code, truncated unary code, fixed-length code, and a combination of truncated unary code and exponential Golomb code are used, as examples.
The unary code is obtained by binarizing a level value x to 1 and 0 (the total number of 1s and 0s is x). The truncated unary code uses a portion of the unary code, and when a used range is fixed from 0 to S, 1 is finally used for the S value without using 0. The exponential Golomb code is constituted of a prefix and a suffix, the prefix being a unary code of a l(x) value calculated by the below Equation 3.
$l(x)=\lfloor \log_2(x/2^k+1) \rfloor$ Equation 3
Here, x denotes a value to be binarized, k denotes an order of an exponential code. The suffix of the exponential Golomb code is a binary code of a m(x) value calculated by the below Equation 4, the binary code having the number of bits, i.e., k+l(x).
$m(x)=x+2^k(1-2^{l(x)})$ Equation 4
Here,
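As only an illustrative sketch, the binarization codes described above can be written as follows. Two points are assumptions of this example rather than statements of the embodiment: the unary code is taken to be x ones followed by one terminating zero, and Equation 3 is read with a floor so that l(x) is an integer. The function names are hypothetical.

```python
def unary(x):
    """x ones followed by a terminating zero (assumed convention)."""
    return "1" * x + "0"


def truncated_unary(x, s):
    """Unary code over the fixed range 0..s; the value s drops the final zero."""
    if not 0 <= x <= s:
        raise ValueError("x must lie in 0..s")
    return "1" * x if x == s else unary(x)


def exp_golomb(x, k):
    """k-th order exponential Golomb code built from Equations 3 and 4."""
    # Equation 3 (with a floor): l(x) = floor(log2(x / 2^k + 1)),
    # computed with integers only to avoid floating-point rounding.
    l = ((x >> k) + 1).bit_length() - 1
    # Equation 4: m(x) = x + 2^k * (1 - 2^l(x)).
    m = x + (1 << k) * (1 - (1 << l))
    # The suffix is m(x) written in binary with k + l(x) bits.
    width = k + l
    suffix = format(m, "b").zfill(width) if width else ""
    # The prefix is the unary code of l(x).
    return unary(l) + suffix
```

With k = 0 the values 0, 1, 2, and 3 map to "0", "100", "101", and "11000", so frequent small values receive short binary strings and rare large values receive long ones.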
The context index selector 1310 selects probability model information of the syntax element in a “context-based” manner, which is a method of increasing compression efficiency by adaptively granting a probability value through extraction of a different context according to states of adjacent symbols when a provided binary symbol is encoded. That is, the probability model information includes two factors, a state and a Most Probable Symbol (MPS), which are adaptively changed according to the selected context index information as described above and which together present information on a probability characteristic. The two factors are stored in the probability model storage unit 1330.
The binary arithmetic coder 1320 searches a probability model of the syntax element using the context index information selected by the context index selector 1310 and encodes a binary value using the model information. The binary arithmetic coder 1320 also updates the probability model considering the encoded binary value after the encoding is performed.
In operation 1400, whether an input syntax element is a first syntax element of a specified unit is determined. If it is determined that the input syntax element is the first syntax element, the process goes to operation 1410, and if the input syntax element is not the first syntax element, the process goes to operation 1430, thereby omitting initialization to be performed in operations 1410 and 1420. Here, the specified unit may be a “slice” or “picture”.
In operation 1410, probability models of all syntax elements may be initialized.
In operation 1420, parameter values of the binary arithmetic coder 1320 may be initialized.
In operation 1430, the input syntax element may further be binarized.
In operation 1440, a context index of each of binary values, e.g., binarized in operation 1430, may be selected using adjacent context indexes. By doing this, a probability model can be more easily predicted, and thus the encoding efficiency can be increased.
In operation 1450, each of the binary values of the input syntax element may be binary-arithmetic coded using the probability model selected in operation 1440.
Referring to
In operation 1600, whether an input bitstream is a bitstream for restoring a first syntax element of the specified unit may be determined. If the input bitstream is a bitstream for restoring the first syntax element, the process may proceed to operation 1610, and if the input bitstream is not a bitstream for restoring the first syntax element, the process may proceed to operation 1630, thereby omitting example initialization performed in operations 1610 and 1620.
In operation 1610, probability models of all syntax elements may be initialized.
In operation 1620, parameter values of the binary arithmetic decoder 1510, e.g., such as illustrated in
In operation 1630, a context index of each of binary values of the syntax element may be selected using adjacent context indexes.
In operation 1640, the binary values of the syntax element may be restored by performing binary-arithmetic decoding on each of the binary values of the syntax element using the context index selected in operation 1630.
In operation 1650, the syntax element may be restored by inverse-binarizing the binary values restored in operation 1640.
Context-based binary arithmetic coding and decoding methods using a single probability model, according to an embodiment of the present invention, will now be described in greater detail by referring to the above description. In particular, a syntax element used by the context-based binary arithmetic coding apparatus illustrated in
Referring to
Each of the binarization units 1700 and 1740 may binarize a syntax element for encoding a residue corresponding to a difference between a predicted image and a current image including at least two image components. In particular, the binarization unit 1700 may binarize a syntax element for encoding a residue of the luminance component, and the binarization unit 1740 may binarize a syntax element for encoding a residue of the chrominance component.
Each of the context index selectors 1710 and 1750 may select context index information of a binary value of the syntax element. In particular, in this embodiment, the context index selector 1710 selects context index information of a binary value of the syntax element for encoding the residue of the luminance component, and the context index selector 1750 selects context index information of a binary value of the syntax element for encoding the residue of the chrominance component.
Each of the binary arithmetic coders 1720 and 1760 may binary-arithmetic code the binary value of the syntax element using a probability model having the same syntax element probability value for the context index value of the image component selected by the corresponding context index selector 1710 or 1750. In particular, in this embodiment, the binary arithmetic coder 1720 binary-arithmetic codes the binary value of the syntax element using a probability model having the same syntax element probability value for the context index value of the residue of the luminance component selected by the context index selector 1710, and the binary arithmetic coder 1760 binary-arithmetic codes the binary value of the syntax element using a probability model having the same syntax element probability value for the context index value of the residue of the chrominance component selected by the context index selector 1750.
In particular, according to this embodiment, each of the binary arithmetic coders 1720 and 1760 binary-arithmetic codes CBP information, which is a kind of syntax element for encoding the residue of the corresponding image component, using the same probability model for the image components. The CBP information indicates whether residue data transformed and quantized per a predetermined-sized block is all 0 for each of the at least two image components.
In more detail, the binary arithmetic coders 1720 and 1760 respectively binary-arithmetic code a first component of interest and a second component of interest among the image components using the same probability model. That is, in this embodiment, the binary arithmetic coders 1720 and 1760 respectively binary-arithmetic code CBP information indicating whether residue data transformed and quantized per predetermined-sized block is all 0 for the first component of interest among the image components and CBP information indicating whether residue data transformed and quantized per predetermined-sized block having the same phase is all 0 for the second component of interest among the image components, using the same probability model. For example, the first component of interest may be the luminance component, and the second component of interest may be a chrominance component. Alternatively, the binary arithmetic coders 1720 and 1760 respectively binary-arithmetic code CBP information indicating whether residue data transformed and quantized per predetermined-sized block having the same phase for each of the image components is all 0, using the same probability model.
The probability model storage unit 1730 stores the probability model having the same syntax element probability value for the context index information of each of the at least two image components.
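As only an illustrative sketch, the role of a single probability model storage unit shared by the luminance and chrominance coding paths can be shown as follows. The class names `ProbabilityModel` and `SharedModelStore`, the function `code_bins`, and in particular the toy adaptation rule in `update` are assumptions of this example; the fragment records which model each binary value would be coded with and does not perform actual binary arithmetic coding.

```python
class ProbabilityModel:
    """One adaptive model: a state in [0, 63] plus the most probable symbol."""

    def __init__(self, state=0, mps=0):
        self.state = state
        self.mps = mps

    def update(self, bin_value):
        # Toy adaptation rule (an assumption, not a standardized table):
        # seeing the MPS strengthens the model, any other symbol weakens it.
        if bin_value == self.mps:
            self.state = min(self.state + 1, 63)
        elif self.state == 0:
            self.mps = 1 - self.mps
        else:
            self.state = max(self.state - 2, 0)


class SharedModelStore:
    """Holds one model per (syntax element, context index) for all components."""

    def __init__(self):
        self.models = {}

    def lookup(self, syntax_element, ctx_idx):
        key = (syntax_element, ctx_idx)   # the image component is not in the key
        return self.models.setdefault(key, ProbabilityModel())


def code_bins(store, component, syntax_element, bins_with_ctx):
    """Stand-in for a binary arithmetic coder: look up, record, adapt."""
    trace = []
    for bin_value, ctx_idx in bins_with_ctx:
        model = store.lookup(syntax_element, ctx_idx)
        trace.append((component, ctx_idx, model.state, model.mps, bin_value))
        model.update(bin_value)
    return trace
```

Because the image component is not part of the lookup key, statistics gathered while coding the luminance component are immediately available to the chrominance component, and the number of stored models does not grow with the number of components.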
Referring to
In operation 1800, whether an input syntax element is a first syntax element of the specified unit may be determined. If the input syntax element is the first syntax element, the process may proceed to operation 1810, and if the input syntax element is not the first syntax element, the process may proceed to operation 1830, thereby omitting initializations to be performed in operations 1810 and 1820.
In operation 1810, probability models of all syntax elements may be initialized.
In operation 1820, parameter values of the binary arithmetic coders 1720 and 1760 may be initialized.
In operation 1830, a syntax element of each of the image components may be binarized.
In operation 1840, a context index of each of the binary values binarized in operation 1830 may be selected using adjacent context indexes.
In operation 1850, the binary values of the syntax elements may be binary-arithmetic coded, using the probability model having the same syntax element probability value for a context index value of each of the image components, which has been selected in operation 1840.
Referring to
Each of the context index selectors 1900 and 1940 selects context index information of a binary value of a syntax element for encoding a residue corresponding to a difference between a predicted image and a current image comprised of at least two image components.
Each of the binary arithmetic decoders 1910 and 1950 restores the binary value of the syntax element by performing binary-arithmetic decoding on the binary value of the syntax element using a probability model having the same syntax element probability value for the context index information of the image component selected by the corresponding context index selector 1900 or 1940. In particular, according to this embodiment, each of the binary arithmetic decoders 1910 and 1950 may perform binary-arithmetic decoding on CBP information, which is a kind of syntax element for decoding the residue of the corresponding image component, using the same probability model for the image components. The CBP information indicates whether residue data transformed and quantized per a predetermined-sized block is all 0's for each of the image components.
In more detail, the binary arithmetic decoders 1910 and 1950 respectively perform binary-arithmetic decoding on a first component of interest and a second component of interest among the image components using the same probability model. For example, the binary arithmetic decoders 1910 and 1950 respectively binary-arithmetic decode CBP information indicating whether residue data transformed and quantized per predetermined-sized block is all 0's for the first component of interest among the image components and CBP information indicating whether residue data transformed and quantized per predetermined-sized block having the same phase is all 0's for the second component of interest among the image components, using the same probability model. Alternatively, the binary arithmetic decoders 1910 and 1950 may respectively binary-arithmetic decode CBP information indicating whether residue data transformed and quantized per predetermined-sized block having the same phase for each of the image components is all 0's, using the same probability model, for example.
The inverse binarization units 1920 and 1960 restore the syntax elements by inverse-binarizing the binary values of the syntax elements restored by the binary arithmetic decoders 1910 and 1950.
The probability model storage unit 1930 stores the probability model having the same syntax element probability value for the context index information of each of the at least two image components.
Referring to
In operation 2000, whether an input bitstream is a bitstream for restoring a first syntax element of the specified unit may be determined. If it is determined that the input bitstream is a bitstream for restoring the first syntax element, the process may proceed to operation 2010, and if the input bitstream is not a bitstream for restoring the first syntax element, the process may proceed to operation 2030 by omitting initializations performed in operations 2010 and 2020.
In operation 2010, probability models of all syntax elements may be initialized.
In operation 2020, parameter values of the binary arithmetic decoders 1910 and 1950 illustrated in
In operation 2030, a context index of a binary value of a syntax element of each of the image components may be selected using adjacent context indexes.
In operation 2040, the binary values of the syntax elements may be restored by performing binary-arithmetic decoding on the binary values of the syntax elements, using the probability model having the same syntax element probability value for context index information of each of the image components, which has been selected in operation 2030.
In operation 2050, the syntax elements may be restored by inverse-binarizing the binary values restored in operation 2040.
ctxIdx=ctxIdxOffset+ctxIdxInc Equation 5
Here, in general, ctxIdxInc may be obtained using context index information obtained from spatially adjacent blocks in up and left directions in order to obtain context index information of a current block as illustrated in
As illustrated in
Here, an initial state value is determined by sliceQP, which is a QP value of a slice, m, and n and has a range of [0, 63]. If a state value is close to 0, a probability of an MPS of a binary value of a syntax element is close to ½, and if the state value is close to 63, the probability is close to 1. As an example, values of m and n indicating the same probability model used for syntax elements coded_block_pattern of luminance and chrominance components, which may be stored in the probability model storage units 1730 and 1930 illustrated in
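As only an illustrative sketch, an initial state and MPS can be derived from m, n, and sliceQP as follows. The equation of the embodiment is not reproduced in this text, so the fragment assumes the initialization rule used by context-based adaptive binary arithmetic coding in H.264/AVC; the function names and the (m, n) pairs used to exercise it are hypothetical.

```python
def clip3(lo, hi, value):
    """Clamp value into the closed range [lo, hi]."""
    return max(lo, min(hi, value))


def init_probability_state(m, n, slice_qp):
    """Return (state, mps) for one context, assuming the H.264/AVC rule."""
    # Linear model in the slice QP, clamped to a 7-bit pre-state.
    pre_state = clip3(1, 126, ((m * clip3(0, 51, slice_qp)) >> 4) + n)
    if pre_state <= 63:
        return 63 - pre_state, 0   # MPS is 0
    return pre_state - 64, 1       # MPS is 1
```

A pre-state near 63 or 64 gives a state near 0, that is, an MPS probability close to ½, while a pre-state near either end of its range gives a large state, that is, an MPS probability close to 1.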
A method of selecting context index information in the context index selectors 1710, 1750, 1900, and 1940 illustrated in
For syntax elements coded_block_flag, significant_coeff_flag, last_significant_coeff_flag, and coeff_abs_level_minus1, unlike coded_block_pattern, which uses Equation 5 or another method of obtaining context index information of a syntax element, context index information may be obtained by the below Equation 7, for example.
ctxIdx=ctxIdxOffset+ctxBlockCatOffset(ctxBlockCat)+ctxIdxInc Equation 7
Here, ctxBlockCat is a value to differently use context index information for each of block types, e.g., according to the encoding modes illustrated in
The item ctxBlockCatOffset(ctxBlockCat) of Equation 7 indicates a starting value of context index information corresponding to each block type when ctxBlockCat is defined, and is illustrated in
For syntax elements coded_block_flag, significant_coeff_flag, last_significant_coeff_flag, and coeff_abs_level_minus1, which are syntax elements of image components, as illustrated in
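As only an illustrative sketch, Equations 5 and 7 can be evaluated as follows. The rule that forms ctxIdxInc from the left and upper neighbouring blocks as left + 2 × up is an assumption of this example, as are the function names and every numeric offset used to exercise the code.

```python
def ctx_idx_inc_from_neighbours(left_flag, up_flag):
    """Combine two neighbour flags into an increment in 0..3 (assumed rule)."""
    return left_flag + 2 * up_flag


def ctx_idx_eq5(ctx_idx_offset, ctx_idx_inc):
    """Equation 5: ctxIdx = ctxIdxOffset + ctxIdxInc."""
    return ctx_idx_offset + ctx_idx_inc


def ctx_idx_eq7(ctx_idx_offset, block_cat_offsets, ctx_block_cat, ctx_idx_inc):
    """Equation 7: adds a per-block-type starting value to Equation 5."""
    return ctx_idx_offset + block_cat_offsets[ctx_block_cat] + ctx_idx_inc


# Hypothetical values, chosen only to exercise the functions.
EXAMPLE_OFFSET = 85
EXAMPLE_BLOCK_CAT_OFFSETS = [0, 4, 8, 12, 16]
```

Since ctxBlockCat depends on the block type and not on the image component, luminance and chrominance blocks of the same type resolve to the same context index and therefore to the same probability model.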
Among syntax elements for encoding a residue, coeff_abs_level_minus1 will now be described in greater detail. If a value to be currently encoded is a binary number at a first location of binarized data, ctxIdxInc may be selected by the below Equation 8, for example.
ctxIdxInc=((numDecodAbsLevelGt1!=0)?0:Min(N,1+numDecodAbsLevelEq1)) Equation 8
Here, ctxIdxInc denotes a value designating a selected context index, numDecodAbsLevelGt1 denotes the number of quantization transform coefficient values previously decoded greater than 1, numDecodAbsLevelEq1 denotes the number of quantization transform coefficient values previously decoded equal to 1, N denotes the number of context index values for the binary number at the first location, “!=” denotes “not equal”, “?” denotes “if ˜ then”, “:” denotes “else”. If the value to be currently encoded is not the binary number at the first location, ctxIdxInc may be selected by the below Equation 9, for example.
ctxIdxInc=N+Min(M,numDecodAbsLevelGt1) Equation 9
Here, M denotes the number of context index values for binary numbers that are not placed at the first location. The context-based binary arithmetic coding apparatus, e.g., illustrated in
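As only an illustrative sketch, Equations 8 and 9 translate directly into the following two functions. The function and parameter names are hypothetical, and the values chosen for N and M when exercising the code are examples only.

```python
def ctx_idx_inc_first_bin(num_decod_abs_level_gt1, num_decod_abs_level_eq1, n):
    """Equation 8: increment for the binary number at the first location."""
    if num_decod_abs_level_gt1 != 0:
        return 0
    return min(n, 1 + num_decod_abs_level_eq1)


def ctx_idx_inc_other_bins(num_decod_abs_level_gt1, n, m):
    """Equation 9: increment for binary numbers after the first location."""
    return n + min(m, num_decod_abs_level_gt1)
```

Once a coefficient greater than 1 has been seen, the first binary number always uses increment 0, while the remaining binary numbers move through at most M further contexts as more large coefficients are counted.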
In
The syntax element coded_block_pattern may contain all of CBP information of the luminance and chrominance components and can be represented by the below Equation 10, for example.
CodedBlockPatternLuma=coded_block_pattern % 16;
CodedBlockPatternChroma=coded_block_pattern/16;
CodedBlockPatternLuma=(CBP(3)<<3)+(CBP(2)<<2)+(CBP(1)<<1)+CBP(0);
CodedBlockPatternChroma=(CodedBlockPatternChroma444[0]<<4)+CodedBlockPatternChroma444[1];
CodedBlockPatternChroma444[0]=(CBP(7)<<3)+(CBP(6)<<2)+(CBP(5)<<1)+CBP(4);
CodedBlockPatternChroma444[1]=(CBP(11)<<3)+(CBP(10)<<2)+(CBP(9)<<1)+CBP(8);
CodedBlockPatternChroma444[iCbCr] (iCbCr==0, or 1) Equation 10
Here, according to this embodiment, CodedBlockPatternLuma contains CBP information of the luminance component, CodedBlockPatternChroma contains CBP information of the chrominance component, and CodedBlockPatternChroma includes CodedBlockPatternChroma444[iCbCr] (the U component if iCbCr==0, the V component if iCbCr==1), which is CBP information of the chrominance component U (or Cb) or V (or Cr).
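As only an illustrative sketch, the packing and unpacking expressed by Equation 10 can be written as follows. The fragment follows the individual lines of Equation 10 literally, with each shift applied before the additions; the function names and the representation of CBP(0) through CBP(11) as a list of twelve flags are assumptions of this example.

```python
def pack_nibble(bits):
    """bits[0] is the lowest CBP index of the group, bits[3] the highest."""
    return (bits[3] << 3) + (bits[2] << 2) + (bits[1] << 1) + bits[0]


def pack_coded_block_pattern(cbp_bits):
    """cbp_bits is a list of the twelve flags CBP(0)..CBP(11)."""
    luma = pack_nibble(cbp_bits[0:4])
    chroma444 = [pack_nibble(cbp_bits[4:8]), pack_nibble(cbp_bits[8:12])]
    chroma = (chroma444[0] << 4) + chroma444[1]
    # Inverse of the "% 16" and "/ 16" lines of Equation 10.
    return chroma * 16 + luma


def unpack_coded_block_pattern(coded_block_pattern):
    """Split coded_block_pattern into the luma part and the two chroma parts."""
    luma = coded_block_pattern % 16
    chroma = coded_block_pattern // 16
    chroma444 = [chroma >> 4, chroma & 15]
    return luma, chroma444
```

Unpacking a value produced by `pack_coded_block_pattern` returns the same luminance and per-component chrominance patterns, so the two functions are consistent with each other under this reading.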
For the encoding modes excluding the I16×16 encoding mode, a detailed meaning of CBP information of the syntax element coded_block_pattern for a luminance component of each predetermined-sized block, according to an embodiment of the present invention, is illustrated in
For the encoding modes excluding the I16×16 encoding mode in a 4:4:4 image, a detailed meaning of CBP information, according to an embodiment of the present invention, of the syntax element coded_block_pattern for a chrominance component of each predetermined-sized block is illustrated in
In
The syntax element coded_block_pattern may contain all of CBP information of the luminance and chrominance components and can be represented by the following Equation 11, for example.
CodedBlockPatternLuma=coded_block_pattern % 16;
CodedBlockPatternChroma=coded_block_pattern/16;
CodedBlockPatternLuma=(CBP(3)<<3)+(CBP(2)<<2)+(CBP(1)<<1)+CBP(0);
CodedBlockPatternChroma=(CBP(7)<<3)+(CBP(6)<<2)+(CBP(5)<<1)+CBP(4); Equation 11
Here, CodedBlockPatternLuma contains CBP information of the luminance component, CodedBlockPatternChroma contains CBP information of the chrominance component.
For the encoding modes excluding the I16×16 encoding mode, the detailed meaning of CBP information of the syntax element coded_block_pattern for a luminance component of each predetermined-sized block is illustrated in
For the encoding modes excluding the I16×16 encoding mode in a 4:4:4 image, the detailed meaning of CBP information of the syntax element coded_block_pattern for a chrominance component of each predetermined-sized block is illustrated in
In
The syntax element coded_block_pattern contains all of CBP information of the luminance and chrominance components and can be represented by the below Equation 12, for example.
CodedBlockPatternLuma=coded_block_pattern;
CodedBlockPatternLuma=(CBP(3)<<3)+(CBP(2)<<2)+(CBP(1)<<1)+CBP(0); Equation 12
Here, CodedBlockPatternLuma contains CBP information of the same phase of the luminance and chrominance components.
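As only an illustrative sketch, a single four-bit pattern covering blocks of the same phase in all image components, as in Equation 12, could be formed as follows. The rule that a bit is set when any component has non-zero residue data in the block of that phase is an assumption of this example, as are the function name and the dictionary layout.

```python
def joint_coded_block_pattern(per_component_flags):
    """per_component_flags maps a component name to its flags CBP(0)..CBP(3).

    A flag is 1 when that component has non-zero residue data in the block.
    """
    pattern = 0
    for block in range(4):
        # Assumed rule: the shared bit is set if ANY component is non-zero.
        if any(flags[block] for flags in per_component_flags.values()):
            pattern |= 1 << block
    return pattern
```

Under this assumption one flag per block position replaces three, which is why only the four-bit CodedBlockPatternLuma appears in Equation 12.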
For the encoding modes excluding the I16×16 encoding mode, the detailed meaning of CBP information of the syntax element coded_block_pattern for luminance and chrominance components of each predetermined-sized block is illustrated in
In addition to the above described embodiments, embodiments of the present invention can also be implemented through computer readable code/instructions in/on a medium, e.g., a computer readable medium, to control at least one processing element to implement any above described embodiment. The medium can correspond to any medium/media permitting the storing and/or transmission of the computer readable code.
The computer readable code can be recorded/transferred on a medium in a variety of ways, with examples of the medium including magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.), optical recording media (e.g., CD-ROMs, or DVDs), and storage/transmission media such as carrier waves, as well as through the Internet, for example. Here, the medium may further be a signal, such as a resultant signal or bitstream, according to embodiments of the present invention. The media may also be a distributed network, so that the computer readable code is stored/transferred and executed in a distributed fashion. Still further, as only an example, the processing element could include a processor or a computer processor, and processing elements may be distributed and/or included in a single device.
As described above, according to an embodiment of the present invention, when prediction encoding of each component of a color image is performed using spatially or temporally adjacent pixels, encoding efficiency can be increased by applying the same prediction method to each image component. For example, in the case where each component has the same resolution when a YUV (or YCbCr) color image is used, encoding efficiency can be increased by using the same prediction method as the Y component without sampling chrominance components U (or Cb) and V (or Cr) by the conventional ¼ sampling.
In addition, according to an embodiment of the present invention, when an RGB color image is used and a current RGB image is encoded in an RGB domain without being transformed to a YUV (or YCbCr) image, encoding efficiency can be increased while maintaining high image quality by performing spatial prediction and temporal prediction of image components according to a statistical characteristic of the RGB image. In addition, by providing a context-based binary arithmetic coding method using a single probability model and a method of encoding CBP information on a predetermined-sized block basis, in which entropy encoding/decoding is performed using a single probability model for image components with respect to a residue obtained using an encoding method, encoding efficiency can be increased without an increase in complexity.
In addition, by effectively compressing RGB video images, which can be directly acquired from a device, without transforming them to YUV (or YCbCr) images, a loss of image quality, such as distortion of colors occurring when a transform to the YUV (or YCbCr) domain is performed, can be prevented due to the direct encoding in the RGB domain, and thus the method is suitable for application to digital cinema and digital archives requiring high-quality image information.
Although a few embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.
Claims
1. An apparatus for decoding a predicted image of a current image, the apparatus comprising:
- a processor to obtain a predicted direction of luma component of the current image and a predicted direction of chroma components corresponding to the predicted direction of luma component in one of predetermined decoding modes; and
- a decoder to decode the predicted image of the current image based on the predicted direction of the luma and chroma components,
- wherein the predicted direction of chroma components is identical to the predicted direction of the luma component.
Type: Application
Filed: Oct 20, 2014
Publication Date: May 11, 2017
Patent Grant number: 9674541
Applicant: Samsung Electronics Co., Ltd. (Suwon-si)
Inventors: Dae-sung CHO (Yongin-si), Hyun-mun Kim (Yongin-si), Woo-shik Kim (Yongin-si), Dmitri Birinov (Yongin-si)
Application Number: 14/518,205