Coding unit quantization parameters in video coding
A method is provided that includes receiving a coded largest coding unit in a video decoder, wherein the coded largest coding unit includes a coded coding unit structure and a plurality of coded quantization parameters, and decoding the coded largest coding unit based on the coded coding unit structure and the plurality of coded quantization parameters.
This application is a continuation of application Ser. No. 14/531,632, filed Nov. 3, 2014, which is a continuation of application Ser. No. 13/093,715, filed Apr. 25, 2011, which claims benefit of U.S. Provisional Patent Application No. 61/331,216, filed May 4, 2010, of U.S. Provisional Patent Application No. 61/431,889, filed Jan. 12, 2011, and of U.S. Provisional Patent Application No. 61/469,518, filed Mar. 30, 2011, all of which are incorporated herein by reference in their entirety.
BACKGROUND OF THE INVENTION
The demand for digital video products continues to increase. Some examples of applications for digital video include video communication, security and surveillance, industrial automation, and entertainment (e.g., DV, HDTV, satellite TV, set-top boxes, Internet video streaming, video gaming devices, digital cameras, cellular telephones, video jukeboxes, high-end displays and personal video recorders). Further, video applications are becoming increasingly mobile as a result of higher computation power in handsets, advances in battery technology, and high-speed wireless connectivity.
Video compression, i.e., video coding, is an essential enabler for digital video products as it enables the storage and transmission of digital video. In general, current video coding standards define video compression techniques that apply prediction, transformation, quantization, and entropy coding to sequential blocks of pixels, i.e., macroblocks, in a video sequence to compress, i.e., encode, the video sequence. A macroblock is defined as a 16×16 rectangular block of pixels in a frame or slice of a video sequence where a frame is defined to be a complete image captured during a known time interval.
A quantization parameter (QP) may be used to modulate the step size of the quantization for each macroblock. For example, in H.264/AVC, quantization of a transform coefficient involves dividing the coefficient by a quantization step size. The quantization step size, which may also be referred to as the quantization scale, is defined by the standard based on the QP value, which may be an integer within some range 0 . . . 51. A step size for a QP value may be determined, for example, using a table lookup and/or by computational derivation.
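By way of illustration only, the computational derivation of a step size from a QP value can be sketched as follows. The sketch assumes the commonly cited H.264/AVC relationship in which six base step sizes cover QP values 0 through 5 and the step size doubles for every increase of 6 in QP. The names BASE_QSTEP and qstep_from_qp are hypothetical and are not taken from any standard or from this description.

```python
# Illustrative QP-to-step-size mapping in the style of H.264/AVC.
# Assumption: six base step sizes for QP 0..5, doubling every 6 QP.
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep_from_qp(qp: int) -> float:
    """Return the quantization step size for an integer QP in 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    # qp % 6 selects the base step; qp // 6 counts the number of doublings.
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))
```

Under this assumption, an increase of 6 in QP doubles the step size, e.g., qstep_from_qp(10) is twice qstep_from_qp(4).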
The quality and bit rate of the compressed bit stream is largely determined by the QP value selected for quantizing each macroblock. That is, the quantization step size (Qs) used to quantize a macroblock regulates how much spatial detail is retained in a compressed macroblock. The smaller the Qs, the more retention of detail and the better the quality but at the cost of a higher bit rate. As the Qs increases, less detail is retained and the bit rate decreases but at the cost of increased distortion and loss of quality.
Particular embodiments will now be described, by way of example only, and with reference to the accompanying drawings:
Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. Like elements in the various figures are denoted by like reference numerals for consistency.
As was previously discussed, in current video coding standards such as H.264/AVC, the coding operations of prediction, transformation, quantization, and entropy coding are performed based on fixed size 16×16 blocks referred to as macroblocks. Further, a quantization parameter is generated for each macroblock with no provision for doing so for larger or smaller blocks. For larger frame sizes, e.g., frame sizes used for high definition video, using a larger block size for the block-based coding operations may provide better coding efficiency and/or reduce data transmission overhead. For example, a video sequence with a 1280×720 frame size and a frame rate of 60 frames per second is 36 times larger and 4 times faster than a video sequence with a 176×144 frame size and a frame rate of 15 frames per second. A block size larger than 16×16 would allow a video encoder to take advantage of the increased spatial and/or temporal redundancy in the former video sequence. Such larger block sizes are currently proposed in the emerging next generation video standard referred to as High Efficiency Video Coding (HEVC). HEVC is the proposed successor to H.264/MPEG-4 AVC (Advanced Video Coding), currently under development by a Joint Collaborative Team on Video Coding (JCT-VC) established by the ISO/IEC Moving Picture Experts Group (MPEG) and ITU-T Video Coding Experts Group (VCEG).
However, an increased block size may adversely affect rate control. That is, many rate control techniques manage QP on a block-by-block basis according to the available space in a hypothetical transmission buffer. Increasing the block size reduces the granularity at which rate control can adjust the value of QP, thus possibly making rate control more difficult and/or adversely affecting quality. Further, reducing the granularity at which QP can change by increasing the block size impacts the visual quality performance of perceptual rate control techniques that adapt the QP based on the activity in a block.
Embodiments described herein provide for block-based video coding with a large block size, e.g., larger than 16×16, in which multiple quantization parameters for a single block may be generated. More specifically, a picture (or slice) is divided into non-overlapping blocks of pixels referred to as largest coding units (LCU). As used herein, the term “picture” refers to a frame or a field of a frame. A frame is a complete image captured during a known time interval. A slice is a subset of sequential LCUs in a picture. An LCU is the base unit used for block-based coding. That is, an LCU plays a similar role in coding as the prior art macroblock, but it may be larger, e.g., 32×32, 64×64, 128×128, etc. For purposes of quantization, the LCU is the largest unit in a picture for which a quantization parameter (QP) may be generated.
As part of the coding process, various criteria, e.g., rate control criteria, complexity considerations, rate distortion constraints, etc., may be applied to partition an LCU into coding units (CU). A CU is a block of pixels within an LCU and the CUs within an LCU may be of different sizes. After the CU partitioning, i.e., the CU structure, is identified, a QP is generated for each CU. Block-based coding is then applied to the LCU to code the CUs. As part of the coding, the QPs are used in the quantization of the corresponding CUs. The CU structure and the QPs are also coded for communication, i.e., signaling, to a decoder.
In some embodiments, QP values are communicated to a decoder in a compressed bit stream as delta QP values. Techniques for computing the delta QPs and for controlling the spatial granularity at which delta QPs are signaled are also provided. In some embodiments, more than one technique for computing the delta QP values may be used in coding a single video sequence. In such embodiments, the technique used may be signaled in a compressed bit stream at the appropriate level, e.g., sequence, picture, slice, and/or LCU.
The video encoder component 106 receives a video sequence from the video capture component 104 and encodes it for transmission by the transmitter component 108. The video encoder component 106 receives the video sequence from the video capture component 104 as a sequence of frames, divides the frames into LCUs, and encodes the video data in the LCUs. The video encoder component 106 may be configured to apply one or more techniques for generating and encoding multiple quantization parameters for an LCU during the encoding process as described herein. Embodiments of the video encoder component 106 are described in more detail below in reference to
The transmitter component 108 transmits the encoded video data to the destination digital system 102 via the communication channel 116. The communication channel 116 may be any communication medium, or combination of communication media suitable for transmission of the encoded video sequence, such as, for example, wired or wireless communication media, a local area network, or a wide area network.
The destination digital system 102 includes a receiver component 110, a video decoder component 112 and a display component 114. The receiver component 110 receives the encoded video data from the source digital system 100 via the communication channel 116 and provides the encoded video data to the video decoder component 112 for decoding. The video decoder component 112 reverses the encoding process performed by the video encoder component 106 to reconstruct the LCUs of the video sequence. The video decoder component may be configured to apply one or more techniques for decoding multiple quantization parameters for an LCU during the decoding process as described herein. Embodiments of the video decoder component 112 are described in more detail below in reference to
The reconstructed video sequence is displayed on the display component 114. The display component 114 may be any suitable display device such as, for example, a plasma display, a liquid crystal display (LCD), a light emitting diode (LED) display, etc.
In some embodiments, the source digital system 100 may also include a receiver component and a video decoder component and/or the destination digital system 102 may include a transmitter component and a video encoder component for transmission of video sequences in both directions for video streaming, video broadcasting, and video telephony. Further, the video encoder component 106 and the video decoder component 112 may perform encoding and decoding in accordance with one or more video compression standards. The video encoder component 106 and the video decoder component 112 may be implemented in any suitable combination of software, firmware, and hardware, such as, for example, one or more digital signal processors (DSPs), microprocessors, discrete logic, application specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), etc.
As was previously mentioned, an LCU may be partitioned into coding units (CU) during the coding process. For simplicity of explanation in describing embodiments, a recursive quadtree structure is assumed for partitioning of LCUs into CUs. One of ordinary skill in the art will understand embodiments in which other partitioning structures are used. In the recursive quadtree structure, a CU may be square. Accordingly, an LCU is also square. A picture is divided into non-overlapped LCUs. Given that a CU is square, the CU structure within an LCU can be a recursive quadtree structure adapted to the frame. That is, each time a CU (or LCU) is partitioned, it is divided into four equal-sized square blocks. Further, a given CU can be characterized by the size of the LCU and the hierarchical depth of the LCU where the CU occurs. The maximum hierarchical depth is determined by the size of the smallest CU (SCU) permitted.
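By way of illustration only, the relationship between the LCU size, the hierarchical depth, and the CU size in the recursive quadtree structure can be sketched as follows. The sketch assumes that the LCU and SCU sizes are powers of two; the function names are hypothetical.

```python
def max_depth(lcu_size: int, scu_size: int) -> int:
    """Maximum hierarchical depth, set by the smallest CU (SCU) permitted.

    Assumes lcu_size and scu_size are powers of two with lcu_size >= scu_size.
    """
    return (lcu_size // scu_size).bit_length() - 1


def cu_size_at_depth(lcu_size: int, depth: int) -> int:
    """Each split halves the width and height, so size = LCU size / 2**depth."""
    return lcu_size >> depth


def allowed_cu_sizes(lcu_size: int, scu_size: int) -> list:
    """All CU sizes permitted in the quadtree, from the LCU down to the SCU."""
    return [cu_size_at_depth(lcu_size, d)
            for d in range(max_depth(lcu_size, scu_size) + 1)]
```

For a 64×64 LCU and an 8×8 SCU, this sketch yields a maximum depth of 3 and the CU sizes 64, 32, 16, and 8.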
As shown in
The rate control component 344 receives an LCU from the coding control component 340 and applies various criteria to the LCU to determine one or more QPs to be used by the LCU processing component 342 in coding the LCU. More specifically, the rate control component 344 partitions the LCU into CUs of various sizes within the recursive quadtree structure based on the various criteria to determine the granularity at which QPs should be applied and then computes a QP for each CU that is not further subdivided, i.e., for each coding unit that is a leaf node in the quadtree. The CU structure of the LCU and the QPs are provided to the coding control component 340.
The QPs applied to an LCU during the coding of the LCU will be signaled in the compressed bit stream. To minimize the amount of information signaled in the compressed bit stream, it may be desirable to constrain the granularity at which QPs may be applied in an LCU. Recall that the SCU size sets the size of the smallest CU in the recursive quadtree structure. In some embodiments, a minimum QP CU size may be specified in addition to the LCU and SCU sizes. In such embodiments, the smallest CU that the rate control component 344 can use in partitioning an LCU is limited by the minimum QP CU size rather than the SCU size. Thus, the minimum QP CU size may be set to sizes larger than the SCU to constrain the granularity at which QPs may be applied. For example, if the LCU is assumed to be 64×64 and the SCU is assumed to be 8×8, the four possible CU sizes allowed in the recursive quadtree structure are 64×64, 32×32, 16×16, and 8×8. Without the minimum QP CU size constraint, the rate control component 344 can generate QPs for CUs as small as 8×8. However, if a minimum QP CU size of 16×16 is specified, the rate control component 344 can generate QPs for CUs as small as 16×16 but no smaller. The minimum QP CU size may be set at the sequence, picture, slice, and/or LCU level and signaled in the compressed bit stream accordingly.
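By way of illustration only, the effect of a minimum QP CU size on the granularity of QPs can be sketched as follows. The function and parameter names are hypothetical, and power-of-two sizes are assumed.

```python
def qp_cu_sizes(lcu_size, scu_size, min_qp_cu_size=None):
    """CU sizes for which rate control may generate a QP.

    Without a minimum QP CU size the limit is the SCU size; with one,
    the limit is the larger of the two.
    """
    smallest = scu_size if min_qp_cu_size is None else max(scu_size, min_qp_cu_size)
    sizes = []
    size = lcu_size
    while size >= smallest:
        sizes.append(size)
        size //= 2
    return sizes


def max_qps_per_lcu(lcu_size, scu_size, min_qp_cu_size=None):
    """Upper bound on the number of QPs that may be signaled for one LCU."""
    smallest = qp_cu_sizes(lcu_size, scu_size, min_qp_cu_size)[-1]
    return (lcu_size // smallest) ** 2
```

In this sketch, a 64×64 LCU with an 8×8 SCU may carry up to 64 QPs, while specifying a minimum QP CU size of 16×16 lowers that bound to 16.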
Referring again to
For example, assume an image in which the top half is sky and the bottom half is trees. In the top half of the image, most of the region is totally flat, so a low QP value should be used. It may be possible to use one QP value for an entire LCU in that part of the image as an LCU may be only sky. In the bottom half of the image, most of the region is busy, so a higher QP value can be used. Further, it may be possible to use one QP value for an entire LCU in that region, as an LCU may have only trees.
However, there will be transition regions in which LCUs will have both sky and trees. In such LCUs, there may be regions that are sky and regions that are trees. Such an LCU may be partitioned into CUs sized based on activity (within the limits of the quadtree coding structure). For example, an LCU may be divided into four CUs A, B, C, and D, and the activity level in areas of each CU may then be analyzed. If a CU, say CU A, has regions of widely varying activity levels, then CU A may be further divided into four CUs, A1, A2, A3, and A4 in an attempt to reduce the variance in activity level over the area where a QP will be applied. These four CUs may each also be further divided into four CUs based on activity. Once the CU partitioning is complete, QP values may then be computed for each CU.
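By way of illustration only, an activity-based quadtree partitioning of this kind can be sketched as follows. The sketch makes several assumptions that are not part of this description: activity is approximated by the population variance of the pixel values in a block, a single fixed threshold decides whether to split, and the split test is applied at every level starting with the LCU itself. All names are hypothetical.

```python
from statistics import pvariance


def block_pixels(picture, x, y, size):
    """Flatten the size x size block whose upper left corner is (x, y)."""
    return [picture[r][c] for r in range(y, y + size) for c in range(x, x + size)]


def partition(picture, x, y, size, min_size, threshold):
    """Return the leaf CUs of the quadtree as (x, y, size) tuples in Z order.

    A block is split into four equal squares while its activity (here,
    pixel variance; an assumption) exceeds the threshold and the block is
    larger than min_size (the SCU or minimum QP CU size).
    """
    activity = pvariance(block_pixels(picture, x, y, size))
    if size <= min_size or activity <= threshold:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):          # top row of sub-blocks, then bottom row
        for dx in (0, half):      # left sub-block, then right sub-block
            leaves += partition(picture, x + dx, y + dy, half, min_size, threshold)
    return leaves


# Hypothetical 8x8 picture: flat everywhere except a busy lower right quadrant.
picture = [[100 if (r >= 4 and c >= 4 and (r + c) % 2) else 0 for c in range(8)]
           for r in range(8)]
leaves = partition(picture, 0, 0, 8, min_size=2, threshold=10)
```

In this example the three flat quadrants remain 4×4 CUs, while the busy quadrant is divided down to the 2×2 minimum, so a QP would be computed for each of the seven resulting leaf CUs.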
The coding control component 340 provides information regarding the initial LCU CU structure and the QPs determined by the rate control component 344 to the various components of the LCU processing component 342 as needed. For example, the coding control component 340 may provide the LCU and SCU size to the entropy encoder component 334 for inclusion in the compressed video stream at the appropriate point. In another example, the coding control component 340 may generate a quantization parameter array for use by the quantize component 306 and store the quantization parameter array in the memory 346. The size of the quantization parameter array may be determined based on the maximum possible number of CUs in an LCU. For example, assume the size of the SCU is 8×8 and the size of the LCU is 64×64. Thus, the maximum possible number of CUs in an LCU is 64. The quantization parameter array is sized to hold a QP for each of these 64 possible coding units, i.e., is an 8×8 array. The QPs computed by the rate control component 344 are mapped into this array based on the CU structure. As is explained in more detail herein in reference to the quantize component 306, a QP for any size CU in the LCU may be located in this array using the coordinates of the upper left hand corner of the CU in the LCU.
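By way of illustration only, the mapping of QPs into a quantization parameter array can be sketched as follows. The sketch assumes that the QP of each leaf CU is replicated into every SCU-sized cell that the CU covers, which is one way to make a lookup by upper left corner work for a CU of any size. The data layout, the function name, and the QP values are hypothetical.

```python
def build_qp_array(lcu_size, scu_size, leaf_qps):
    """Build an (LCU/SCU) x (LCU/SCU) array holding one QP per SCU-sized cell.

    leaf_qps is a list of (x, y, size, qp) tuples, one per leaf CU of the
    initial CU structure, with (x, y) the upper left corner in the LCU.
    """
    n = lcu_size // scu_size
    qp_array = [[None] * n for _ in range(n)]
    for x, y, size, qp in leaf_qps:
        # Replicate the CU's QP into every cell the CU covers (assumption).
        for row in range(y // scu_size, (y + size) // scu_size):
            for col in range(x // scu_size, (x + size) // scu_size):
                qp_array[row][col] = qp
    return qp_array


# Hypothetical CU structure for a 64x64 LCU with an 8x8 SCU:
# three 32x32 CUs and one 32x32 region split into four 16x16 CUs.
leaf_qps = [(0, 0, 32, 20), (32, 0, 32, 24), (0, 32, 32, 28),
            (32, 32, 16, 30), (48, 32, 16, 31),
            (32, 48, 16, 32), (48, 48, 16, 33)]
qp_array = build_qp_array(64, 8, leaf_qps)
```

With these sizes the array is 8×8, and every cell is filled because the leaf CUs tile the LCU.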
Referring again to
The storage component 318 provides reference data to the motion estimation component 320 and to the motion compensation component 322. The reference data may include one or more previously encoded and decoded CUs, i.e., reconstructed CUs.
The motion estimation component 320 provides motion estimation information to the motion compensation component 322 and the entropy encoder 334. More specifically, the motion estimation component 320 performs tests on CUs in an LCU based on multiple temporal prediction modes using reference data from storage 318 to choose the best motion vector(s)/prediction mode based on a coding cost. To perform the tests, the motion estimation component 320 may begin with the CU structure provided by the coding control component 340. The motion estimation component 320 may divide each CU indicated in the CU structure into prediction units according to the unit sizes of prediction modes and calculate the coding costs for each prediction mode for each CU.
For coding efficiency, the motion estimation component 320 may also decide to alter the CU structure by further partitioning one or more of the CUs in the CU structure. That is, when choosing the best motion vectors/prediction modes, in addition to testing with the initial CU structure, the motion estimation component 320 may also choose to divide the larger CUs in the initial CU structure into smaller CUs (within the limits of the recursive quadtree structure), and calculate coding costs at lower levels in the coding hierarchy. As will be explained below in reference to the quantizer component 306, any changes made to the CU structure do not affect how the QPs computed by the rate control component 344 are applied. If the motion estimation component 320 changes the initial CU structure, the modified CU structure is communicated to other components in the LCU processing component 342 that need the information.
The motion estimation component 320 provides the selected motion vector (MV) or vectors and the selected prediction mode for each inter predicted CU to the motion compensation component 322 and the selected motion vector (MV) to the entropy encoder 334. The motion compensation component 322 provides motion compensated inter prediction information to a selector switch 326 that includes motion compensated inter predicted CUs and the selected temporal prediction modes for the inter predicted CUs. The coding costs of the inter predicted CUs are also provided to the mode selector component (not shown).
The intra prediction component 324 provides intra prediction information to the selector switch 326 that includes intra predicted CUs and the corresponding spatial prediction modes. That is, the intra prediction component 324 performs spatial prediction in which tests based on multiple spatial prediction modes are performed on CUs in an LCU using previously encoded neighboring CUs of the picture from the buffer 328 to choose the best spatial prediction mode for generating an intra predicted CU based on a coding cost. To perform the tests, the intra prediction component 324 may begin with the CU structure provided by the coding control component 340. The intra prediction component 324 may divide each CU indicated in the CU structure into prediction units according to the unit sizes of the spatial prediction modes and calculate the coding costs for each prediction mode for each CU.
For coding efficiency, the intra prediction component 324 may also decide to alter the CU structure by further partitioning one or more of the CUs in the CU structure. That is, when choosing the best prediction modes, in addition to testing with the initial CU structure, the intra prediction component 324 may also choose to divide the larger CUs in the initial CU structure into smaller CUs (within the limits of the recursive quadtree structure), and calculate coding costs at lower levels in the coding hierarchy. As will be explained below in reference to the quantizer component 306, any changes made to the CU structure do not affect how the QP values computed by the rate control component 344 are applied. If the intra prediction component 324 changes the initial CU structure, the modified CU structure is communicated to other components in the LCU processing component 342 that need the information. Although not specifically shown, the spatial prediction mode of each intra predicted CU provided to the selector switch 326 is also provided to the transform component 304. Further, the coding costs of the intra predicted CUs are also provided to the mode selector component.
The selector switch 326 selects between the motion-compensated inter predicted CUs from the motion compensation component 322 and the intra predicted CUs from the intra prediction component 324 based on the difference metrics of the CUs and the picture prediction mode provided by the mode selector component. The output of the selector switch 326, i.e., the predicted CU, is provided to a negative input of the combiner 302 and to a delay component 330. The output of the delay component 330 is provided to another combiner (i.e., an adder) 338. The combiner 302 subtracts the predicted CU from the current CU to provide a residual CU to the transform component 304. The resulting residual CU is a set of pixel difference values that quantify differences between pixel values of the original CU and the predicted CU.
The transform component 304 performs unit transforms on the residual CUs to convert the residual pixel values to transform coefficients and provides the transform coefficients to a quantize component 306. The quantize component 306 determines a QP for the transform coefficients of a residual CU and quantizes the transform coefficients based on that QP. For example, the quantize component 306 may divide the values of the transform coefficients by a quantization scale (Qs) derived from the QP value. In some embodiments, the quantize component 306 represents the coefficients by using a desired number of quantization steps, the number of steps used (or correspondingly the value of Qs) determining the number of bits used to represent the residuals. Other algorithms for quantization such as rate-distortion optimized quantization may also be used by the quantize component 306.
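By way of illustration only, quantization by division by a quantization scale, and the corresponding inverse operation, can be sketched as follows. The sketch uses floating point division with rounding to the nearest level, which is an assumption; practical codecs typically use integer arithmetic with scaling and rounding offsets. The names and the coefficient values are hypothetical.

```python
def quantize(coeffs, qs):
    """Divide each transform coefficient by the quantization scale Qs.

    Rounds the magnitude to the nearest integer level and restores the sign.
    """
    return [int(abs(c) / qs + 0.5) * (1 if c >= 0 else -1) for c in coeffs]


def dequantize(levels, qs):
    """Inverse quantization: scale each level back by Qs."""
    return [level * qs for level in levels]


coeffs = [100, -37, 4, 0]
fine = quantize(coeffs, 10)     # smaller Qs: more detail retained
coarse = quantize(coeffs, 20)   # larger Qs: fewer distinct levels
```

Comparing the two results shows the trade-off described above: the larger Qs produces smaller levels, which cost fewer bits, at the price of a larger reconstruction error after inverse quantization.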
The quantize component 306 determines a QP for the residual CU transform coefficients based on the initial CU structure provided by the coding control component 340. That is, if the residual CU corresponds to a CU in the initial CU structure, then the quantize component 306 uses the QP computed for that CU by the rate control component 344. For example, referring to the example of
If the residual CU corresponds to a CU created during the prediction processing, then the quantize component 306 uses the QP of the original CU that was subdivided during the prediction processing to create the CU as the QP for the residual CU. For example, if CU C of
As was previously mentioned, the coding control component 340 may generate a quantization parameter array that is stored in the memory 346. The quantize component 306 may use this matrix to determine a QP for the residual CU coefficients. That is, the coordinates of the upper left corner of the CU corresponding to the residual CU, whether that CU is in the original coding structure or was added during the prediction process, may be used to locate the appropriate QP in the quantization parameter array. In general, the x coordinate may be divided by the width of the SCU and the y coordinate may be divided by the height of the SCU to compute the coordinates of the appropriate QP in the quantization parameter array.
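By way of illustration only, the lookup of a QP from the coordinates of the upper left corner of a CU can be sketched as follows. The sketch assumes a square SCU and an array indexed as row (from the y coordinate) and column (from the x coordinate). The array contents and names are hypothetical.

```python
def lookup_qp(qp_array, x, y, scu_size):
    """Locate the QP for a CU whose upper left corner is at (x, y) in the LCU."""
    return qp_array[y // scu_size][x // scu_size]


# Hypothetical 4x4 array for a 32x32 LCU with an 8x8 SCU. Three 16x16 CUs
# use QPs 20, 24, and 28; the lower right 16x16 region is split into four
# 8x8 CUs with QPs 30 through 33.
qp_array = [
    [20, 20, 24, 24],
    [20, 20, 24, 24],
    [28, 28, 30, 31],
    [28, 28, 32, 33],
]

qp_original_cu = lookup_qp(qp_array, 16, 0, 8)   # 16x16 CU in the initial structure
qp_sub_cu = lookup_qp(qp_array, 8, 24, 8)        # 8x8 CU created during prediction
```

In this sketch the 8×8 CU at (8, 24) did not exist in the initial CU structure, yet the lookup returns the QP of the 16×16 CU that contains it, so further partitioning during prediction does not change which QP is applied.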
For example, consider the CU structure 500 and the quantization parameter array 502 of
Because the DCT transform redistributes the energy of the residual signal into the frequency domain, the quantized transform coefficients are taken out of their scan ordering by a scan component 308 and arranged by significance, such as, for example, beginning with the more significant coefficients followed by the less significant. The ordered quantized transform coefficients for a CU provided via the scan component 308 along with header information for the CU are coded by the entropy encoder 334, which provides a compressed bit stream to a video buffer 336 for transmission or storage. The entropy coding performed by the entropy encoder 334 may use any suitable entropy encoding technique, such as, for example, context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), run length coding, etc.
The entropy encoder 334 encodes information regarding the CU structure used to generate the coded CUs in the compressed bit stream and information indicating the QPs used in the quantization of the coded CUs. In some embodiments, the CU structure of an LCU is signaled to a decoder by encoding the sizes of the LCU and the SCU and a series of split flags in the compressed bit stream. If a CU in the recursive quadtree structure defined by the LCU and the SCU is split, i.e., partitioned, in the CU structure, a split flag with a value indicating a split, e.g., 1, is signaled in the compressed bit stream. If a CU is not split and the size of the CU is larger than that of the SCU, a split flag with a value indicating no split, e.g., 0, is signaled in the compressed bit stream. Information specific to the unsplit CU will follow the split flag in the bit stream. The information specific to a CU may include CU header information (prediction mode, motion vector differences, coding block flag information, etc.), QP information, and coefficient information. Coefficient information may not be included if all of the CU coefficients are zero. Further, if the size of a CU is the same size as the SCU, no split flag is encoded in the bit stream for that CU.
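By way of illustration only, the split flag signaling of a CU structure can be sketched as follows. The sketch collects only the split flags; in an actual bit stream the information specific to each unsplit CU would be interleaved with the flags. The tree representation (None for an unsplit CU, a list of four children in Z order for a split CU) and the function names are hypothetical.

```python
def encode_split_flags(node, size, scu_size):
    """Return the split flags for a CU subtree in depth-first order."""
    if size == scu_size:
        return []        # an SCU-sized CU cannot be split, so no flag is sent
    if node is None:
        return [0]       # unsplit CU larger than the SCU
    flags = [1]          # split CU, followed by the flags of its four children
    for child in node:
        flags += encode_split_flags(child, size // 2, scu_size)
    return flags


def decode_split_flags(flags, lcu_size, scu_size):
    """Rebuild the leaf CUs as (x, y, size) tuples from the split flags."""
    leaves = []
    pos = 0

    def walk(x, y, size):
        nonlocal pos
        split = 0
        if size > scu_size:          # a flag is present only above SCU size
            split = flags[pos]
            pos += 1
        if split:
            half = size // 2
            for dy in (0, half):
                for dx in (0, half):
                    walk(x + dx, y + dy, half)
        else:
            leaves.append((x, y, size))

    walk(0, 0, lcu_size)
    return leaves


# Hypothetical 32x32 LCU with an 8x8 SCU: the LCU is split, and only its
# second 16x16 CU is split again into four 8x8 CUs.
structure = [None, [None, None, None, None], None, None]
flags = encode_split_flags(structure, 32, 8)
leaves = decode_split_flags(flags, 32, 8)
```

Only five flags are needed for this structure because the four 8×8 CUs are already at the SCU size and carry no flag.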
The entropy encoder 334 includes coded QP information for each coded CU in the compressed bit stream. In some embodiments, the entropy encoder 334 includes this QP information in the form of a delta QP value, i.e., the difference between a QP value and a predicted QP value. In some embodiments, the entropy encoder 334 computes a delta QP for a CU as dQP=QPcurr−QPprev where QPcurr is the QP value for the CU and QPprev is the QP value for the CU immediately preceding the CU in the scanning order, e.g., in depth-first Z scan order. In this case, QPprev is the predicted QP. For example, referring to
In some embodiments, the entropy encoder 334 computes a value for delta QP as a function of the QP values of one or more spatially neighboring QPs. That is, delta QP=QPcurr−f(QPs of spatially neighboring CUs). In this case, f( ) provides the predicted QP value. Computing delta QP in this way may be desirable when rate control is based on perceptual criteria. Examples of the function f( ) include f( )=QP of a left neighboring CU and f( )=the average of the QP value for a left neighboring CU and the QP value of a top neighboring CU. More sophisticated functions of QPs of spatially neighboring CUs may also be used, including using the QP values of more than one or two neighboring CUs.
Within an LCU, spatially neighboring CUs of a CU may be defined as those CUs adjacent to the CU in the CU structure of the LCU. For example, in
In some embodiments, more than one mode for computing a predicted QP value for purposes of computing delta QP may be provided. For example, the entropy encoder 334 may provide two different modes for computing delta QP: dQP=QPcurr−QPprev and dQP=QPcurr−f(QPs of spatially neighboring CUs). That is, the entropy encoder 334 may compute a delta QP as per the following pseudo code:
where qp_predictor_mode is selected elsewhere in the video encoder. More than two modes for computing a delta QP value may be provided in a similar fashion. Further, the mode used to compute delta QPs, i.e., qp_predictor_mode, may be signaled in the compressed bit stream at the appropriate level, e.g., sequence, picture, slice, and/or LCU level.
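By way of illustration only, the selection between the two modes for computing a delta QP can be sketched as follows. The numeric values assigned to the modes, the rounding of the average, and the fallback to QPprev when no spatial neighbor is available are assumptions made for this sketch, and the function names are hypothetical.

```python
# Hypothetical mode values; the actual encoding of qp_predictor_mode is not
# specified here.
MODE_PREV = 0      # predicted QP = QP of the preceding CU in coding order
MODE_SPATIAL = 1   # predicted QP = f(QPs of spatially neighboring CUs)


def predict_qp(qp_predictor_mode, qp_prev, qp_left=None, qp_top=None):
    """Return the predicted QP for the current CU."""
    if qp_predictor_mode == MODE_PREV:
        return qp_prev
    neighbors = [q for q in (qp_left, qp_top) if q is not None]
    if not neighbors:
        return qp_prev                      # assumed fallback: no neighbor available
    if len(neighbors) == 1:
        return neighbors[0]                 # f() = QP of the single available neighbor
    return (qp_left + qp_top + 1) // 2      # f() = rounded average (assumed rounding)


def delta_qp(qp_curr, qp_predictor_mode, qp_prev, qp_left=None, qp_top=None):
    """dQP = QPcurr - predicted QP, for either mode."""
    return qp_curr - predict_qp(qp_predictor_mode, qp_prev, qp_left, qp_top)
```

Additional modes could be supported in the same way by adding further branches to the predictor selection.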
In some embodiments, the entropy encoder 334 encodes a delta QP value for each CU in the compressed bit stream. For example, referring to
Referring again to
The combiner 338 adds the delayed selected CU to the reconstructed residual CU to generate an unfiltered reconstructed CU, which becomes part of reconstructed picture information. The reconstructed picture information is provided via a buffer 328 to the intra prediction component 324 and to a filter component 316. The filter component 316 is an in-loop filter which filters the reconstructed frame information and provides filtered reconstructed CUs, i.e., reference data, to the storage component 318.
In some embodiments, the above described techniques for computing delta QPs may be used in other components of the video encoder. For example, if the quantize component uses rate distortion optimized quantization which minimizes total rate and distortion for a CU (Total rate=Rate of (dQP)+Rate for (CU)), one or both of these techniques may be used by these components to compute the needed delta QP values. In some embodiments, the QPs originally generated by the rate control component 344 may be adjusted up or down by one or more other components in the video encoder prior to quantization.
In the video decoder of
If the video encoder computed a delta QP as QPcurr−f(QPs of spatially neighboring CUs), the entropy decoding component 900 computes a reconstructed QP as the delta QP+f(rQPs of spatially neighboring CUs), where rQP is a reconstructed QP. Further, if the video encoder supports multiple modes for computing a delta QP, the video decoder will compute a reconstructed QP from the delta QP according to the mode signaled in the bit stream.
To compute a reconstructed QP as delta QP+f(rQPs of spatially neighboring CUs), the entropy decoding component 900 may store the reconstructed QPs of the appropriate spatially neighboring CUs. For example, the reconstructed QPs of the neighboring CUs may be stored in a reconstructed quantization parameter array in a manner similar to that of the previously described quantization parameter array.
Example reconstructed QP calculations are described below assuming f( ) is equal to the rQP of the left neighboring CU and in reference to the example LCU structures 1000 and 1002 in
rQP(A1)=dQP(A1)+rQP(B22 of LCU 0 1000)
rQP(A21)=dQP(A21)+rQP(A1)
rQP(A22)=dQP(A22)+rQP(A21)
rQP(A23)=dQP(A23)+rQP(A1)
rQP(A24)=dQP(A24)+rQP(A23)
rQP(A3)=dQP(A3)+rQP(B42 of LCU 0 1000)
rQP(A4)=dQP(A4)+rQP(A3)
In this example, the left column of the reconstructed quantization parameter array 1004 (B22, B24, B42, B44, D22, D24, D42, D44) is all that is required for applying predictor f( ) to LCU 1 1002. If the left neighboring CU is not available as can be the case for the first LCU in a picture, a predefined QP value may be used or the reconstructed QP value in CU coding order may be used.
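By way of illustration only, the reconstruction of QPs with f( ) equal to the rQP of the left neighboring CU can be sketched as follows. The CU names and the left neighbor relationships mirror the calculations listed above; the delta QP values, the reconstructed QPs of B22 and B42, and the predefined default QP are hypothetical, as are the function and parameter names.

```python
def reconstruct_qps(delta_qps, left_of, known_rqps, default_qp=26):
    """Compute rQP = dQP + rQP(left neighbor) for CUs in coding order.

    delta_qps:  list of (cu_name, dqp) in coding order
    left_of:    maps a CU name to the name of its left neighboring CU
    known_rqps: reconstructed QPs already available, e.g., from the right
                column of the previous LCU
    default_qp: predefined QP used when no left neighbor is available
    """
    rqp = dict(known_rqps)
    for cu, dqp in delta_qps:
        left = left_of.get(cu)
        predictor = rqp[left] if left in rqp else default_qp
        rqp[cu] = dqp + predictor
    return rqp


left_of = {"A1": "B22", "A21": "A1", "A22": "A21", "A23": "A1",
           "A24": "A23", "A3": "B42", "A4": "A3"}
known = {"B22": 30, "B42": 32}               # hypothetical rQPs from LCU 0
dqps = [("A1", -2), ("A21", 1), ("A22", 0), ("A23", 3),
        ("A24", -1), ("A3", 2), ("A4", -4)]  # hypothetical decoded delta QPs
rqp = reconstruct_qps(dqps, left_of, known)
```

Only the reconstructed QPs of the neighboring column of the previous LCU need to be retained for this predictor, which is why the sketch takes them as a small dictionary rather than a full array.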
Referring again to
A residual CU supplies one input of the addition component 906. The other input of the addition component 906 comes from the mode switch 908. When inter-prediction mode is signaled in the encoded video stream, the mode switch 908 selects a prediction block from the motion compensation component 910 and when intra-prediction is signaled, the mode switch selects a prediction block from the intra prediction component 914. The motion compensation component 910 receives reference data from storage 912 and applies the motion compensation computed by the encoder and transmitted in the encoded video bit stream to the reference data to generate a predicted CU. The intra-prediction component 914 receives previously decoded predicted CUs from the current picture and applies the intra-prediction computed by the encoder as signaled by a spatial prediction mode transmitted in the encoded video bit stream to the previously decoded predicted CUs to generate a predicted CU.
The addition component 906 generates a decoded CU by adding the selected predicted CU and the residual CU. The output of the addition component 906 supplies the input of the in-loop filter component 916. The in-loop filter component 916 smooths artifacts created by the block nature of the encoding process to improve the visual quality of the decoded frame. The output of the in-loop filter component 916 is the decoded frames of the video bit stream. Each decoded CU is stored in storage 912 to be used as reference data.
In some embodiments, unit transforms smaller than a CU may be used. In such embodiments, the video encoder may further partition a CU into transform units. For example, a CU may be partitioned into smaller transform units in accordance with a recursive quadtree structure adapted to the CU size. The transform unit structure of the CU may be signaled to the decoder in a similar fashion as the LCU CU structure using transform split flags. Further, in some such embodiments, delta QP values may be computed and signaled at the transform unit level. In some embodiments, a flag indicating whether or not multiple quantization parameters are provided for an LCU may be signaled at the appropriate level, e.g., sequence, picture, and/or slice.
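As a non-limiting sketch, a recursive quadtree signaled with split flags may be parsed as shown below. The depth-first flag order, the visiting order of the four quadrants, the minimum size at which no flag is read, and the name `parse_quadtree` are all assumptions of this sketch rather than details taken from the description.

```python
def parse_quadtree(flags, x, y, size, min_size):
    """Return the leaf blocks (x, y, size) of a quadtree.

    flags    -- iterator of 0/1 split flags, consumed depth-first
    min_size -- smallest allowed block; no flag is read at this size
    """
    if size > min_size and next(flags) == 1:
        half = size // 2
        leaves = []
        # Visit the quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                leaves += parse_quadtree(flags, x + dx, y + dy, half, min_size)
        return leaves
    return [(x, y, size)]


# Hypothetical 32x32 unit: the root splits, and only its top-right
# 16x16 quadrant splits again, down to the 8x8 minimum.
split_flags = iter([1, 0, 1, 0, 0])
leaves = parse_quadtree(split_flags, 0, 0, 32, 8)
```

Under these assumptions, a delta QP signaled at the transform unit level would be associated with one entry of `leaves`.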
CUs in the CU structure are then coded using the corresponding QPs 1104. For example, a block-based coding process, i.e., prediction, transformation, and quantization, is performed on each CU in the CU structure. The prediction, transformation, and quantization may be performed on each CU as previously described herein.
The QPs used in coding the CUs are also coded 1106. For example, to signal the QPs used in coding the CUs, delta QPs may be computed. The delta QP values may be computed as previously described. The coded QPs, the coded CUs, and the CU structure are then entropy coded to generate a portion of the compressed bit stream 1108. The coded QPs, coded CUs, and the CU structure may be signaled in the compressed bit stream as previously described herein.
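One possible way to compute delta QPs on the encoder side is sketched below for illustration. The sketch assumes that the predictor for each CU is the QP of the previous CU in coding order, with a hypothetical starting QP of 26 for the first CU; other predictors are equally possible, and the function names are not taken from the description.

```python
def compute_delta_qps(qps, start_qp):
    """Encoder side: turn per-CU QPs (in coding order) into delta QPs."""
    deltas = []
    pred = start_qp
    for qp in qps:
        deltas.append(qp - pred)  # dQP = QP - predictor
        pred = qp                 # the next CU is predicted from this one
    return deltas


def rebuild_qps(deltas, start_qp):
    """Decoder side: invert compute_delta_qps."""
    qps = []
    pred = start_qp
    for dqp in deltas:
        pred = pred + dqp
        qps.append(pred)
    return qps


# Hypothetical QPs chosen for five CUs of one LCU.
cu_qps = [30, 32, 31, 29, 29]
dqps = compute_delta_qps(cu_qps, start_qp=26)
roundtrip = rebuild_qps(dqps, start_qp=26)
```

Because neighboring CUs tend to use similar QPs, the delta values in such a scheme are typically small, which is what makes them cheaper to entropy code than the QPs themselves.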
The techniques described in this disclosure may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the software may be executed in one or more processors, such as a microprocessor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), or digital signal processor (DSP). The software that executes the techniques may be initially stored in a computer-readable medium such as a compact disc (CD), a diskette, a tape, a file, memory, or any other computer-readable storage device, and loaded and executed in the processor. In some cases, the software may also be sold in a computer program product, which includes the computer-readable medium and packaging materials for the computer-readable medium. In some cases, the software instructions may be distributed via removable computer-readable media (e.g., floppy disk, optical disk, flash memory, USB key), via a transmission path from computer-readable media on another digital system, etc.
Embodiments of the methods and encoders as described herein may be implemented for virtually any type of digital system (e.g., a desktop computer, a laptop computer, a handheld device such as a mobile (i.e., cellular) phone, a personal digital assistant, a digital camera, etc.).
As shown in
The display 1320 may also display pictures and video sequences received from a local camera 1328, or from other sources such as the USB 1326 or the memory 1312. The SPU 1302 may also send a video sequence to the display 1320 that is received from various sources such as the cellular network via the RF transceiver 1306 or the camera 1328. The SPU 1302 may also send a video sequence to an external video display unit via the encoder unit 1322 over a composite output terminal 1324. The encoder unit 1322 may provide encoding according to PAL/SECAM/NTSC video standards.
The SPU 1302 includes functionality to perform the computational operations required for video encoding and decoding. In one or more embodiments, the SPU 1302 is configured to perform computational operations for applying one or more techniques for generating and encoding multiple quantization parameters for an LCU during the encoding process as described herein. Software instructions implementing the techniques may be stored in the memory 1312 and executed by the SPU 1302, for example, as part of encoding video sequences captured by the local camera 1328. In some embodiments, the SPU 1302 is configured to perform computational operations for applying one or more techniques for decoding multiple quantization parameters for an LCU as described herein as part of decoding a received coded video sequence or decoding a coded video sequence stored in the memory 1312. Software instructions implementing the techniques may be stored in the memory 1312 and executed by the SPU 1302.
The steps in the flow diagrams herein are described in a specific sequence merely for illustration. Alternative embodiments using a different sequence of steps may also be implemented without departing from the scope and spirit of the present disclosure, as will be apparent to one skilled in the relevant arts by reading the disclosure provided herein.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims. It is therefore contemplated that the appended claims will cover any such modifications of the embodiments as fall within the true scope and spirit of the invention.
Claims
1. A method of video processing, comprising:
- dividing a picture into a plurality of non-over-lapping blocks;
- for a first of the plurality of non-over-lapping blocks of a first size, determining a minimum coding unit size for which a first quantization parameter will be determined, wherein the minimum coding unit size is less than the first size;
- transforming the plurality of non-over-lapping blocks into a plurality of transformed coefficients in a frequency domain using a transform function;
- quantizing the plurality of transformed coefficients using a plurality of quantization parameters at least one of which is the first quantization parameter; and
- encoding the plurality of quantized transformed coefficients into a compressed bit stream and signaling at a picture level in the compressed bit stream the minimum coding unit size for which the first quantization parameter is determined for the first non-over-lapping block.
2. The method of claim 1 further comprising transmitting the compressed bit stream on a communications channel.
3. The method of claim 1 further comprising:
- performing an inverse quantization operation on the plurality of quantized transformed coefficients using the plurality of quantization parameters to form a plurality of reconstructed transformed coefficients; and
- performing an inverse transform operation on the plurality of reconstructed transformed coefficients to form a plurality of reconstructed non-over-lapping blocks.
4. The method of claim 1 further comprising:
- for a second of the plurality of non-over-lapping blocks of a second size, determining a second minimum coding unit size for which a second quantization parameter will be determined, wherein the second minimum coding unit size is less than the second size;
- quantizing the plurality of transformed coefficients using the plurality of quantization parameters including the second quantization parameter; and
- signaling at the picture level in the encoded bit stream the minimum coding unit size for which the second quantization parameter is determined for the second non-over-lapping block.
5. A video system, comprising:
- a coding component configured to divide a picture into a plurality of non-over-lapping blocks;
- a rate control component coupled to the coding component and configured to determine a minimum coding unit size for which a first quantization parameter will be determined for a first of the plurality of non-over-lapping blocks of a first size, wherein the minimum coding unit size is less than the first size;
- a transform component coupled to the rate control component and configured to transform the plurality of non-over-lapping blocks into a plurality of transformed coefficients in a frequency domain using a transform function;
- a quantize component coupled to the transform component and configured to quantize the plurality of transformed coefficients using a plurality of quantization parameters at least one of which is the first quantization parameter; and
- an encoder coupled to the quantize component and configured to encode the plurality of quantized transformed coefficients into a bit stream and signal at a picture level in the encoded bit stream the minimum coding unit size for which the first quantization parameter is determined for the first non-over-lapping block.
6. The video system of claim 5 further comprising:
- a dequantize component coupled to the quantize component and configured to perform an inverse quantization operation on the plurality of quantized transformed coefficients using the plurality of quantization parameters to form a plurality of reconstructed transformed coefficients; and
- an inverse transform component coupled to the dequantize component and configured to perform an inverse transform operation on the plurality of reconstructed transformed coefficients to form a plurality of reconstructed non-over-lapping blocks.
7. The video system of claim 5 where the video system is implemented on a signal processing unit comprising a software program implemented in one or more processors wherein the software program is stored in a memory and loaded and executed in the one or more processors.
8. The video system of claim 7 wherein the signal processing unit further comprises an embedded memory and at least one security feature.
9. The video system of claim 5 further comprising a video capture component coupled to the coding component and configured to provide a video sequence comprising a plurality of pictures to the coding component.
10. The video system of claim 9 wherein the video capture component is a charge coupled device (CCD) camera or a complementary metal oxide semiconductor (CMOS) camera.
11. The video system of claim 9 further comprising a transmitter coupled to the encoder for transmitting the bit stream on a communications channel.
12. A method of video processing, comprising:
- receiving an encoded bit stream for a picture;
- decoding the encoded bit stream and determining from a parameter signaled in the bit stream at a picture level a first minimum coding unit size for which a first quantization parameter is determined for a first non-over-lapping block of a plurality of non-overlapping blocks;
- performing an inverse quantization operation on a first plurality of quantized transformed coefficients using a plurality of quantization parameters including the first quantization parameter to form a plurality of reconstructed transformed coefficients; and
- performing an inverse transform operation on the plurality of reconstructed transformed coefficients to form a plurality of reconstructed non-over-lapping blocks with a plurality of pixel values.
13. The method of claim 12 further comprising determining a second minimum coding unit size for which a second quantization parameter is determined for a second non-over-lapping block of the plurality of non-overlapping blocks; and
- performing the inverse quantization operation on the plurality of quantized transformed coefficients using the plurality of quantization parameters including the second quantization parameter to form the plurality of reconstructed transformed coefficients.
14. The video method of claim 12 where the video processing is implemented on a signal processing unit comprising a software program implemented in one or more processors wherein the software program is stored in a memory and loaded and executed in the one or more processors.
15. The video method of claim 14 wherein the signal processing unit further comprises an embedded memory and at least one security feature.
16. The video method of claim 12 further comprising displaying the picture on a display.
17. The video method of claim 16 wherein the display is a liquid crystal display (LCD) or a light emitting diode (LED) display.
18. A video decoding system, comprising:
- a decoding component configured to receive an encoded bit stream for a picture;
- the decoding component further configured to decode the encoded bit stream and determine from a parameter signaled in the bit stream at a picture level a minimum coding unit size for which a first quantization parameter is determined for a first non-over-lapping block of a plurality of non-overlapping blocks;
- an inverse quantization component coupled to the decoding component and configured to perform an inverse quantization operation on a plurality of quantized transformed coefficients using a plurality of quantization parameters including the first quantization parameter to form a plurality of reconstructed transformed coefficients; and
- an inverse transform component coupled to the inverse quantization component and configured to perform an inverse transform operation on the plurality of reconstructed transformed coefficients to form a plurality of reconstructed non-over-lapping blocks with a plurality of pixel values.
19. The video decoding system of claim 18 where the video decoding system is implemented on a signal processing unit comprising a software program implemented in one or more processors wherein the software program is stored in a memory and loaded and executed in the one or more processors.
20. The video decoding system of claim 19 wherein the signal processing unit further comprises an embedded memory and a plurality of security features.
21. The video decoding system of claim 18 further comprising a display coupled to the inverse transform component and configured to display the picture.
22. The video decoding system of claim 21 wherein the display is a liquid crystal display (LCD) or a light emitting diode (LED) display.
20050002454 | January 6, 2005 | Ueno |
20070280348 | December 6, 2007 | Tokumitsu et al. |
20080075184 | March 27, 2008 | Muharemovic |
20090213930 | August 27, 2009 | Ye et al. |
20100086029 | April 8, 2010 | Chen et al. |
20100086030 | April 8, 2010 | Chen et al. |
20100086031 | April 8, 2010 | Chen et al. |
20100086032 | April 8, 2010 | Chen et al. |
20110194613 | August 11, 2011 | Chen |
20110255597 | October 20, 2011 | Mihara et al. |
20120114034 | May 10, 2012 | Huang |
WO 2012042890 | April 2012 | WO |
WO 2012062161 | May 2012 | WO |
- Detlev Marpe et al, “Video Compression Using Nested Quadtree Structures, Leaf Merging and Improved Techniques for Motion Representation and Entropy Coding”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, No. 12, Dec. 2010, published Nov. 15, 2010, pp. 1676-1687.
- Ken McCann et al, “Samsung's Response to the Call for Proposals on Video Compression Technology”, JCTVC-A124, Apr. 15-23, 2010, pp. 1-42, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Dresden, Germany.
- Chao Pang et al, “Improved dQP Calculation Method”, JCTVC-E217, Mar. 16-23, 2011, pp. 1-3, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland.
- Kazushi Sato and Jun Xu, “Preliminary Implementation on Sub-LCU-Level DeltaQP”, JCTVC-E220r2, Mar. 16-23, 2011, pp. 1-5, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland.
- Muhammed Coban et al, “CU-Level QP Prediction”, JCTVC-E391, Mar. 16-23, 2011, pp. 1-3, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland.
- Hirofumi Aoki et al, “Prediction-based QP Derivation”, JCTVC-E215, pp. 1-11, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland.
- Chao Pang et al, “Sub-LCU QP Representation”, JCTVC-E436, Mar. 16-23, 2011, pp. 1-5, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland.
- Masaaki Kobayashi and Masato Shima, “Sub-LCU Level Delta QP Signaling”, JCTVC-E198, pp. 1-9, Mar. 16-23, 2011, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland.
- Masato Shima et al, “Support for Sub-LCU-Level QP in HEVC”, JCTVC-E202r1, pp. 1-5, Mar. 16-23, 2011, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland.
- Tzu-Der Chuang et al, “AhG Quantization: Sub-LCU Delta QP”, JCTVC-E051, pp. 1-6, Mar. 16-23, 2011, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland.
- Thomas Wiegand et al, “WD3: Working Draft 3 of High-Efficiency Video Coding”, JCTVC-E603, pp. 1-168, Mar. 16-23, 2011, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland.
- Madhukar Budagavi and Minhua Zhou, “Delta QP Signaling at Sub-LCU Level”, JCTVC-D038, pp. 1-5, Jan. 20-28, 2011, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Daegu, South Korea.
Type: Grant
Filed: May 3, 2016
Date of Patent: Apr 25, 2017
Patent Publication Number: 20160249062
Assignee: TEXAS INSTRUMENTS INCORPORATED (Dallas, TX)
Inventors: Minhua Zhou (San Diego, CA), Mehmet Demircin (Dallas, TX), Madhukar Budagavi (Plano, TX)
Primary Examiner: Gims Philippe
Assistant Examiner: Joseph Becker
Application Number: 15/145,637
International Classification: H04N 7/26 (20060101); H04N 19/124 (20140101); H04N 19/176 (20140101); H04N 19/196 (20140101); H04N 19/463 (20140101); H04N 19/167 (20140101); H04N 19/60 (20140101); H04N 19/172 (20140101); H04N 19/184 (20140101); H04N 19/117 (20140101); H04N 19/13 (20140101); H04N 19/15 (20140101); H04N 19/159 (20140101); H04N 19/43 (20140101); H04N 19/61 (20140101); H04N 5/232 (20060101); H04N 5/369 (20110101);