METHOD AND APPARATUS FOR ENCODING AND DECODING MULTI-VIEW IMAGES BASED ON GLOBAL DISPARITY VECTOR
A method and apparatus for encoding and decoding multi-view images. The multi-view image encoding method selects a block corresponding to a current block from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a global disparity vector representing a global disparity between the current picture and the other picture; and encodes the current block on the basis of block information of a block from among the selected block and blocks adjacent to the selected block. Accordingly, multi-view images can be encoded in consideration of the individual differences between the appearances of objects as well as global disparities between view-points.
This is a Continuation application of U.S. application Ser. No. 11/986,868, filed Jan. 3, 2008, in the U.S. Patent and Trademark Office, which claims priority from Korean Patent Application No. 10-2007-0033781, filed on Apr. 5, 2007, in the Korean Intellectual Property Office, Korean Patent Application No. 10-2007-0129086, filed on Dec. 12, 2007, in the Korean Intellectual Property Office and U.S. Provisional Application No. 60/883,190, filed on Jan. 3, 2007, the disclosures of which are incorporated herein in their entirety by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
Methods and apparatuses consistent with the present invention relate to encoding and decoding multi-view images, and more particularly, to a method and apparatus for encoding and decoding multi-view images with a high efficiency using a global disparity vector.
2. Description of the Related Art
Multi-view image encoding is performed by receiving image signals from a plurality of cameras that provide multi-view images and encoding the image signals. The multi-view images are compression-encoded using temporal correlation and spatial correlation between cameras (inter-view).
Temporal prediction is performed using temporal correlation between pictures at the same view-point, which have been taken at different times, and spatial prediction is performed using spatial correlation between pictures having different view-points, which have been taken at the same time.
A method of using such spatial correlation includes a prediction encoding method using a global disparity vector. The prediction encoding method using the global disparity vector will be described in detail with reference to
A global disparity between the image 110 of
In global disparity compensation, a current picture is prediction-encoded with reference to a reference picture, wherein the reference picture is moved in a one-dimensional or two-dimensional direction by a global disparity between the reference picture and the current picture. However, the global disparity compensation does not consider the individual difference of each object which is included in the reference picture and the current picture.
An image photographed by a first camera 210 is illustrated in
Although a global disparity between images having different view-points is considered when motion compensation is performed, objects included in the images having the different view-points can have different disparities other than a global disparity. Accordingly, a multi-view image encoding method considering such a problem is needed.
SUMMARY OF THE INVENTION
The present invention provides a method and apparatus for encoding and decoding multi-view images, considering the individual differences in the appearance of objects included in the multi-view images, and a computer-readable recording medium having embodied thereon a program for executing the multi-view image encoding/decoding method.
According to an aspect of the present invention, there is provided a multi-view image encoding method comprising selecting a block corresponding to a current block from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a global disparity vector representing a global disparity between the current picture and the other picture; and encoding the current block on the basis of block information of a block from among the selected block and blocks adjacent to the selected block.
The block information may comprise at least one of a motion vector, a reference index, and a prediction mode used to encode a block from among the selected block and the blocks adjacent to the selected block.
The encoding of the current block may comprise encoding flag information indicating that the current block is encoded on the basis of block information of a block from among the selected block and the blocks adjacent to the selected block.
The encoding of the current block may further comprise encoding index information indicating which block from among the selected block and the blocks adjacent to the selected block, corresponds to the block information used to encode the current block.
The encoding of the current block may further comprise entropy-encoding the flag information and the index information.
According to another aspect of the present invention, there is provided a multi-view image encoding apparatus comprising a selection unit which selects a block corresponding to a current block from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a global disparity vector representing a global disparity between the current picture and the other picture; and an encoding unit which encodes the current block on the basis of block information of a block from among the selected block and blocks adjacent to the selected block.
According to another aspect of the present invention, there is provided a multi-view image decoding method comprising receiving a bit stream including data about a current block which has been encoded on the basis of block information of a block from among a selected block and blocks adjacent to the selected block, wherein the selected block corresponding to the current block is selected from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a global disparity vector representing a global disparity between the current picture and the other picture; extracting the data about the current block from the bit stream; and restoring the current block on the basis of the data about the current block.
The data about the current block may comprise flag information indicating that the current block is encoded on the basis of the block information of a block from among the selected block and the blocks adjacent to the selected block.
The data about the current block may further comprise index information indicating which block from among the selected block and the blocks adjacent to the selected block, corresponds to the block information used to encode the current block.
According to another aspect of the present invention, there is provided a multi-view image decoding apparatus comprising a decoding unit which receives a bit stream including data about a current block which has been encoded on the basis of block information of a block from among a selected block and blocks adjacent to the selected block, wherein the selected block corresponding to the current block is selected from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a global disparity vector representing a global disparity between the current picture and the other picture, and which extracts the data about the current block from the bit stream; and a restoring unit which restores the current block on the basis of the data about the current block.
According to another aspect of the present invention, there is provided a multi-view image encoding method comprising selecting a block corresponding to a current block from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a global disparity vector representing a global disparity between the current picture and the other picture; encoding the current block according to an encoding mode in which a block is encoded on the basis of block information of the selected block; and encoding flag information indicating that at least one of blocks contained in a current slice to which the current block belongs has been encoded according to the encoding mode.
According to another aspect of the present invention, there is provided a multi-view image encoding apparatus comprising a selection unit which selects a block corresponding to a current block from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a global disparity vector representing a global disparity between the current picture and the other picture; and an encoding unit which encodes the current block according to an encoding mode in which a block is encoded on the basis of block information of the selected block, wherein the encoding unit further encodes flag information indicating that at least one of blocks contained in a current slice to which the current block belongs has been encoded according to the encoding mode.
According to another aspect of the present invention, there is provided a multi-view image decoding method comprising receiving a bit stream including data about a current block which has been encoded on the basis of block information of a block corresponding to the current block, wherein the corresponding block is selected from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a global disparity vector representing a global disparity between the current picture and the other picture; extracting data regarding a residual block of the current block and information regarding an encoding mode from the received bitstream; and restoring the current block on the basis of the data about the residual block and the information regarding the encoding mode, wherein the information regarding the encoding mode comprises flag information indicating that the at least one of blocks included in a current slice to which the current block belongs has been encoded according to the encoding mode.
According to another aspect of the present invention, there is provided a multi-view image decoding apparatus comprising a decoding unit which receives a bit stream including data about a current block which has been encoded on the basis of block information of a block corresponding to the current block, and then extracts data about a residual block of the current block and information regarding an encoding mode from the bit stream, where the corresponding block is selected from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a global disparity vector representing a global disparity between the current picture and the other picture; and a restoring unit which restores the current block on the basis of the data about the residual block and the information regarding the encoding mode, wherein the information regarding the encoding mode comprises flag information indicating that at least one of blocks included in a current slice to which the current block belongs has been encoded according to the encoding mode.
According to another aspect of the present invention, there is provided a computer-readable recording medium having embodied thereon a program for executing the above methods.
The above and other features and advantages of the present invention will become apparent and more readily appreciated from the following description of the exemplary embodiments, taken in conjunction with the accompanying drawings of which:
Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The multi-view image encoding apparatus 300 includes a selection unit 310, an encoding unit 320, and a frame memory 330.
The selection unit 310 selects a block corresponding to a current block from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, on the basis of a vector representing a global disparity between the current picture and the other picture. In detail, the selection unit 310 selects a macro block corresponding to a current macro block from another picture having a neighboring view-point which is different from and adjacent to a view-point of a current picture to which the current block belongs, according to a global disparity vector representing a global disparity between the current picture and the other picture which has been previously encoded.
The other picture having the neighboring view-point has been previously encoded in units of predetermined blocks. Block information of the other picture is generated in units of encoded blocks. In order to encode a current block using the block information of another block, a block having block information has to be selected as a block corresponding to the current block. Accordingly, if a global disparity vector is not a multiple of the size of a current block in an x-axis or y-axis direction, a block corresponding to the current block is selected according to the result obtained by rounding the global disparity vector.
For example, it is assumed that a current block is a 16×16 macro block, and another picture having a view-point which is different from a view-point of a current picture to which the current block belongs is encoded by generating block information in units of 16×16 macro blocks. If a global disparity vector is calculated to be (x, y)=(10, 12), a block corresponding to the current block becomes a block at a location moved by (x, y)=(10, 12) from the location of the current block in the other picture having the different view-point. However, “10” and “12” are not multiples of the size of the current block, and the block at the location moved by (x, y)=(10, 12) is located over a plurality of macro blocks and has no independent block information. Accordingly, by rounding “10” and “12” to “16”, a macro block at a location moved by (x, y)=(16, 16) from the location of the current macro block is selected from the other picture having the different view-point (that is, a neighboring view-point).
The method of selecting the block corresponding to the current block by rounding the X and Y components of the global disparity vector is provided as an example, and the present invention is not limited thereto.
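The rounding step in the example above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function names are hypothetical, and the tie-breaking rule (halves round away from zero) is an assumption, since the description only gives the (10, 12) to (16, 16) case.

```python
def round_to_block_grid(gdv, block_size=16):
    """Round each component of a global disparity vector to the nearest
    multiple of block_size (assumption: halves round away from zero)."""
    def round_component(value):
        sign = -1 if value < 0 else 1
        return sign * ((abs(value) + block_size // 2) // block_size) * block_size
    return tuple(round_component(v) for v in gdv)


def corresponding_block_origin(current_origin, gdv, block_size=16):
    """Top-left pixel position of the block selected in the other picture."""
    dx, dy = round_to_block_grid(gdv, block_size)
    return (current_origin[0] + dx, current_origin[1] + dy)


# The example from the description: (10, 12) is rounded to (16, 16).
print(round_to_block_grid((10, 12)))  # (16, 16)
```

Because the rounded vector is always a multiple of the block size, the selected position coincides with a macro block that has its own block information.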
The selection unit 310 can also use a variety of methods for searching for a neighboring view-point to select a block corresponding to a current block, and can determine a neighboring view-point using different methods for each sequence of multi-view images. This operation will be described in detail with reference to
Referring to
However, the method described above with reference to
Returning to
The type of the block information that is to be used for encoding the current block is not limited. For example, the block information can include at least one of a prediction mode used to encode the selected block, a motion vector used for inter-view prediction or temporal prediction, and a reference index.
Since the current block is encoded using the block information of another block, block information for the current block need not be generated. Details thereof will be described later with reference to
If it is determined that encoding the current block according to the block information of the corresponding block has a lower rate-distortion (R-D) cost than encoding the current block using intra prediction, inter-view prediction, or temporal prediction, which are general prediction methods for multi-view images, the current block is encoded using the block information of the corresponding block.
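The mode decision described above can be sketched as a comparison of R-D costs. The description does not say how the cost is computed; the Lagrangian form J = D + lambda * R used here is a common convention and is an assumption, as are the function names, the mode labels, and the numbers in the example.

```python
def rd_cost(distortion, rate_bits, lagrange_multiplier):
    """Assumed Lagrangian rate-distortion cost: J = D + lambda * R."""
    return distortion + lagrange_multiplier * rate_bits


def choose_mode(candidates, lagrange_multiplier):
    """Pick the mode with the lowest R-D cost.

    candidates maps a mode label to a (distortion, rate_bits) pair.
    """
    return min(
        candidates,
        key=lambda mode: rd_cost(*candidates[mode], lagrange_multiplier),
    )


# Illustrative numbers only: reusing block information costs few bits
# because no motion vector, reference index, or prediction mode is sent.
modes = {
    "intra": (1200, 90),
    "temporal": (800, 70),
    "reuse_block_info": (850, 12),
}
print(choose_mode(modes, lagrange_multiplier=10))  # reuse_block_info
```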
Referring to
Accordingly, the encoding unit 320 can encode the current block 510 by using not only block information of the block 530 selected according to the global disparity vector but also block information of the blocks 540 that are adjacent to the selected block 530.
The number of the adjacent blocks 540 whose block information is used for encoding can be changed. As illustrated in
Indexes are assigned to the block 530 and the adjacent blocks 540, in order to indicate a block whose block information is used to encode the current block 510. This will be described later in more detail with reference to
Returning to
Referring to
The flag information encoder 610 encodes flag information indicating an encoding mode of a current block that is to be encoded on the basis of block information of at least one from among a block selected by the selection unit 310 (see
Since a current block is encoded using block information of another block selected according to a global disparity vector, block information of the current block need not be inserted into a bit stream. The reason for this is that the flag information substitutes for block information of the current block. The flag information may be contained in a macro block header or a slice header. A method of inserting flag information into a macro block header will be described later in detail with reference to
In addition to the flag information contained in the macro block header, other flag information may be inserted into a slice header so that whether an encoding mode illustrated in
For example, “MB_info_skip_enable” syntax may be added to a slice header. If “MB_info_skip_enable=1”, it means that at least one from among blocks contained in a current slice has been encoded according to an encoding mode in which one block is encoded using block information of another block. If “MB_info_skip_enable=0”, it means that none of the blocks contained in a current slice have been encoded according to the encoding mode. In this case, ‘mbinfo_skip_flag’ and ‘ref_mb_pos’ syntaxes, which will be described later with reference to
The index information encoder 620 encodes index information indicating which block, from among the block selected according to the global disparity vector and the blocks adjacent to the selected block, corresponds to the block information used to encode the current block. The index information is needed when the current block is encoded using not only the block information of the block selected by the selection unit 310 but also the block information of the blocks adjacent to the selected block.
The encoding unit 320 can encode the current block, using not only block information of the block selected by the global disparity vector but also the blocks adjacent to the selected block. Thus, in this case, the encoding unit 320 also encodes index information designating a block that is to be used to encode the current block, from among the blocks.
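One way to enumerate the candidate blocks that the index information can designate is sketched below. Index 0 for the selected block and indexes 1 through 8 for the adjacent blocks follow the description; the raster ordering of the eight neighbours and the skipping of candidates outside the picture are assumptions made for illustration, and all names are hypothetical.

```python
# Offsets in macro block units. Index 0 is the selected block; the order of
# indexes 1..8 (raster scan around the selected block) is an assumption.
CANDIDATE_OFFSETS = [
    (0, 0),
    (-1, -1), (0, -1), (1, -1),
    (-1, 0), (1, 0),
    (-1, 1), (0, 1), (1, 1),
]


def candidate_blocks(selected_mb, mbs_wide, mbs_high):
    """Map each usable index to the (mb_x, mb_y) position it designates.

    Candidates that fall outside the picture are skipped (assumption).
    """
    sel_x, sel_y = selected_mb
    usable = {}
    for index, (off_x, off_y) in enumerate(CANDIDATE_OFFSETS):
        x, y = sel_x + off_x, sel_y + off_y
        if 0 <= x < mbs_wide and 0 <= y < mbs_high:
            usable[index] = (x, y)
    return usable


# A selected block in the picture interior has all nine candidates.
print(len(candidate_blocks((2, 1), mbs_wide=4, mbs_high=3)))  # 9
```

An encoder following this sketch would evaluate the block information of each usable candidate and signal the index of the one it chose.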
The entropy encoder 630 entropy-encodes the flag information generated by the flag information encoder 610 and the index information generated by the index information encoder 620. Specifically, the entropy encoder 630 binarizes the flag information and the index information, performs context-based adaptive arithmetic coding (CABAC) on the binarized flag information and index information, and inserts the result into a bit stream.
In
The flag information “mbinfo_skip_flag” indicates whether a current block has been encoded using block information of a block from among a selected block according to a global disparity vector and blocks adjacent to the selected block.
Since a current block is encoded using block information of another block, block information of the current block need not be generated. Accordingly, flag information indicating that block information of the current block is skipped is generated as the flag information “mbinfo_skip_flag”.
The index information “ref_mb_pos” indicates a block whose block information is used to encode the current block. If the current block is encoded based on not only the block selected by the global disparity vector but also the blocks adjacent to the selected block, the index information “ref_mb_pos” indicates which block from among the above blocks corresponds to the block information used to encode the current block.
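The relationship between the slice-level and macro-block-level syntax elements on the encoder side can be sketched as follows. Only the names "MB_info_skip_enable", "mbinfo_skip_flag", and "ref_mb_pos" come from the description; the function, the dictionary-based slice header, and the choice to always send "ref_mb_pos" when the flag is 1 are assumptions. The returned list stands for the values handed to the entropy encoder; binarization and CABAC are not modeled.

```python
def write_macroblock_mode(slice_header, use_block_info, ref_mb_pos=0):
    """Return the (name, value) syntax elements for one macro block header.

    slice_header is a plain dict standing in for the parsed slice header.
    """
    enabled = slice_header.get("MB_info_skip_enable", 0)
    if not enabled:
        if use_block_info:
            raise ValueError("mode is disabled for this slice")
        return []  # neither mbinfo_skip_flag nor ref_mb_pos is sent
    elements = [("mbinfo_skip_flag", 1 if use_block_info else 0)]
    if use_block_info:
        # Assumption: adjacent blocks are in use, so the index is always sent.
        elements.append(("ref_mb_pos", ref_mb_pos))
    return elements


print(write_macroblock_mode({"MB_info_skip_enable": 1}, True, ref_mb_pos=3))
# [('mbinfo_skip_flag', 1), ('ref_mb_pos', 3)]
```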
The index information “ref_mb_pos”, which indicates which block from among the selected block (index 0) and the adjacent blocks (indexes 1 through 8) illustrated in
Referring to
Returning to
Referring to
In operation 820, the current block is encoded, using block information of a block from among the selected block and blocks adjacent to the selected block. That is, the current block can be encoded based on either only the block information of the selected block or the block information of a block from among the selected block and the adjacent blocks.
The current block is encoded directly using the block information. Here, flag information regarding an encoding mode of the current block, and index information indicating which block corresponds to the block information used to encode the current block, are also encoded and inserted into a bit stream. The flag information and index information are entropy-encoded and inserted into the bit stream. The flag information that is encoded may be flag information inserted into a macro block header, as illustrated in
Referring to
The decoding unit 910 receives a bit stream including data about a current block, and extracts the data about the current block from the bit stream.
The data about the current block includes data about a residual block generated by prediction-encoding the current block, and information regarding an encoding mode of the current block. The residual block is generated by selecting a block corresponding to the current block from another picture having a view-point which is different from the view-point of the current picture, on the basis of a global disparity vector indicating a global disparity between a current picture to which the current block belongs and the other picture, and prediction-encoding the current block on the basis of block information of a block from among the selected block and blocks adjacent to the selected block.
The information regarding an encoding mode of the current block includes flag information indicating that the current block is encoded according to an encoding mode according to the present invention, and index information for designating a block used to encode the current block. The index information is included when the block information of the adjacent blocks is also used in the encoding of the current block.
The decoding unit 910 determines whether a current slice includes a block encoded using an encoding method according to the present invention, based on flag information contained in a slice header. If the flag information in the slice header indicates that the current slice does not include a block encoded using the encoding method, there is no need to extract the flag information and the index information from the block header of the current block.
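The conditional extraction described above can be sketched on the decoder side as follows. The reader callable is hypothetical and stands for an entropy decoder that returns already-decoded syntax element values; treating "ref_mb_pos" as always present when "mbinfo_skip_flag" is 1 is an assumption.

```python
def symbol_reader(values):
    """Hypothetical stand-in for an entropy decoder: yields decoded values."""
    iterator = iter(values)
    return lambda name: next(iterator)


def parse_macroblock_mode(slice_header, read_symbol):
    """Return (mbinfo_skip_flag, ref_mb_pos) for one macro block."""
    if not slice_header.get("MB_info_skip_enable", 0):
        # The slice contains no block coded in this mode, so nothing is read.
        return 0, None
    if not read_symbol("mbinfo_skip_flag"):
        return 0, None
    # Assumption: the index follows the flag whenever the flag is 1.
    return 1, read_symbol("ref_mb_pos")


print(parse_macroblock_mode({"MB_info_skip_enable": 1}, symbol_reader([1, 4])))
# (1, 4)
```

When the slice-level flag is 0, the reader is never called, which mirrors the statement that the flag information and the index information need not be extracted from the block header.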
Then, the decoding unit 910 performs entropy decoding, dequantization, and inverse transformation on the data about the residual block, thereby decoding the data about the residual block. The decoding unit 910 also performs entropy decoding (that is, CABAC decoding) on the flag information and the index information of the current block.
The restoring unit 920 restores the current block on the basis of the data about the current block extracted by the decoding unit 910.
The block information of the current block includes the flag information and the index information. Thus, the restoring unit 920 restores the current block, using block information of a block from among the block in the neighboring view-point picture that corresponds to the current block and the blocks adjacent to that block, with reference to the flag information in the slice header, the flag information in the block header, and the index information.
The restoring unit 920 predicts the current block in a prediction mode according to the block information. The restoring unit 920 adds the resultant predicted block to the residual block extracted by the decoding unit 910, and thus restores the current block.
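The final restoration step, adding the predicted block to the decoded residual block, can be sketched as follows. Blocks are represented as nested lists of sample values; the clipping to the valid sample range and the 8-bit default are assumptions that the description does not state, and the function name is hypothetical.

```python
def restore_block(predicted, residual, bit_depth=8):
    """Add the residual to the prediction, sample by sample.

    Results are clipped to [0, 2**bit_depth - 1] (assumed, not stated).
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(predicted, residual)
    ]


predicted = [[100, 250], [3, 128]]
residual = [[5, 10], [-7, 0]]
print(restore_block(predicted, residual))  # [[105, 255], [0, 128]]
```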
Referring to
That is, a bit stream including data about a current block, which is obtained by selecting a block corresponding to the current block from another picture having a neighboring view-point on the basis of a global disparity vector, and encoding the current block on the basis of block information of a block from among the selected block and blocks adjacent to the selected block, is received.
In operation 1020, the data about the current block is extracted from the bit stream. That is, data about a residual block of the current block, and information of an encoding mode of the current block are extracted from the bit stream. The block information of the current block may include flag information indicating that the current block is encoded according to an encoding mode according to the present invention, and index information for designating a block used to encode the current block. Also, block information and flag information are respectively extracted from a macro block header and a slice header.
In operation 1030, the current block is restored on the basis of the data about the current block. That is, the current block is restored using the flag information included in the block information of the current block and block information of the block designated according to the index information.
The present invention can also be embodied as computer readable codes on a computer readable recording medium. The computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable recording medium include, but are not limited to: read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, and optical data storage devices. The computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
As described above, according to the exemplary embodiment of the present invention, since multi-view images are encoded using block information of a block that corresponds to the current block and belongs to a picture having a view-point which is different from a view-point of the current picture, and block information of blocks adjacent to the corresponding block, the multi-view images can be encoded in consideration of the individual differences in the appearance of objects as well as global disparities between view-points.
Since a current block can also be encoded using block information of another block, the number of bits required to encode block information can be reduced and a compression rate of encoding can be improved.
Since the application of an encoding mode in which encoding is performed directly using block information of another block can be controlled in units of slices, information regarding the encoding mode can be more effectively encoded and decoded.
While a few exemplary embodiments of the present invention have been particularly shown and described, it will be understood by those of ordinary skill in the art that various changes may be made to these exemplary embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Claims
1. A multi-view image encoding method comprising:
- selecting a block corresponding to a current block from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs; and
- encoding the current block on the basis of block information of a block from among the selected block and blocks adjacent to the selected block.
2. The method of claim 1, wherein the block information comprises at least one of a motion vector, a reference index, and a prediction mode used to encode a block from among the selected block and the blocks adjacent to the selected block.
3. The method of claim 1, wherein the encoding of the current block comprises encoding flag information indicating that the current block is encoded on the basis of block information of a block from among the selected block and the blocks adjacent to the selected block.
4. The method of claim 3, wherein the encoding of the current block further comprises encoding index information indicating which block from among the selected block and the blocks adjacent to the selected block, corresponds to the block information used to encode the current block.
5. The method of claim 4, wherein the encoding of the current block further comprises entropy-encoding the flag information and the index information.
6. The method of claim 5, wherein the entropy-encoding of the flag information and the index information comprises performing context-based adaptive arithmetic coding (CABAC) on the flag information and the index information.
7. A multi-view image encoding apparatus comprising:
- a selection unit which selects a block corresponding to a current block from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs; and
- an encoding unit which encodes the current block on the basis of block information of a block from among the selected block and blocks adjacent to the selected block.
8. The apparatus of claim 7, wherein the block information comprises at least one of a motion vector, a reference index, and a prediction mode used to encode a block from among the selected block and the blocks adjacent to the selected block.
9. The apparatus of claim 7, wherein the encoding unit further comprises a flag information encoding unit which encodes flag information indicating that the current block is encoded on the basis of block information of a block from among the selected block and the blocks adjacent to the selected block.
10. The apparatus of claim 9, wherein the encoding unit encodes index information indicating which block of the selected block and the blocks adjacent to the selected block, corresponds to the block information used to encode the current block.
11. The apparatus of claim 10, wherein the encoding unit further comprises an entropy encoding unit which entropy-encodes the flag information and the index information.
12. The apparatus of claim 11, wherein the entropy-encoding unit performs context-based adaptive arithmetic coding (CABAC) on the flag information and the index information.
13. A multi-view image decoding method comprising:
- receiving a bit stream including data about a current block which has been encoded on the basis of block information of a block from among a selected block and blocks adjacent to the selected block, wherein the selected block corresponding to the current block is selected from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs;
- extracting the data about the current block from the bit stream; and
- restoring the current block on the basis of the data about the current block.
14. The method of claim 13, wherein the data about the current block comprises flag information indicating that the current block is encoded on the basis of the block information of a block from among the selected block and the blocks adjacent to the selected block.
15. The method of claim 14, wherein the data about the current block further comprises index information indicating which block from among the selected block and the blocks adjacent to the selected block, corresponds to the block information used to encode the current block.
16. The method of claim 15, wherein the restoring of the current block restores the current block on the basis of block information of a block designated by the index information.
17. A multi-view image decoding apparatus comprising:
- a decoding unit which receives a bit stream including data about the current block which has been encoded on the basis of block information of a block from among a selected block and blocks adjacent to the selected block, wherein the selected block corresponding to the current block is selected from another picture having a view-point which is different from a view-point of a current picture to which the current block belongs, and which extracts the data about the current block from the bit stream; and
- a restoring unit which restores the current block on the basis of the data about the current block.
18. The apparatus of claim 17, wherein the data about the current block comprises flag information which indicates that the current block is encoded on the basis of the block information of a block from among the selected block and the blocks adjacent to the selected block.
19. The apparatus of claim 18, wherein the data about the current block further comprises index information which indicates which block from among the selected block and the blocks adjacent to the selected block, corresponds to the block information used to encode the current block.
20. The apparatus of claim 19, wherein the restoring unit restores the current block on the basis of block information of a block designated by the index information.
Type: Application
Filed: Apr 12, 2012
Publication Date: Aug 9, 2012
Applicant: SAMSUNG ELECTRONICS CO., LTD. (Suwon-si)
Inventors: Hak-sup SONG (Suwon-si), Woo-sung SHIM (Yongin-si), Young-ho MOON (Suwon-si), Jong-bum CHOI (Yangju-si)
Application Number: 13/445,758
International Classification: G06K 9/36 (20060101);