JPEG packed block data structure for enhanced image processing
An intermediate data format which is readily convertible to or from JPEG compliant data streams and to or from image data provides, in most circumstances, accelerated encoding and decoding in a degree sufficient to allow additional processing without necessitating increase of processor power even in relatively time-critical applications such as high-speed printers and image browsers. The format features flags which indicate if S>8 (indicating that extra bits are required to uniquely encode an orthogonal transform coefficient value) or runs of zero-valued orthogonal transform coefficients longer than sixteen are present in a block of data. The block of data can be tested for these conditions and flags set once per block or once per image. Extensive processing can thus be omitted if either or both of these conditions are not present, as will generally be the case.
This application is a Continuation of U.S. patent application Ser. No. 09/896,110, filed Jul. 2, 2001, priority of which is hereby claimed under 35 U.S.C. §120. This application is also related to U.S. patent application Ser. No. 09/736,444, now U.S. Pat. No. 6,757,439 (Attorney's Docket RAL-99-0177), and U.S. patent application Ser. No. 09/736,445, now U.S. Pat. No. 6,373,412 (Attorney's Docket END9-2000-0113US1), both filed Dec. 15, 2000, entitled JPEG Packed Block Structure and Fast JPEG Huffman Coding and Decoding, respectively, and U.S. patent application Ser. No. 09/896,117, entitled Faster Lossless Rotation of JPEG Images, filed concurrently herewith, all of which are assigned to the assignee of the present application and hereby fully incorporated by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to image data compression and image data processing and, more particularly, to compression of image data in accordance with JPEG, MPEG or other image data standards in connection with reconstruction or other processing of information such as for merge, shift, rotation and the like.
2. Description of the Prior Art
Pictorial and graphics images contain extremely large amounts of information and, if digitized to allow transmission or processing by digital data processors, often require many millions of bytes to represent respective pixels of the image or graphics with good fidelity. The purpose of image data compression is to represent images with less data in order to save storage costs or transmission time and costs. The most effective compression is achieved by approximating the original image, rather than reproducing it exactly. The JPEG standard, discussed in detail in “JPEG Still Image Data Compression Standard” by Pennebaker and Mitchell, published by Van Nostrand Reinhold, 1993, which is hereby fully incorporated by reference, allows the interchange of images between diverse applications and opens up the capability to provide digital continuous-tone color images in multi-media applications.
JPEG is primarily concerned with images that have two spatial dimensions, contain gray scale or color information, and possess no temporal dependence, as distinguished from the MPEG (Moving Picture Experts Group) standard which additionally exploits redundancy between frames for additional compression to meet motion picture and/or television frame rate demands. The JPEG standard has been developed as a flexible system for potentially providing the highest possible image fidelity for a given amount of data while allowing the amount of data representing the image to be reduced by a substantially arbitrary factor. The JPEG standard also allows substantial exploitation of relative sensitivities and insensitivities of human visual perception and it is not unusual for the JPEG standard to allow image data compression by a factor of twenty or more without significant perceptible image degradation.
At the same time, virtually no constraints are placed on processor resources or data processing methodologies so that improvements therein that result in reduced processing time will allow increased throughput and additional processing to be achieved in environments such as high speed printers where the printer will eject blank pages if the next complete page is not ready. Nevertheless, substantial data processing is required for encoding and decoding, particularly due to the need for statistical analyses of converted image values (e.g. discrete cosine transform (DCT) coefficients) in order to assure substantial data compression in accordance with the concept of entropy coding.
The concept of entropy coding generally parallels the concept of entropy in the more familiar context of thermodynamics where entropy quantifies the amount of “disorder” in a physical system. In the field of information theory, entropy is a measure of the predictability of the content of any given quantum of information (e.g. symbol) in the environment of a collection of data of arbitrary size and independent of the meaning of any given quantum of information or symbol.
This concept provides an achievable lower bound for the amount of compression that can be achieved for a given alphabet of symbols and, more fundamentally, leads to an approach to compression on the premise that relatively more predictable data or symbols contain less information than less predictable data or symbols and the converse that relatively less predictable data or symbols contain more information than more predictable data or symbols. Thus, assuming a suitable code for the purpose, optimally efficient compression can be achieved by allocating fewer bits to more predictable symbols or values (that are more common in the body of data and include less information) while reserving longer codes for relatively rare symbols or values.
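The relationship between predictability and code length can be illustrated numerically. The following is a minimal sketch, not part of the JPEG standard or of the invention; the function names `shannon_entropy` and `ideal_code_lengths` are hypothetical and chosen only for illustration. It computes the entropy of an empirical symbol distribution and the ideal code length, in bits, of each symbol.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Return the entropy, in bits per symbol, of the empirical distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def ideal_code_lengths(symbols):
    """Return the ideal code length -log2(p) for each distinct symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return {s: -math.log2(n / total) for s, n in counts.items()}


data = "aaaabbcd"
print(shannon_entropy(data))     # average bits per symbol
print(ideal_code_lengths(data))  # common symbols get shorter codes
```

In this example the most common symbol is assigned the shortest ideal code and the two rarest symbols the longest, which is the allocation that a Huffman code approximates with whole numbers of bits.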
By the same token, however, the JPEG standard and other image data compression standards have substantially no implications in regard to efficiency of data processing for encoding, decoding or other desired image manipulations beyond those expected from alteration of the volume of data to be processed, transmitted or stored. On the contrary, the very flexibility of coding provided by the JPEG standard requires substantial processing to determine details of the manner in which data is to be decoded, particularly in regard to portions of the coded data which represent variable length codes necessary to efficient data compression in accordance with the principles of entropy coding.
It has been found that some processing is, in fact, complicated by some intermediate data formats which are compatible with entropy encoding into the JPEG standard but not others which are similarly compatible. These standards specify the data streams but not the intermediate formats.
It should also be appreciated that image data compression standards such as the JPEG standard are principally directed toward facilitating exploitation of the trade-off between image fidelity and data transmission and processing time or required storage capacity. However, at the current time, some applications such as high performance printers and image browsers place high demands on both image fidelity and rapid data conversion. For example, high resolution color printers are foreseeable having such high printing speed that processing power at or exceeding the limits of current practicality is required. Such applications may also require additional processing such as image rotation or size change prior to image decoding for which, as a practical matter, no time is available.
Further, it should be appreciated that some loss of fidelity is unavoidable due to the quantization of image data for digital processing. Therefore, coding and decoding is, to some degree, lossy. This lossiness is acceptable for a single coding and decoding process since the nature of quantization can be freely chosen. However, multiple coding and decoding processes which may be necessitated by a need to perform certain image manipulations, such as rotation, on decoded data (that must again be encoded and decoded for efficient processing and storage and acceptable data processing time to reconstruct the image) generally cause substantial and readily perceptible image degradation.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a digital intermediate data format which is JPEG compatible which allows reduced processing time for decoding.
It is another object of the invention to provide a digital data intermediate format which is JPEG compatible which facilitates execution of DCT domain image processing algorithms.
It is a further object of the invention to provide a JPEG compatible digital data format which may be decoded in a simplified and consistent manner without imposing significant limitation on image fidelity or significant decrease in compression efficiency.
In order to accomplish these and other objects of the invention, a method of coding image data is provided including the steps of testing for coefficient values requiring more than eight bits to be uniquely coded, and using a flag in at least one block of data to indicate if all coefficient values in the block are coded in eight bits or fewer or if any requires more than eight bits to be uniquely coded.
In accordance with another aspect of the invention, a data format is provided including a first pair of bytes representing a block number, a Klast value and at least one flag indicating if all said coefficient values in said block are coded in eight bits or fewer or if any requires more than eight bits to be uniquely coded, and a second pair of bytes respectively representing an R/S value and a coefficient value.
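The per-block tests described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the claimed implementation: the function name `block_flags` is hypothetical, the coefficients are assumed to be already quantized and in zig-zag order, only the AC coefficients are tested for S>8, and a run is treated as needing a ZRL when it exceeds fifteen zeros ahead of a non-zero coefficient.

```python
def block_flags(coeffs):
    """coeffs: 64 quantized coefficients in zig-zag order (index 0 is DC).

    Returns (klast, needs_e2, needs_zrl), where klast is the index of the
    last non-zero coefficient (0 if there is none).
    """
    klast = max((i for i, v in enumerate(coeffs) if v != 0), default=0)

    # S is the number of bits in the magnitude; S > 8 needs a second byte.
    needs_e2 = any(abs(v).bit_length() > 8 for v in coeffs[1:])

    # Only zeros that precede a non-zero coefficient matter; zeros after
    # klast are covered by the EOB rather than by a ZRL.
    needs_zrl, run = False, 0
    for v in coeffs[1:klast + 1]:
        if v == 0:
            run += 1
            if run > 15:
                needs_zrl = True
        else:
            run = 0
    return klast, needs_e2, needs_zrl
```

When both flags are false, as will generally be the case, a decoder could follow a path that never checks for E2 bytes or ZRL symbols within that block.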
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
Referring now to the drawings, and more particularly to
The image values (e.g. color, intensity, etc. in accordance with any image value or color coordinate system) are then quantized and a data transformation is performed such as a discrete cosine transformation (DCT) which provides values which are more easily compressed. For example, a DCT provides a number of DCT coefficients which are equal in number to the number of samples which make up the image block but many of the coefficients will be zero or near zero. After quantization, the near-zero coefficients will be zero. If these quantized coefficients are reordered into a so-called zig-zag order (of approximately increasing or decreasing spatial frequency in both the horizontal and vertical directions) such zero quantized values will often be grouped together in accordance with relative sensitivity of human perception. These groups or runs of zero or near-zero values can be expressed in very few bits or bytes, which allows substantial data compression while minimizing the perceptibility of loss of image fidelity.
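The zig-zag reordering and the resulting runs of zeros can be sketched as follows. This is a minimal illustration, assuming an 8×8 block of already quantized coefficients stored as a list of rows; the function names `zigzag_indices` and `zero_runs` are hypothetical.

```python
def zigzag_indices(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for d in range(2 * n - 1):  # d = row + col selects one anti-diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        # Even diagonals are walked from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        for r in (reversed(rows) if d % 2 == 0 else rows):
            order.append((r, d - r))
    return order


def zero_runs(block):
    """Return (run_of_zeros, value) for each non-zero AC coefficient."""
    scan = [block[r][c] for r, c in zigzag_indices(len(block))]
    pairs, run = [], 0
    for value in scan[1:]:  # index 0 is the DC coefficient
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs  # trailing zeros are left for the end-of-block code


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 50, -3, 7
print(zero_runs(block))
```

Because the non-zero coefficients cluster at the start of the scan, a typical block reduces to a handful of (run, value) pairs followed by a single end-of-block indication.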
The data structure of
This format, when used for the JPEG compatible code in demanding applications, has proved to be inefficient since the necessity of loading zero-valued coefficients and testing for non-zero values is computationally too expensive for the speeds demanded of these applications. Having to load and store many zero-valued coefficients induces cache misses, leading to an increase in the number of memory accesses and an increased processing burden. The magnitude of this burden and of the avoidable memory hardware and operational requirements may be appreciated from the fact that many blocks have five or fewer non-zero coefficients.
Referring now to
Each non-zero AC coefficient is stored in two or more bytes. The first byte 24 is the R/S byte used for Huffman encoding (i.e. the high-order nibble, R, of four bits equals the run of zero-valued AC coefficients in zig-zag order, up to fifteen, and the low-order nibble, S, of four bits is the number of extra bits necessary to uniquely code the non-zero magnitude). A preferred form of this packed format stores the extra bits in the next one or two bytes (e.g. E1 or E1 and E2) 25, depending on whether or not the second byte is needed (i.e. S>8). That is, E2 is an optional second byte which is only needed if S>8. The EOB byte is used if EOB1<64. Since the ZRLs and E2 are data dependent, data is accessed one byte at a time. An alternative implementation always follows the R/S byte with the actual AC coefficient value in two bytes. The final byte is the symbol 0x00 which indicates that an EOB is to be coded. ZRL is a byte 27 of the form 0xF0 used when the run of zero coefficients is greater than 15.
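The byte-at-a-time AC coding just described can be sketched as follows. This is an illustration under stated assumptions, not the format as actually laid out in the drawings: the function names are hypothetical, only the AC portion of a block is produced (any leading bytes are omitted), the extra bits follow the usual JPEG convention for negative values, and E1 is assumed to carry the low-order eight extra bits with E2 carrying the remainder.

```python
ZRL = 0xF0  # run of sixteen zero-valued coefficients
EOB = 0x00  # end of block


def size_category(value):
    """S: the number of extra bits needed for a non-zero coefficient."""
    return abs(value).bit_length()


def extra_bits(value, s):
    """JPEG convention: a negative value is stored as value - 1 in S bits."""
    return value if value > 0 else value + (1 << s) - 1


def pack_ac(ac):
    """ac: the 63 quantized AC coefficients of one block in zig-zag order."""
    out = bytearray()
    run = 0
    for value in ac:
        if value == 0:
            run += 1
            continue
        while run > 15:          # R holds at most fifteen, so emit ZRLs
            out.append(ZRL)
            run -= 16
        s = size_category(value)
        bits = extra_bits(value, s)
        out.append((run << 4) | s)   # R/S byte
        out.append(bits & 0xFF)      # E1 (assumed: low-order extra bits)
        if s > 8:
            out.append(bits >> 8)    # E2, present only when S > 8
        run = 0
    if run > 0:                  # trailing zeros are covered by the EOB
        out.append(EOB)
    return bytes(out)
```

The sketch makes the synchronization problem visible: each coefficient occupies two or three bytes and ZRLs occupy one, so a reader cannot locate the next R/S byte without examining the current one.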
While both of the data formats of
Referring now to
The data format illustrated in
For those blocks in which there are no E2s or ZRLs, the data format of
In addition, for the case where there are ZRLs (R>15), the ZRLs can be made to fit in two bytes, preserving synchronism, instead of one, two or three bytes. The format of
The preferred embodiment for the packed format in accordance with the invention is to make the coefficients fit into two or four bytes rather than two or three bytes to guarantee maintaining two byte synchronism. Some ways to pack two or four bytes are:
An alternative way to pack the bytes and keep the coefficients on halfword boundaries is to take the E2 byte and store it in reverse order at the end of the packed block buffer after the EOB and any padding bytes. In this case, the size of the block itself does not increase and the additional E2 bytes will equal the number of times S>8 occurred as shown in
It should be appreciated that the use of the above format of
The packed JPEG structure can optionally store the R/S symbol as an S/R symbol with the R and S nibbles interchanged. The R (run) can and does have any value from 0 to 15 while the S may be limited in its range depending on the Q-values used. Also the S symbol rapidly decreases in its likelihood of occurrence as the size increases so caching may be improved with the opposite order. The entropy decoder can simply generate the reversed order if this variation is desired.
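Interchanging the two nibbles is a single shift-and-mask operation, as the following sketch shows; the function name `swap_nibbles` is hypothetical and the byte is assumed to be an integer in the range 0 to 255.

```python
def swap_nibbles(byte):
    """Convert an R/S byte to an S/R byte, or back again."""
    return ((byte & 0x0F) << 4) | (byte >> 4)


rs = 0x19                 # R = 1, S = 9
sr = swap_nibbles(rs)     # S = 9 in the high nibble, R = 1 in the low nibble
print(hex(sr))
```

Applying the function twice restores the original byte, so the same routine serves for both directions.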
Exemplary pseudocode to test for S>8 in the Huffman tables is:
Inside the Huffman marker code processing subroutine, when pointing to the R/S bytes, the code should know the sum of the sixteen Li terms, which is the number of R/S bytes for that table.
For all Huffman tables have a flag,
The remainder of the code can know from this flag that S>8 is impossible and paths can then be followed which never test for such a condition.
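One way such a per-table test might look is sketched below. It is a reconstruction under stated assumptions, not the original pseudocode: the function name `s_gt8_flags` is hypothetical, and the input is assumed to be the payload of a JPEG DHT marker segment after its two-byte length field, in which each table consists of one class/identifier byte, the sixteen Li counts, and then the symbol bytes.

```python
def s_gt8_flags(dht_payload):
    """Map (table_class, table_id) to True if any symbol has S > 8."""
    flags = {}
    pos = 0
    while pos < len(dht_payload):
        tc_th = dht_payload[pos]                 # class in high nibble, id in low
        counts = dht_payload[pos + 1:pos + 17]   # the sixteen Li terms
        n_symbols = sum(counts)                  # number of R/S bytes that follow
        symbols = dht_payload[pos + 17:pos + 17 + n_symbols]
        # S is the low-order nibble of each R/S byte.
        flags[(tc_th >> 4, tc_th & 0x0F)] = any((sym & 0x0F) > 8 for sym in symbols)
        pos += 17 + n_symbols
    return flags
```

Because a symbol that is absent from a table can never be decoded, a false flag for a table guarantees that no block coded with that table contains an S>8 coefficient, and the test need not be repeated per block.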
In view of the foregoing, it is seen that the intermediate data format in accordance with the invention provides for reduced numbers of memory calls by allowing word or half word accesses and much reduced processing while synchronization is maintained. On average, the memory accesses for any given image will be reduced by a factor of about two, generally allowing time for other processing that may be desired such as image rotations and the like without requiring more processing power than is currently economically feasible. Additionally, the improved packed block structure is compatible with and will provide similar advantages with at least the MPEG-1, MPEG-2, H.261 and H.263 video standards, all of which use 8×8 blocks of a single component.
While the invention has been described in terms of a single preferred embodiment, those skilled in the art will recognize that the invention can be practiced with modification within the spirit and scope of the appended claims.
Claims
1-6. (canceled)
7. A data format for a block of encoded data including
- a first pair of bytes representing to a decoder a block number, a Klast value and at least one flag indicating if all said coefficient values in said block are coded in eight bits or fewer or if any requires more than eight bits to be uniquely coded,
- a second pair of bytes respectively representing to said decoder an R/S value and a coefficient value.
8. A data format as recited in claim 7, further including
- at least one additional pair of bytes including an EOB byte and a padding byte.
9. A data format as recited in claim 7, wherein said first pair of bytes further includes
- another flag indicating if any runs of consecutive zero-valued coefficients greater than sixteen are present in said block.
10. A data format as recited in claim 7, wherein said Klast value provides an index of a last non-zero coefficient value in a block.
Type: Application
Filed: Jan 26, 2006
Publication Date: Aug 17, 2006
Inventors: Nenad Rijavec (Longmont, CO), Joan Mitchell (Longmont, CO)
Application Number: 11/339,796
International Classification: G06K 9/36 (20060101);