Fast preview processing for JPEG compressed images

- Xerox Corporation

A method for rapidly decompressing, for scaling and previewing purposes, a document image compressed using transform coding. A fast algorithm is derived by using only a fraction of the available transform coefficients representing the image. The method is particularly efficient with the discrete cosine transform, which is used in the JPEG ADCT algorithm. For JPEG ADCT, a very fast and efficient implementation is derived for a resolution reduction factor of 16 to 1 (4 to 1 in each direction) without any floating point arithmetic operations.

Description

The present invention is directed to a method of decompressing for previewing or scaling purposes, images compressed in accordance with the currently proposed JPEG ADCT (adaptive discrete cosine transform) standard, and more particularly, a method of providing good quality preview images, without extensive decompression processing.

BACKGROUND OF THE INVENTION

Data compression is required in data handling processes, where too much data is present for practical applications using the data. Commonly, compression is used in communication links, where the time to transmit is long, or where bandwidth is limited. Another use for compression is in data storage, where the amount of media space on which the data is stored can be substantially reduced with compression. A device showing either or both of these cases is a digital copier where an intermediate storage is used for collation, reprint or any other digital copier functions. Additionally, digital copiers often allow the printing of externally received data. Generally speaking, scanned images and print masters, i.e., electronic representations of hard copy documents, are commonly large, and thus are desirable candidates for compression.

ADCT (Adaptive Discrete Cosine Transform, described for example, by W. H. Chen and C. H. Smith, in "Adaptive Coding of Monochrome and Color Images", IEEE Trans. Comm., Vol. COM-25, pp. 1285-1292, November 1977), as the method disseminated by the JPEG committee will be called in this application, is a lossy system which reduces data redundancies based on pixel to pixel correlations. Generally, in images, on a pixel to pixel basis, an image does not change very much. An image therefore has what is known as "natural spatial correlation". In natural scenes, correlation is generalized, but not exact. Noise makes each pixel somewhat different from its neighbors.

FIG. 1 illustrates a JPEG ADCT compression and decompression system. A more complete discussion may be had by referencing U.S. Pat. No. 5,321,522 to Eschbach and U.S. Pat. No. 5,359,676 to Fan. Initially provided is tile memory 10 storing an M.times.M portion of the image. From the portion of the image stored in tile memory, the discrete cosine transform (DCT), a frequency space representation of the image, is formed at transformer 12. Hardware implementations are available, such as the C-Cube Microsystems CL550A JPEG image compression processor, which operates in either the compression or the decompression mode according to the proposed JPEG standard. As will be described below, the primary focus for implementation of the invention is in software processing, although hardware implementations may be improved as well. A divisor/quantization device 14 divides each DCT value by a distinct value from a set of values referred to as a Q-Table, stored in Q-Table memory 16, and returns the integer portion of the result as the quantized DCT value. A statistical encoder 20, often using Huffman codes, is used to encode the quantized DCT values to generate the compressed image that is output for storage, transmission, etc.

ADCT transforms are well known, and hardware exists to perform the transform on image data, e.g., U.S. Pat. No. 5,049,991 to Niihara, U.S. Pat. No. 5,001,559 to Gonzales et al., and U.S. Pat. No. 4,999,705 to Puri. The primary thrust of these particular patents, however, is moving picture images, and not document images.

To decompress the now-compressed image, and with reference to FIG. 1, a series of functions or steps are followed to reverse the process described. Huffman encoding is removed at decoder 50. The image signal now represents the quantized DCT coefficients, which are multiplied at signal multiplier 52 by the Q table values in memory 54 in a process inverse to the compression process. At inverse transformer 56, the inverse transform of the discrete cosine transform is derived, and the output image in the spatial domain is stored at image buffer 58.
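The quantization step at device 14 and its inverse at multiplier 52 can be sketched as follows. This is a minimal illustration only: the function names and the sample values are hypothetical, and only a 2x2 corner of a block is shown. It follows the text in keeping the integer portion of each quotient; practical JPEG encoders commonly round to the nearest integer instead.

```python
def quantize(dct_block, q_table):
    """Encoder side: divide each DCT value by its Q-table entry and keep
    the integer portion, as the text describes for device 14."""
    return [[int(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(dct_block, q_table)]


def dequantize(quantized_block, q_table):
    """Decoder side: multiply each quantized value by the same Q-table
    entry, as the text describes for multiplier 52."""
    return [[c * q for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(quantized_block, q_table)]


# Hypothetical values, chosen only to make the arithmetic easy to follow.
dct_corner = [[415.0, -30.5], [12.2, -7.9]]
q_corner = [[16, 11], [12, 12]]

quantized = quantize(dct_corner, q_corner)   # [[25, -2], [1, 0]]
restored = dequantize(quantized, q_corner)   # [[400, -22], [12, 0]]
```

The restored values differ from the originals, which is where the loss in this lossy scheme occurs.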

In U.S. Pat. No. 5,321,522 and U.S. Pat. No. 5,379,122 to Eschbach, and U.S. Pat. No. 5,359,676 to Fan, the standard process described in FIG. 1 is varied. The original image is compressed; the compressed representation is decompressed. The decompressed image is additionally filtered to improve appearance, but in doing so, it may be forced outside the range of images that are possibly derived from the original image. The DCT representation of the image is therefore altered, in order to force the image into the acceptable range of images. The processes may be used iteratively.

Documents are commonly scanned at a resolution that is much higher than standard monitor resolutions. A simple example is a common, inexpensive 300 spi (spots per inch) desktop scanner and a 75 spi monitor. If the document is brought into a document viewer or editor for preview, selection or annotation, the input document has to be reduced in resolution to give the correct representation on the screen. For uncompressed images, this is a straightforward operation, but current and future document systems make use of image compression to improve storage efficiency. In such a case, the document is conventionally decompressed and the resultant image is undersampled. This is a time-consuming operation, since a large amount of data is first generated and then discarded in the downsampling process.

Suppose one has an image scanned at 300 spi and stored in a compressed form using the JPEG process. Suppose it is desired to preview this image, for example, on a 75 spi monitor, at a comparable size. In theory, one then has to decompress the image, filter it to remove possible aliasing in the subsequent subsampling, and subsample it by a factor of 4 in each direction. Such a method is clearly very expensive because the image has to be fully decompressed and all of the filtering is performed at the higher resolution. In prior methods, all of the data is processed and only 1/16th of the processed data is used. The proposed method processes only 1/16th of the data and uses all of the processed data.

All of the references cited herein above are incorporated by reference for their teachings.

SUMMARY OF THE INVENTION

In accordance with the invention, there is provided an improved method of decompressing an image in a scaled format for preview purposes or applications that require a reduction in image size (e.g. a thumbnail), using a simplified processing method.

In accordance with one aspect of the invention, there is provided a method of decompressing an image, compressed with a frequency space transform operation, for preview purposes, while approximating an accurate reproduction thereof, including the steps of: receiving from a transmission line a set of frequency space transform coefficients, representing a compressed image block of image signals; selecting a subset of said coefficients containing fewer coefficients than said set thereof, said subset corresponding to coefficients for a predetermined group of low frequency components of said image; recovering an image approximation from said subset of coefficients with a frequency space transform operation. In one embodiment, the forward transform coding operation using the frequency space transform operation is a discrete cosine transform of M channels and the inverse is a discrete cosine transform of N channels (N<M).

In accordance with another aspect of the invention, there is provided a method of decompressing an image, compressed with a frequency space transform operation, which is not the inverse operation to the compression transformation, for preview purposes at 4 to 1 reduction, while approximating an accurate reproduction thereof, including the steps of: receiving from a transmission line, an 8.times.8 set of frequency space transform coefficients, representing a compressed 8.times.8 image block of image signals; selecting a 2.times.2 subset of the coefficients, said subset corresponding to coefficients for a predetermined group of low frequency components of the image; recovering an image approximation from the subset of coefficients, in accordance with the function: ##EQU1## where each x value is a pixel value in the described relative spatial relationship in a set of output image signals; and each y value is a received coefficient value producing an approximation of the image from the coefficients.

The invention provides an alternative method to perform such subsampling in the DCT domain, so that, for JPEG or other coders using 8.times.8-block DCT, only an inverse 2.times.2 transform is necessary. This can be done without multiplications and each block is directly reconstructed at the required resolution.

These and other aspects of the invention will become apparent from the following descriptions to illustrate a preferred embodiment of the invention read in conjunction with the accompanying drawings in which:

FIG. 1 shows a functional block diagram for the prior art ADCT compression/decompression process;

FIG. 2 shows a typical application for the proposed embodiment;

FIG. 3 illustrates the portion of the JPEG ADCT encoded block which is used for the preview decompression process; and

FIG. 4 illustrates a possible circuit configuration for the present invention, and its relationship with the overall decompression processing arrangement.

DETAILED DESCRIPTION OF THE INVENTION

Referring now to the drawings where the showings are for the purpose of describing an embodiment of the invention and not for limiting same, initially, the principle of the invention will be discussed.

With reference initially to FIG. 2, it will be appreciated that the invention may conveniently be included in a workstation or personal computer generally indicated as 60, operating in accordance with a program implementing the method described herein. Conveniently, compressed images are received at personal computer 60 from a transmission line 62, via modem 64. Images received and decompressed may be reproduced at a display 66 associated with workstation 60; reproduced at a printer 68, with or without further processing by the workstation at memory associated with the personal computer; or re-transmitted as uncompressed images.

With reference to FIG. 3, in JPEG image transmission, frequency components are transmitted in a predefined order, selected so that the most important frequencies are transmitted before less important frequencies. FIG. 3 shows a somewhat standard 8.times.8 block of frequency space coefficients 80 that might be received as compressed image data for decompression. Since in most images the most important frequencies are the lowest frequencies, these frequencies indicated are generally transmitted before others, and are also at determinable locations within the data stream of transmitted data.
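As an illustration of how the low-frequency coefficients sit at determinable locations in the stream, the sketch below assumes the zigzag scan order used by baseline JPEG (the text above says only "a predefined order"). Under that assumption, the 2x2 lowest-frequency corner occupies stream positions 0, 1, 2 and 4. The function names and the placeholder stream are hypothetical.

```python
def zigzag_positions(n=8):
    """Return (row, col) pairs in zigzag scan order for an n x n block."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals are walked bottom-left to top-right, odd ones the reverse.
        for r in (reversed(rows) if s % 2 == 0 else rows):
            order.append((r, s - r))
    return order


def low_frequency_corner(zigzag_coeffs, n_keep=2, n=8):
    """Pick the n_keep x n_keep lowest-frequency coefficients out of a
    block whose coefficients arrive in zigzag order."""
    corner = [[0] * n_keep for _ in range(n_keep)]
    for value, (r, c) in zip(zigzag_coeffs, zigzag_positions(n)):
        if r < n_keep and c < n_keep:
            corner[r][c] = value
    return corner


# Placeholder stream: value 100 + k sits at stream position k.
stream = list(range(100, 164))
corner = low_frequency_corner(stream)   # [[100, 101], [102, 104]]
```

Because all four needed values fall within the first five stream positions, a preview decoder working this way would not need to place the remaining coefficients of the block.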

Let the description for the DCT coefficients in a transformed block of M.times.M pixels be ##EQU2##

The description of the N.times.N lowest frequency coefficients is: ##EQU3##

The notation for the M.times.M original pixels in the block is: ##EQU4##

The notation for the N.times.N pixels reconstructed after scaling is: ##EQU5##

Let the description of the M.times.M transform matrix (for example the DCT) be given by the matrix C.sub.M and let D.sub.M be its inverse. In general we can use C.sub.n and D.sub.n to denote the transform over blocks of n.times.n pixels. The formulation which describes the transformation process over a block of M.times.M pixels and generates M.times.M transformed coefficients is given by: ##EQU6##

Given the coefficients Y.sub.ij, the prior art steps to find the downscaled block of pixels are:

1) obtain the inverse transform using ##EQU7##
2) apply a low-pass filtering operation over the reconstructed image;
3) resample the output to obtain the pixels z.sub.ij.

As can be seen in the above description, all coefficients y are used to generate all output pixels x of which the majority will be discarded in the downsampling.
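The three prior art steps can be sketched as below for M=8 and a reduction of 4 in each direction. This is only an illustration under stated assumptions: the DCT is taken to be the orthonormal form, a 4x4 box average stands in for the unspecified low-pass filter and resampling, and all function names are hypothetical.

```python
import math


def idct_matrix(n):
    """Orthonormal inverse DCT matrix D_n, so that pixels = D_n * coeffs."""
    d = []
    for i in range(n):          # i indexes the pixel position
        row = []
        for k in range(n):      # k indexes the frequency
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            row.append(scale * math.cos((2 * i + 1) * k * math.pi / (2 * n)))
        d.append(row)
    return d


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def full_decode_then_downscale(y, factor=4):
    """Prior art path: full inverse transform, then filter and resample."""
    m = len(y)
    d = idct_matrix(m)
    x = matmul(matmul(d, y), transpose(d))      # step 1: all m*m pixels
    n = m // factor
    z = [[sum(x[r][c]                           # steps 2 and 3: box average
              for r in range(i * factor, (i + 1) * factor)
              for c in range(j * factor, (j + 1) * factor)) / factor ** 2
          for j in range(n)] for i in range(n)]
    return x, z


# A flat block of gray value 100 has a single non-zero coefficient, 8 * 100.
y = [[0.0] * 8 for _ in range(8)]
y[0][0] = 800.0
x, z = full_decode_then_downscale(y)
```

Here 64 pixels are computed per block so that 4 can be kept, which is the waste the proposed method avoids.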

The proposed method uses a fraction of the coefficients in a block (i.e. N.times.N coefficients out of the original M.times.M coefficients) to directly generate the downscaled reconstructed image. We may also include a renormalization multiplicative constant K as: ##EQU8##

If M=8 and N=2 ##EQU9##

If the elements of D.sub.8 are denoted by {d.sub.ij } we can set K=1 and select D.sub.2 as any of the following: ##EQU10##

This will be equivalent to a filtering in transform domain by setting high-frequency coefficients to zero, followed by subsampling by a factor of 4 in each direction. In this example, one reconstruction formula can be: ##EQU11##

The problem with this approach is that matrix D.sub.2 contains floating point numbers and that we are required to perform floating point multiplications. In this particular example, we have ##EQU12##

In the preferred embodiment, we simplify this step by selecting D.sub.2 as the inverse 2-channels DCT. The inverse 2-channels DCT is defined by ##EQU13##

A scaling factor can be selected to incorporate K and the multiplicative factors of D.sub.2. K is designed to compensate the difference of gain between the 8-channel and the 2-channel DCT (K=2/8) so that ##EQU14##
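The combined scaling factor can be worked out explicitly. The derivation below is a sketch that assumes the orthonormal form of the 2-channel DCT and introduces the symbol H for the sign matrix; neither the normalization nor the symbol H is taken from the text above.

```latex
D_2 = \frac{1}{\sqrt{2}} H, \qquad
H = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
K = \frac{2}{8}

z = K \, D_2 \, Y_2 \, D_2^{T}
  = \frac{2}{8} \cdot \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} \; H \, Y_2 \, H
  = \frac{1}{8} \, H \, Y_2 \, H

z_{00} = \tfrac{1}{8}\,(y_{00} + y_{01} + y_{10} + y_{11}) \qquad
z_{01} = \tfrac{1}{8}\,(y_{00} - y_{01} + y_{10} - y_{11})

z_{10} = \tfrac{1}{8}\,(y_{00} + y_{01} - y_{10} - y_{11}) \qquad
z_{11} = \tfrac{1}{8}\,(y_{00} - y_{01} - y_{10} + y_{11})
```

The value K=2/8 follows from the gains under this normalization: a flat 8.times.8 block of value v has lowest-frequency coefficient 8v, while the inverse 2-channel transform of that coefficient alone yields 8v/2, so a factor of 2/8 restores v.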

The reconstruction step will require no floating point operations: ##EQU15##

We can define this as the operation of a specific filter in the subsampling such that the operations of inverse DCT, followed by filtering and decimation, are replaced by a simpler integer-only inverse transform, plus a scaling factor for compensating the pixels' range. The scaling factor can be applied as a "shift-right" operation to the reconstructed samples, in integer arithmetic. Therefore, the whole process is free of multiplications and can be done by using 8 additions-subtractions and 4 shift operations, for each block of 2.times.2 samples.
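A minimal sketch of this integer-only reconstruction is given below. The function name and the sample coefficients are hypothetical, and the coefficients are assumed to be dequantized integers. Note that Python's `>>` rounds toward negative infinity for negative values, which may differ from the integer division used in a particular implementation.

```python
def preview_2x2(y00, y01, y10, y11):
    """Reconstruct a 2x2 preview block from the four lowest-frequency
    coefficients using only additions, subtractions and shifts."""
    # Stage 1: four additions-subtractions on coefficient pairs.
    a = y00 + y10
    b = y00 - y10
    c = y01 + y11
    d = y01 - y11
    # Stage 2: four more additions-subtractions, each followed by a
    # right shift of 3 bits, which divides by 8.
    z00 = (a + c) >> 3
    z01 = (a - c) >> 3
    z10 = (b + d) >> 3
    z11 = (b - d) >> 3
    return [[z00, z01], [z10, z11]]


flat = preview_2x2(800, 0, 0, 0)          # [[100, 100], [100, 100]]
detail = preview_2x2(800, 80, -40, 16)    # [[107, 83], [113, 97]]
```

The operation count matches the text: eight additions-subtractions and four shifts per 2x2 output block, with no multiplications.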

For example, if we have the following block of 8.times.8 DCT coefficients: ##EQU16##

The respective 2.times.2 pixel image approximation can be determined by ##EQU17##

The final pixel intensity values were rounded because the factor 1/8 was computed as an integer division by 8. The values given are pixel gray values in a scale of 0-255, from which an image can be derived. This image can be printed, displayed or stored, or processed to a higher number of pixels for a desired visual effect.
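Deriving an image from such 2x2 results amounts to tiling them in block order. The sketch below assumes blocks arrive in raster order and clamps values to the 0-255 gray scale mentioned above; the clamping, the function name and the sample blocks are assumptions made for illustration.

```python
def assemble_preview(blocks, blocks_per_row):
    """Tile 2x2 preview blocks (raster block order) into one gray image,
    clamping every value to the 0-255 range."""
    rows = []
    for start in range(0, len(blocks), blocks_per_row):
        band = blocks[start:start + blocks_per_row]   # one row of blocks
        for r in range(2):                            # two pixel rows per band
            rows.append([min(255, max(0, v)) for blk in band for v in blk[r]])
    return rows


# Four hypothetical 2x2 blocks, two per row, giving a 4x4 preview.
blocks = [[[107, 83], [113, 97]], [[100, 100], [100, 100]],
          [[-3, 20], [260, 255]], [[0, 1], [2, 3]]]
image = assemble_preview(blocks, blocks_per_row=2)
```

For a source image of W by H pixels coded in 8x8 blocks, the preview built this way is W/4 by H/4 pixels.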

FIG. 4 shows a simplified schematic implementation of the present invention. In one possible embodiment, data for the preview is received at a preview decompression device 100, at memory 102. Certain of the information, designated by position in the transmission stream or by storage in memory 102, is directed to a set of registers 104, 106, 108, and 110. The transformation described above is implemented in a series of successive adders/subtractors, whereby values stored in registers 104 and 108 are added and subtracted, with the results respectively directed to registers 120 and 122. Values stored in registers 106 and 110 are added and subtracted, with the results respectively directed to registers 124 and 126. Subsequently, the values in registers 120 and 124 are added and subtracted, and the results are stored respectively in shift registers 130 and 132. The values in registers 122 and 126 are likewise added and subtracted, and the results are stored respectively in shift registers 134 and 136. At the shift registers, a 3-bit right shift occurs, which accomplishes the required division by 8. The data values can then be output to the image buffer 58 for reproduction.

If we use the formula involving floating point numbers we would obtain: ##EQU18##

The generalization of the method is to select D.sub.N as the inverse N-channels DCT and to apply the following formula. ##EQU19##
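The general case can be sketched as below. Two assumptions are made here and are not stated in the text: the DCT is taken in its orthonormal form, and the renormalization constant is taken as K = N/M, which generalizes the K = 2/8 used for the 8-to-2 case. The function names are hypothetical.

```python
import math


def idct_matrix(n):
    """Orthonormal inverse DCT matrix D_n, so that pixels = D_n * coeffs."""
    d = []
    for i in range(n):          # i indexes the pixel position
        row = []
        for k in range(n):      # k indexes the frequency
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            row.append(scale * math.cos((2 * i + 1) * k * math.pi / (2 * n)))
        d.append(row)
    return d


def preview_block(y, n):
    """Compute z = K * D_N * Y_N * D_N^T from the n x n low-frequency
    corner of an m x m coefficient block, with K assumed to be n / m."""
    m = len(y)
    k = n / m
    d = idct_matrix(n)
    z = []
    for i in range(n):
        z_row = []
        for j in range(n):
            acc = 0.0
            for p in range(n):
                for q in range(n):
                    acc += d[i][p] * y[p][q] * d[j][q]
            z_row.append(k * acc)
        z.append(z_row)
    return z


# Hypothetical 8x8 coefficient block with four non-zero low frequencies.
y = [[0.0] * 8 for _ in range(8)]
y[0][0], y[0][1], y[1][0], y[1][1] = 800.0, 80.0, -40.0, 16.0

z2 = preview_block(y, 2)   # close to [[107, 83], [113, 97]]
z4 = preview_block(y, 4)   # 4x4 preview, i.e. 2 to 1 reduction per direction
```

For N=2 the result agrees with the integer-only add, subtract and shift form up to floating point rounding; for larger N the matrix entries are no longer all equal in magnitude, which is why floating point numbers come into play.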

For N>2, floating point numbers may be used. In every case, however, the fast implementation algorithms available for the DCT can be used to carry out the process quickly, for previewing or scaling purposes.

The embodiment was described in the context of the DCT, however, it should be appreciated that it works for other transforms as well.

The disclosed method may be readily implemented in software. Alternatively, the disclosed image processing system may be implemented partially or fully in hardware using standard logic circuits or specifically on a single chip using VLSI. Whether software or hardware is used to implement the system varies depending on the speed and efficiency requirements of the system and also the particular function and the particular software or hardware systems and the particular microprocessor or microcomputer systems being utilized. The image processing system, however, can be readily developed by those skilled in the applicable arts without undue experimentation from the functional description provided herein together with a general knowledge of the computer arts.

While this invention has been described in conjunction with a preferred embodiment thereof, it is evident that many alternatives, modifications, and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations as fall within the spirit and broad scope of the appended claims.

Claims

1. A method of decompressing an image, compressed with a frequency space transform operation, for preview or scaling purposes, while approximating an accurate reproduction thereof, including the steps of:

receiving from a transmission line or retrieving from memory storage devices a set of frequency space transform coefficients, representing a compressed image block of image signals;
selecting a subset of said coefficients including fewer coefficients than said set thereof, said subset corresponding to coefficients for a predetermined group of low frequency components of said image;
recovering an image approximation with a linear transform relating said subset of coefficients to pixels of the said image approximation.

2. A method as described in claim 1, wherein the forward transform coding operation using the frequency space transform operation is a discrete cosine transform.

3. A method of decompressing an image, compressed with a frequency space transform operation, for preview purposes, while approximating an accurate reproduction thereof, including the steps of:

receiving from a transmission line or retrieving from a memory storage device, a M.times.M set of frequency space transform coefficients, representing a compressed M.times.M image block of image signals;
selecting a N.times.N subset of said coefficients, said subset corresponding to coefficients for a predetermined group of low frequency components of said image;
recovering an image approximation from said subset of coefficients, in accordance with the function: ##EQU20## where each z value is a pixel value in the described relative spatial relationship in a set of output image signals; each y value is a received coefficient value; and D.sub.N is an N.times.N inverse transform matrix.

4. A method as described in claim 3, wherein the forward transform coding operation using the frequency space transform operation is a discrete cosine transform of M channels and the inverse transform operation is the discrete cosine transform of N channels.

5. A method of decompressing an image, compressed with a frequency space transform operation, for preview purposes, while approximating an accurate reproduction thereof, including the steps of:

receiving from a transmission line or retrieving from a memory storage device, an 8.times.8 set of frequency space transform coefficients, representing a compressed 8.times.8 image block of image signals;
selecting a 2.times.2 subset of said coefficients, said subset corresponding to coefficients for a predetermined group of low frequency components of said image;
recovering an image approximation from said subset of coefficients, in accordance with the function: ##EQU21## where each z value is a pixel value in the described relative spatial relationship in a set of output image signals; and each y value is a received coefficient value.

6. A method as described in claim 5, wherein the forward transform coding operation using the frequency space transform operation is a discrete cosine transform.

References Cited
U.S. Patent Documents
4999705 March 12, 1991 Puri
5001559 March 19, 1991 Gonzales et al.
5049991 September 17, 1991 Niihara
5321522 June 14, 1994 Eschbach
5359676 October 25, 1994 Fan
5379122 January 3, 1995 Eschbach
5426673 June 20, 1995 Mitra et al.
5485279 January 16, 1996 Yonemitsu et al.
5521841 May 28, 1996 Arman et al.
Other references
  • Chen et al; "Adaptive Coding of Monochrome and Color Images"; IEEE Trans. Comm.; vol. COM-25, No. 11, Nov. 1977; pp. 1285-1292.
Patent History
Patent number: H1684
Type: Grant
Filed: Sep 29, 1995
Date of Patent: Oct 7, 1997
Assignee: Xerox Corporation (Stamford, CT)
Inventors: Ricardo L. de Queiroz (Fairport, NY), Reiner Eschbach (Webster, NY)
Primary Examiner: Bernarr E. Gregory
Attorney: Mark Costello
Application Number: 8/537,056
Classifications
Current U.S. Class: Including Details Of Decompression (382/233); Image Compression Or Coding (382/232); 358/261.2
International Classification: H03M 7/30;