Image Compression and JPEG:

This section gives a brief overview of lossy image compression and the JPEG standard. Readers already familiar with these concepts are encouraged to skip to the next section.

Lossy signal compression works by transmitting the "important" signal content while omitting the rest, a step known as quantization. To make quantization effective, a linear de-correlating transform is applied to the signal beforehand. All existing image and video coding standards use this approach. The most commonly used transform is the Discrete Cosine Transform (DCT), used in JPEG, MPEG-1, MPEG-2, H.261, and H.263 and its descendants. For a detailed discussion of the theory behind quantization and the justification for using linear transforms, see reference [1] below.

A brief overview of JPEG compression is as follows. The JPEG encoder partitions the image into 8x8 blocks of pixels and applies a 2-dimensional DCT to each block. The resulting matrix of transform coefficients is divided element-wise by an 8x8 quantization matrix and then rounded to the nearest integer. This operation is equivalent to applying different uniform quantizers to different frequency bands of the image. The high-frequency image content can be quantized more coarsely than the low-frequency content, for two reasons: (1) Since most natural images carry less energy in the high-frequency bands, better rate-distortion tradeoffs are possible by suppressing the low-energy bands and spending the available bitrate on the high-energy lowpass bands, a concept known as "water filling" in the quantization literature. This argument is based purely on mean-squared error analysis. (2) The human visual system is less sensitive to quantization errors in the high-frequency bands.

The quantized DCT coefficients form a matrix that is usually sparse, i.e., it contains many zeros, especially in the high-frequency bands.
The elements of the matrix are ordered in a zig-zag scan. Then, an entropy coder combined with a run-length coding of the zeros generates an efficient representation of the quantized coefficients (in terms of bitrate) to be transmitted or stored. A simplified block diagram of a JPEG encoder is shown in the figure below.
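The transform, quantization, and zig-zag steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the reference JPEG encoder: the quantization matrix `Q_DEMO` is made up for the example (it is not a table from the JPEG standard), the function names are hypothetical, and the level shift and entropy coding performed by a real encoder are left out.

```python
import math

N = 8  # JPEG block size

# Orthonormal DCT-II matrix: C[k][n] = a(k) * cos((2n + 1) * k * pi / (2N)).
C = [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]

# Illustrative quantization matrix (an assumption, NOT a table from the
# JPEG standard): the step size grows with frequency, so high-frequency
# bands are quantized more coarsely.
Q_DEMO = [[10 + 5 * (u + v) for v in range(N)] for u in range(N)]


def matmul(a, b):
    """Product of two 8x8 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def dct2(block):
    """2-D DCT of an 8x8 block, computed as C * block * C^T."""
    ct = [list(col) for col in zip(*C)]
    return matmul(matmul(C, block), ct)


def quantize(coeffs, q):
    """Divide element-wise by the quantization matrix and round."""
    return [[round(coeffs[u][v] / q[u][v]) for v in range(N)]
            for u in range(N)]


def zigzag(m):
    """Read an 8x8 matrix along anti-diagonals, alternating direction."""
    order = sorted(
        ((u, v) for u in range(N) for v in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [m[u][v] for u, v in order]


# A smooth block: after quantization most coefficients become zero.
block = [[100.0 + 2 * i + 3 * j for j in range(N)] for i in range(N)]
scanned = zigzag(quantize(dct2(block), Q_DEMO))
```

The zig-zag scan places the low-frequency coefficients first, so the zeros of a sparse block tend to cluster at the end of `scanned`, which is what makes run-length coding of the zeros effective.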
Transform coding produces image artifacts. Block transforms, e.g. the DCT, produce blocking artifacts. Furthermore, quantization with any non-redundant linear transform (including DCT and wavelets) produces ringing artifacts to varying degrees. The following figure shows the blocking and ringing artifacts introduced by JPEG.
The aim of this research is to recover some of the quality lost to image compression through post-processing operations. While JPEG post-processing has been tried previously in a number of ways, the method proposed below has the advantage of conceptual simplicity as well as superior results.
JPEG Post-Processing through Re-application of JPEG:

Previous attempts at post-processing JPEG-compressed images often concentrated on the blocking artifacts (a comprehensive list of references to past research in this area is available in [2]). These approaches almost always included some form of lowpass filtering to reduce the blocking effect. The more sophisticated ones use advanced forms of lowpass filtering, for example wavelet filtering, projection onto convex sets, or adaptive space-varying filtering. The difficulty remains, however, that lowpass filtering reduces the quality of the image along with the artifacts.

We offer a radically different approach to the problem: we contend that both the blocking and the ringing artifacts arise from the cyclostationarity of the transform quantization operation. In other words, the transform quantization operation makes changes in the image that are location dependent. The most apparent manifestation of this location dependence is, of course, the blocking artifacts, which occur at regular 8x8 pixel intervals. However, the ringing artifacts can also be explained in terms of cyclostationarity. A fundamental attempt at removing the artifacts should therefore address their underlying cause. This research attempts to do so by removing as much of the cyclostationarity as possible. The figure below illustrates how we accomplish this.
This figure shows an enlarged view of a typical JPEG image block. The blocking artifacts are seen at the edges of these blocks. Now assume that a shifted version of the JPEG compression algorithm is applied to the image, such that its blocks are not aligned with those of the original JPEG compression. Three such shifts are shown in the figure above. The overall image content remains the same, but the blocking artifacts move to new locations (and take new values). By applying all possible shifts (64 of them for 8x8 blocks), one obtains 64 different versions of the compressed image, each with a different set of artifacts. Averaging all these versions spreads out and attenuates the artifacts, resulting in a higher-quality image. The following figure contains the block diagram of our JPEG denoising algorithm.
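The shift, re-compress, and average procedure described above can be sketched as follows. This is a hedged illustration, not the released implementation: it models JPEG by its DCT and quantization stages only (entropy coding is lossless and does not change the decoded image), works on floating-point grayscale images stored as lists of rows whose dimensions are multiples of 8, and wraps shifted blocks around the image borders. The wrap-around boundary handling, the function names, and the absence of a level shift or 8-bit rounding are all assumptions made for brevity.

```python
import math

N = 8  # JPEG block size

# Orthonormal 8x8 DCT-II matrix; its transpose is its inverse.
C = [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]
CT = [list(col) for col in zip(*C)]


def _mul(a, b):
    """Product of two 8x8 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def requantize_block(block, q):
    """Lossy core of JPEG on one block: DCT, quantize, dequantize, IDCT."""
    coef = _mul(_mul(C, block), CT)
    deq = [[round(coef[u][v] / q[u][v]) * q[u][v] for v in range(N)]
           for u in range(N)]
    return _mul(_mul(CT, deq), C)


def requantize_image(img, q, dy, dx):
    """Re-apply block quantization on a block grid offset by (dy, dx).

    Assumptions: image height and width are multiples of 8, and shifted
    blocks wrap around the image borders (circular shift).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for by in range(0, h, N):
        for bx in range(0, w, N):
            rows = [(by + dy + i) % h for i in range(N)]
            cols = [(bx + dx + j) % w for j in range(N)]
            rec = requantize_block(
                [[img[r][c] for c in cols] for r in rows], q)
            for i, r in enumerate(rows):
                for j, c in enumerate(cols):
                    out[r][c] = rec[i][j]
    return out


def postprocess(img, q, shifts):
    """Average the re-quantized images obtained for each grid shift."""
    h, w = len(img), len(img[0])
    acc = [[0.0] * w for _ in range(h)]
    for dy, dx in shifts:
        rec = requantize_image(img, q, dy, dx)
        for y in range(h):
            for x in range(w):
                acc[y][x] += rec[y][x]
    return [[v / len(shifts) for v in row] for row in acc]


# All 64 possible offsets of an 8x8 block grid.
ALL_SHIFTS = [(dy, dx) for dy in range(N) for dx in range(N)]
```

A call such as `postprocess(decoded, q, ALL_SHIFTS)` would then return the averaged image, where `decoded` is the decompressed image and `q` is the quantization matrix used for re-compression; both names are placeholders for this sketch.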
Implementation Issues:

Our algorithm, as shown in the figure above, has the advantage that no new software or hardware is required for its implementation. A JPEG encoder and decoder are all one needs to run quick experiments. However, the computational complexity of the algorithm can be reduced significantly (by a factor of more than 100) by looking more closely at its internal structure.

The first step in reducing the complexity is to remove the lossless parts of JPEG. Since each JPEG encoder in the algorithm is immediately followed by a decoder, these lossless parts provide no real functionality. In particular, the zig-zag scan and the run-length/Huffman coding, as well as all functions related to bitstream syntax and I/O, can be stripped away, resulting in significant computational savings. Essentially, the only parts of JPEG needed are the linear transform (DCT) and the quantization. See the figure below.
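To illustrate why the lossless stages contribute nothing when an encoder is immediately followed by a decoder, here is a small sketch of a simplified run-length coder for a zig-zag ordered coefficient sequence. It is not the JPEG entropy coder: there are no Huffman tables, no limit on the run length, and no separate handling of the DC coefficient. The function names and the end-of-block marker are assumptions chosen for the example.

```python
EOB = (0, 0)  # end-of-block marker: all remaining coefficients are zero


def rle_encode(seq):
    """Encode a coefficient sequence as (zero_run, value) pairs plus EOB."""
    pairs, run = [], 0
    for v in seq:
        if v == 0:
            run += 1  # extend the current run of zeros
        else:
            pairs.append((run, v))  # zeros skipped, then a nonzero value
            run = 0
    pairs.append(EOB)  # trailing zeros are implied by the marker
    return pairs


def rle_decode(pairs, length=64):
    """Invert rle_encode, padding with zeros up to `length` coefficients."""
    seq = []
    for run, v in pairs:
        if (run, v) == EOB:
            break
        seq.extend([0] * run)
        seq.append(v)
    seq.extend([0] * (length - len(seq)))
    return seq


# A typical sparse block after the zig-zag scan: a few leading nonzeros.
coeffs = [13, -2, 0, 0, 3] + [0] * 59
restored = rle_decode(rle_encode(coeffs))
```

Because decoding restores the quantized coefficients exactly, running the coder and its inverse back to back leaves the data unchanged, so only the transform and quantization affect the result.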
Further simplifications follow from the observation that not all 64 possible shifts are necessary in practice. Using an embedded quincunx sampling lattice for the shifts, it is possible to capture almost all the functionality of the postprocessor with somewhere between 8 and 16 shifts. Furthermore, the zero shift need not be computed: JPEG is idempotent, so re-applying it on the original block grid returns the decompressed image unchanged, and we assume that the image has to be decompressed anyway. For a discussion of these issues see [3].

Some of my work in this area is represented in the references below. The results in reference [2] are better than any previously reported post-processing result. This reference also includes a fairly comprehensive bibliography of previous work in this area.
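One way to picture the reduced shift sets mentioned above is as a nested family of lattices, each keeping half of the offsets of the previous one. The sketch below is an assumption about what such an embedded quincunx lattice could look like; the exact lattice is discussed in [3] and may differ. The function name `shift_lattice` and its `level` parameter are hypothetical.

```python
def shift_lattice(level, n=8):
    """Block-grid offsets left after `level` successive 2:1 subsamplings.

    Even-numbered steps keep a checkerboard (quincunx) pattern of the
    current grid; odd-numbered steps keep every other row and column of
    the grid before it, so each level is a subset of the previous one.
    """
    shifts = [(dy, dx) for dy in range(n) for dx in range(n)]
    step = 1
    for k in range(level):
        if k % 2 == 0:
            # Quincunx step: keep offsets whose sum is a multiple of 2*step.
            shifts = [(dy, dx) for dy, dx in shifts
                      if (dy + dx) % (2 * step) == 0]
        else:
            # Rectangular step: keep offsets on a grid twice as coarse.
            step *= 2
            shifts = [(dy, dx) for dy, dx in shifts
                      if dy % step == 0 and dx % step == 0]
    return shifts


# 64, 32, 16, and 8 shifts for levels 0 to 3.
sizes = [len(shift_lattice(level)) for level in range(4)]

# The zero shift reproduces the decoded image, so it can be taken from the
# decoder output instead of being recomputed.
to_compute = [s for s in shift_lattice(2) if s != (0, 0)]
```

With this nesting, levels 2 and 3 give the 16-shift and 8-shift sets, of which 15 and 7 actually need to be computed once the zero shift is taken from the decoded image.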
Software Release:

The source code of a software implementation of this algorithm for Linux and Solaris is available as a gzipped tar file. Please read the file README.txt in this release for compilation and usage instructions.

postsrc04.tar.gz (245k)

The software implementation of this algorithm, as well as the GUI development, was done by Nikhil Hegde, a graduate student at MCL.
Acknowledgments:

This work was supported in part by the grant CCR-9985171 from the National Science Foundation, and in part by a grant from the TxTEC consortium.
References:
Last modified February 2002