Data compression via alphabet partitioning and group partitioning

A data compression method, system and program code are provided which optimize entropy-coding by reducing complexity through alphabet partitioning, and then employing sample-group partitioning in order to maximize data compression on groups of source numbers. The approach is to employ probabilities to renumber source numbers so that smaller numbers correspond to more probable source numbers. This is followed by group partitioning of the stream of resultant numbers into at least two groups, for example, defining regions of an image, ranges of time, or a linked list of data elements related by spatial, temporal or spatio-temporal dependence. A maximum number (N_m) is found in a group of numbers of the at least two groups and then entropy-coded. A recursive entropy encoding of the numbers of the group is then employed using the maximum number N_m. The process is repeated for each partitioned group. Decoding of the resultant codewords involves the inverse process. Transformation and/or quantization may also be employed in combination with the group-partitioning entropy encoding.


Claims

1. A method for compressing data, said method comprising:

(a) employing probabilities to renumber a plurality of integer numbers representative of the data so that smaller numbers correspond to more probable integer numbers of said plurality of integer numbers and outputting a stream of numbers based upon said renumbering;
(b) grouping said stream of numbers into at least two groups;
(c) finding a maximum number (N_m) in a group from the at least two groups;
(d) entropy-coding the maximum number N_m;
(e) recursively encoding the numbers of the group using the maximum number N_m; and
(f) repeating said steps (c)-(e) for each group of said at least two groups, thereby providing compressed data.
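
Steps (a) through (c) can be illustrated with a minimal Python sketch. It rests on assumptions that the claim does not fix: the probabilities are estimated from empirical symbol counts, the groups are fixed-size runs of the stream, and the helper names (`build_renumbering`, `renumber`, `split_into_groups`) are hypothetical. Entropy coding itself is left out.

```python
from collections import Counter


def build_renumbering(symbols):
    """Rank distinct symbols: 0 for the most frequent, 1 for the next, and so on."""
    counts = Counter(symbols)
    # Sort by descending count; ties are broken by symbol value so the map is deterministic.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    forward = {s: rank for rank, s in enumerate(ordered)}
    return forward, ordered  # ordered[rank] gives back the original symbol


def renumber(symbols, forward):
    """Step (a): replace each symbol by its probability rank."""
    return [forward[s] for s in symbols]


def split_into_groups(stream, group_size):
    """Step (b): fixed-size groups (an assumption; any grouping rule could be used)."""
    return [stream[i:i + group_size] for i in range(0, len(stream), group_size)]


data = [7, 7, 3, 7, 9, 3, 7, 7]
forward, inverse = build_renumbering(data)
stream = renumber(data, forward)
groups = split_into_groups(stream, 4)
maxima = [max(g) for g in groups]  # step (c): one maximum per group

print(stream)  # [0, 0, 1, 0, 2, 1, 0, 0]
print(maxima)  # [1, 2]
```

Because frequent symbols receive small ranks, group maxima tend to be small, which is what the later recursive step relies on. A decoder would need the same `inverse` table to undo the renumbering.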

2. The method of claim 1, wherein said recursively encoding step (e) comprises:

(i) partitioning the group of numbers into nonempty subgroups of numbers, and creating a subordinate group (S-group) list containing said subgroups;
(ii) determining a maximum number in a subgroup;
(iii) entropy-coding the maximum number in the subgroup with an (N_m + 1) size alphabet; and
(iv) repeating said steps (i)-(iii) for each subgroup in the S-group list.

3. The method of claim 2, wherein said repeating step (iv) includes determining whether the maximum number in the subgroup is zero and if so, selecting a next subgroup from the S-group list, and if the maximum number in the subgroup is greater than zero, determining whether the subgroup has only one constituent and if so, selecting a next subgroup from the S-group list.
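
The recursion of claims 2 and 3 can be illustrated with a minimal Python sketch. It assumes each group is split into two halves (the claims require only nonempty subgroups) and, instead of running a real entropy coder, it records each value to be coded together with the alphabet size available for it. The function name `encode_group` and the output format are hypothetical.

```python
def encode_group(group):
    """Return (value, alphabet_size) pairs that describe a nonempty group."""
    group_max = max(group)
    # The alphabet for the top-level maximum depends on the source, so it is left open.
    out = [(group_max, None)]
    s_group_list = [(list(group), group_max)]  # (numbers, already-known maximum)
    while s_group_list:
        numbers, known_max = s_group_list.pop(0)
        # Claim 3: nothing more to send if the maximum is zero (all numbers are zero)
        # or if there is a single constituent (it equals its own maximum).
        if known_max == 0 or len(numbers) == 1:
            continue
        mid = len(numbers) // 2
        for sub in (numbers[:mid], numbers[mid:]):
            sub_max = max(sub)
            # A subgroup maximum lies in 0..known_max: an alphabet of known_max + 1 symbols.
            out.append((sub_max, known_max + 1))
            s_group_list.append((sub, sub_max))
    return out


print(encode_group([0, 0, 1, 0]))
# [(1, None), (0, 2), (1, 2), (1, 2), (0, 2)]
print(encode_group([0, 0, 0, 0]))
# [(0, None)]
```

An all-zero group costs a single symbol in this sketch, and small maxima keep every later alphabet small.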

4. The method of claim 1, further comprising alphabet partitioning the plurality of integer numbers into set numbers and set indices, and separating said set numbers and said set indices into two data streams, and wherein said renumbering step (a) comprises employing probabilities to renumber said set numbers for encoding in steps (b)-(f) and outputting as codewords.

5. The method of claim 4, further comprising encoding said set indices in binary form, and wherein said method is in combination with a method for processing said codewords, said processing method comprising transmitting or storing said codewords and said set indices, and thereafter decoding said codewords by employing group-partitioning entropy-decoding.
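
The split into set numbers and set indices can be illustrated with a minimal Python sketch. The particular partition is an assumption, since the claims do not fix one: nonnegative integers are placed in sets {0}, {1}, {2, 3}, {4..7}, and so on, so the set number is the bit length of the value and the set index is written in raw binary. The names `split_value` and `join_value` are hypothetical.

```python
def split_value(value):
    """Return (set_number, set_index, index_bits) for a nonnegative integer."""
    set_number = value.bit_length()
    if set_number <= 1:
        # The sets {0} and {1} hold one element each, so no index bits are needed.
        return set_number, 0, 0
    index_bits = set_number - 1
    set_index = value - (1 << index_bits)  # offset inside the set
    return set_number, set_index, index_bits


def join_value(set_number, set_index):
    """Recombine a set number and a set index into the original value."""
    if set_number <= 1:
        return set_number
    return (1 << (set_number - 1)) + set_index


values = [0, 5, 1, 12]
set_numbers = [split_value(v)[0] for v in values]   # stream to be entropy-coded
set_indices = [split_value(v)[1:] for v in values]  # stream sent as raw bits

print(set_numbers)  # [0, 3, 1, 4]
print(set_indices)  # [(0, 0), (1, 2), (0, 0), (4, 3)]
```

Only the set numbers go through the entropy coder, so its alphabet grows with the logarithm of the value range in this sketch, not with the range itself.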

6. The method of claim 5, wherein said group-partitioning entropy-decoding comprises receiving the codewords and entropy-decoding a maximum number N_m of a group of codewords, then decoding the numbers in the group using the maximum number N_m.
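
The decoding direction can be illustrated with a minimal Python sketch. It assumes a hypothetical encoder that sends the group maximum first and then, breadth-first, the maxima of the two halves of every group whose maximum is nonzero and which has more than one member. It also assumes the decoder knows the group length and that entropy decoding has already produced the list of values. The name `decode_group` is hypothetical.

```python
def decode_group(symbols, length):
    """Rebuild a group of `length` numbers from decoded maxima."""
    it = iter(symbols)
    result = [0] * length
    pending = [(0, length, next(it))]  # (start, stop, known maximum of that range)
    while pending:
        start, stop, known_max = pending.pop(0)
        if known_max == 0:
            continue  # the whole range is zero, as already initialised
        if stop - start == 1:
            result[start] = known_max  # a single number equals its own maximum
            continue
        mid = start + (stop - start) // 2
        # Read the two subgroup maxima in the order the assumed encoder wrote them.
        for lo, hi in ((start, mid), (mid, stop)):
            pending.append((lo, hi, next(it)))
    return result


print(decode_group([1, 0, 1, 1, 0], 4))  # [0, 0, 1, 0]
print(decode_group([0], 4))              # [0, 0, 0, 0]
```

The decoder never reads a symbol for a range whose maximum is already known to be zero, mirroring the skip rule an encoder of this kind would apply.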

7. The method of claim 6, further comprising recovering the plurality of integer numbers subsequent to decoding of the codewords by undoing the renumbering thereof of step (a).

8. The method of claim 7, further comprising subsequent to said undoing the renumbering thereof, recombining the set numbers and the set indices into a single data stream.

9. The method of claim 4, further comprising performing at least one of transformation and quantization on the plurality of integer numbers prior to said alphabet partitioning.

10. The method of claim 1, further comprising performing at least one of transformation and quantization on the plurality of integer numbers prior to said renumbering step (a).

11. The method of claim 1, wherein steps (a)-(f) produce codewords comprising said encoded plurality of integer numbers, and said method is further in combination with a method for processing said codewords, said method for processing said codewords comprising transmitting or storing said codewords, and thereafter decoding the codewords by employing group-partitioning entropy-decoding.

12. The method of claim 11, wherein said group-partitioning entropy-decoding comprises receiving the codewords and entropy-decoding a maximum number N_m of a group of codewords, then decoding the numbers in the group using the maximum number N_m.

13. The method of claim 12, further comprising recovering the plurality of integers subsequent to decoding of the codewords by undoing the renumbering thereof of step (a).

14. The method of claim 1, wherein said grouping step (b) comprises partitioning the stream of numbers into at least two groups of numbers defining regions of an image, ranges of time, or a linked list of data elements related by spatial, temporal or spatio-temporal dependence.

15. A method for compressing data, said method comprising:

(a) grouping a plurality of integer numbers representative of the data into at least two groups;
(b) establishing a group list containing said at least two groups;
(c) identifying a maximum integer number (N_m) within the at least two groups and entropy coding said maximum integer number (N_m);
(d) if N_m > 0, then for each group in the group list
(i) divide the group into n subgroups, wherein n ≥ 2;
(ii) create a binary mask of n bits, each said bit corresponding to one subgroup of said n subgroups and comprising 1 if the corresponding subgroup's maximum integer number (N_ms) equals the maximum integer number (N_m), otherwise the bit is 0;
(iii) entropy coding the binary mask and entropy coding every subgroup maximum integer number (N_ms) which is less than the maximum integer number (N_m);
(iv) add to the group list each subgroup with more than one integer number and a subgroup maximum integer number (N_ms) > 0; and
(e) repeating said steps (c) & (d) for each group and subgroup in the group list until the group list is empty, thereby achieving encoding of said plurality of integer numbers and compressing of said data.
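
The mask-based loop of claim 15 can be illustrated with a minimal Python sketch for a single starting group. Several choices are assumptions: subgroups are contiguous and nearly equal in size, the group list is processed first-in first-out, and the output is a list of tagged values that a real implementation would pass to an entropy coder. The names `split_evenly` and `encode_with_masks` are hypothetical.

```python
def split_evenly(numbers, n):
    """Cut `numbers` into up to n contiguous, nonempty, nearly equal parts."""
    k = min(n, len(numbers))
    size, extra = divmod(len(numbers), k)
    parts, start = [], 0
    for i in range(k):
        stop = start + size + (1 if i < extra else 0)
        parts.append(numbers[start:stop])
        start = stop
    return parts


def encode_with_masks(group, n=2):
    """Return tagged symbols: ("max", value) or ("mask", bits)."""
    top = max(group)
    out = [("max", top)]  # step (c)
    group_list = [(list(group), top)] if top > 0 and len(group) > 1 else []
    while group_list:  # step (e): run until the list is empty
        numbers, group_max = group_list.pop(0)
        subgroups = split_evenly(numbers, n)               # step (d)(i)
        sub_maxima = [max(s) for s in subgroups]
        mask = tuple(int(m == group_max) for m in sub_maxima)  # step (d)(ii)
        out.append(("mask", mask))                         # step (d)(iii)
        out.extend(("max", m) for m in sub_maxima if m < group_max)
        for sub, m in zip(subgroups, sub_maxima):          # step (d)(iv)
            if len(sub) > 1 and m > 0:
                group_list.append((sub, m))
    return out


for symbol in encode_with_masks([3, 0, 0, 3, 1, 0, 0, 0]):
    print(symbol)
```

The mask saves sending a subgroup maximum whenever it equals the parent maximum; only the strictly smaller maxima are coded explicitly.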

16. The method of claim 15, wherein said plurality of integer numbers comprise a plurality of set numbers, and wherein said method further comprises obtaining said plurality of set numbers by alphabet partitioning a stream of source numbers into alphabet set numbers and alphabet set indexes, said alphabet set numbers comprising said set numbers to be encoded.

17. The method of claim 16, further comprising ordering said plurality of set numbers according to assigned probabilities, wherein an alphabet set number "0" indexes a most common alphabet set, a "1" indexes a second most common alphabet set, etcetera.

18. The method of claim 15, wherein said grouping step (a) comprises partitioning the plurality of integer numbers into at least two groups defining regions of an image, ranges of time, or a linked list of data elements related by spatial, temporal or spatio-temporal dependence.

19. The method of claim 15, wherein said entropy coding (d) (iii) includes aggregating said subgroup maximum integer numbers (N_ms) to create an extension and entropy-coding said extension.
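
One possible reading of "extension" is the usual source-coding sense: several small symbols are combined into one symbol of a larger alphabet and coded jointly. The Python sketch below illustrates that reading only and is an assumption about the claim, not a statement of the patented method. It packs values that each lie in 0..base-1 into a single integer; `aggregate` and `separate` are hypothetical names.

```python
def aggregate(values, base):
    """Pack values (each in 0..base-1) into one symbol of a base**len(values) alphabet."""
    symbol = 0
    for v in values:
        symbol = symbol * base + v
    return symbol


def separate(symbol, base, count):
    """Inverse of aggregate: recover `count` values from the packed symbol."""
    values = []
    for _ in range(count):
        symbol, v = divmod(symbol, base)
        values.append(v)
    return values[::-1]


packed = aggregate([2, 0, 1], base=3)
print(packed)                  # 19
print(separate(packed, 3, 3))  # [2, 0, 1]
```

Coding the packed symbol with one adaptive model lets an entropy coder exploit dependence among the subgroup maxima, at the cost of a larger alphabet.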

20. The method of claim 15, wherein said entropy-codings of steps (c) & (d) produce codewords, and wherein said method is further in combination with a method for processing said codewords, said processing method comprising transmitting or storing said codewords, and thereafter decoding said codewords by employing group-partitioning entropy-decoding.

21. The method of claim 20, wherein said group-partitioning entropy-decoding comprises receiving the codewords and entropy-decoding a maximum number N_m of a group of codewords, then decoding the integer numbers in the group using the maximum number N_m.

22. The method of claim 21, wherein said method further comprises obtaining said plurality of integer numbers by employing probabilities to renumber a plurality of source numbers so that smaller integer numbers correspond to more probable source numbers.

23. An encoder for encoding a plurality of integer numbers comprising:

(a) means for renumbering the plurality of integer numbers so that smaller numbers correspond to more probable integer numbers, and for outputting a stream of numbers based thereon;
(b) means for grouping the stream of numbers into at least two groups;
(c) means for finding a maximum number N_m in a group from the at least two groups;
(d) means for entropy-coding the maximum number N_m;
(e) means for recursively encoding the group of numbers using the maximum number N_m; and
(f) means for repeating, for each group of the at least two groups, said means for finding (c), said means for entropy-coding (d) and said means for recursively encoding (e), thereby producing codewords comprising said encoded plurality of integer numbers.

24. The encoder of claim 23, wherein said means for recursively encoding (e) comprises:

(i) means for partitioning the group of numbers into nonempty subgroups of numbers, and for creating a subordinate group (S-group) list containing said subgroups;
(ii) means for determining a maximum number (N_m) in a subgroup from the S-group list;
(iii) means for entropy-coding the maximum number in the subgroup with an (N_m + 1) size alphabet; and
(iv) means for repeating processing of said means for partitioning (i), said means for determining (ii), and said means for entropy-coding (iii) for each subgroup in the S-group list.

25. The encoder of claim 24, wherein said means for repeating said means for partitioning (i), said means for determining (ii), and said means for entropy-coding (iii) comprises means for selecting a next subgroup in the S-group list if the maximum number in a current subgroup is zero or if the current subgroup has only one constituent.

26. The encoder of claim 23, further comprising means for alphabet partitioning the plurality of integer numbers into set numbers and set indices, and for separating said set numbers and said set indices into two data streams, wherein said means for renumbering (a) comprises means for employing probabilities to renumber said set numbers.

27. The encoder of claim 23, further comprising means for performing at least one of transformation and quantization on the plurality of integer numbers before renumbering by said means for renumbering.

28. The encoder of claim 23, wherein said means for grouping comprises means for partitioning a stream of numbers into at least two groups of numbers defining regions of an image, ranges of time, or a linked list of data elements related by spatial, temporal or spatio-temporal dependence.

29. The encoder of claim 23, wherein said means for recursively encoding (e) comprises:

(i) means for partitioning the group of numbers into n nonempty subgroups of numbers, and for creating a subordinate group (S-group) list containing said subgroups;
(ii) means for employing a binary mask of n bits, each bit corresponding to one subgroup of said n subgroups and comprising 1 if the corresponding subgroup's maximum integer number (N_ms) equals the maximum integer number (N_m), otherwise the bit is 0;
(iii) means for entropy-coding the binary mask and entropy-coding every subgroup maximum integer number (N_ms) which is less than the maximum integer number (N_m); and
(iv) means for adding to the group list each subgroup with more than one integer number and a subgroup maximum integer number (N_ms) > 0.

30. A decoder for decoding codewords produced by the encoder of claim 23, said decoder comprising:

means for receiving said codewords;
means for entropy-decoding a maximum number N_m of a group of numbers represented by the codewords;
means for decoding integer numbers in the group using the maximum number N_m; and
means for repeating for each group of numbers represented by the codewords, said entropy-decoding of a maximum number of the group and said decoding of the integer numbers in the group using the maximum number N_m.

31. The decoder of claim 30, further comprising means for recovering the plurality of integer numbers subsequent to decoding of said codewords by undoing the renumbering thereof of said encoder.

32. The decoder of claim 31, wherein said codewords represent encoded set numbers and said means for receiving further comprises means for receiving a separate stream of set indices, and wherein said decoder further comprises means for recombining decoded set numbers and said set indices into a single data stream.

33. The decoder of claim 32, wherein said received codewords comprise encoded transformed data, and wherein said decoder further comprises means for inverse transforming said single data stream to obtain the plurality of integer numbers.

34. A computer system for processing source integer numbers comprising:

an encoder for encoding said source integer numbers to produce codewords;
a transmission/storage subsystem for accommodating said codewords;
a decoder for receiving said codewords from said transmission/storage subsystem and for decoding said codewords to reobtain said source integer numbers; and
(i) means for renumbering the source numbers so that smaller numbers correspond to more probable source numbers and for outputting a stream of numbers based thereon;
(ii) partition means for grouping said stream of numbers into at least two groups;
(iii) means for finding a maximum number (N_m) in a group of numbers from the at least two groups;
(iv) means for entropy-coding the maximum number N_m;
(v) means for recursively encoding the group of numbers using the maximum number N_m; and
(vi) means for repeating said means (iii)-(v).

35. The computer system of claim 34, wherein said means for recursively encoding (v) comprises:

(vii) means for partitioning the group of numbers into nonempty subgroups of numbers, and for creating a subordinate group (S-group) list containing said subgroups;
(viii) means for determining a maximum number in a subgroup;
(ix) means for entropy-coding the maximum number in the subgroup with an (N_m + 1) size alphabet; and
(x) means for repeating said means for partitioning the group into nonempty subgroups (vii), said means for determining (viii), and said means for entropy-coding (ix) for each subgroup in the S-group list.

36. The computer system of claim 35, further comprising means for alphabet partitioning the source integer numbers into set numbers and set indices, and for separating said set numbers and said set indices into two data streams, and wherein said means (i)-(vi) comprise means for encoding said set numbers to produce said codewords.

37. The computer system of claim 35, wherein said decoder comprises means for group-partitioning entropy-decoding a maximum number N_m of a group of codewords, and then for decoding the numbers in the group using the maximum number N_m.

38. The computer system of claim 37, wherein said decoder further comprises means for recovering the source integer numbers subsequent to decoding of the codewords, said means for recovering comprising means for undoing the renumbering of the source integer numbers by said means for renumbering (i) of said encoder.

39. A computer program product comprising a computer usable medium having computer readable program code means therein for use in compressing data in a computer system, said computer readable program code means in said computer program product comprising:

(i) computer readable program code means for renumbering a plurality of integer numbers representative of the data employing associated probabilities so that smaller numbers correspond to more probable integer numbers of said plurality of integer numbers and for outputting a stream of numbers based upon said renumbering;
(ii) computer readable code means for grouping said stream of numbers into at least two groups;
(iii) computer readable code means for finding a maximum number (N_m) in a group from the at least two groups;
(iv) computer readable code means for entropy-coding the maximum number N_m;
(v) computer readable program code means for recursively encoding the numbers of the group using the maximum number N_m; and
(vi) computer readable program code means for repeating processings of said computer readable code means (i)-(iv) for each group of the at least two groups.

40. The computer readable program code means of claim 39, further comprising computer readable program code means for alphabet partitioning the plurality of integer numbers into set numbers and set indices, and for separating said set numbers and said set indices into two data streams, and wherein said computer readable program code means (i) comprises computer readable code means for renumbering the set numbers for encoding by said computer readable program code means (ii)-(vi).

41. The computer readable program code means of claim 39, wherein said computer readable program code means (ii) comprises computer readable program code means for partitioning the stream of numbers into at least two groups of numbers defining regions of an image, ranges of time, or a linked list of data elements related by spatial, temporal or spatio-temporal dependence.

References Cited
U.S. Patent Documents
4075622 February 21, 1978 Lawrence et al.
4821119 April 11, 1989 Gharavi
5049881 September 17, 1991 Gibson et al.
5150209 September 22, 1992 Baker et al.
5271071 December 14, 1993 Waite
5377018 December 27, 1994 Rafferty
5535290 July 9, 1996 Allen
5539467 July 23, 1996 Song et al.
Foreign Patent Documents
2 293 734 April 1996 GBX
WO 95/14350 May 1995 WOX
Other references
  • G.K. Wallace, "The JPEG Still Picture Compression Standard", Comm. ACM, vol. 34, pp. 30-44, Apr. 1991.
  • S. Todd, G.G. Langdon, Jr., J. Rissanen, "Parameter reduction and context selection for compression of gray-scale images", IBM J. Res. Develop., vol. 29, No. 2, pp. 188-193, Mar. 1985.
  • S.D. Stearns, "Arithmetic Coding in Lossless Waveform Compression", IEEE Trans. Signal Processing, vol. 43, No. 8, pp. 1874-1879, Aug. 1995.
  • A. Said, W.A. Pearlman, "Reversible image compression via multiresolution representation and predictive coding", Proc. SPIE, vol. 2094: Visual Commun. and Image Processing, pp. 664-674, Nov. 1993.
  • I.H. Witten, R.M. Neal, J.G. Cleary, "Arithmetic Coding For Data Compression", Commun. ACM, vol. 30, No. 6, pp. 520-540, Jun. 1987.
  • W.B. Pennebaker, J.L. Mitchell, G.G. Langdon, Jr., R.B. Arps, "An overview of the basic principles of the Q-Coder adaptive binary arithmetic coder", IBM J. Res. Develop., vol. 32, No. 6, pp. 717-726, Nov. 1988.
  • A. Said, W.A. Pearlman, "Reduced-Complexity Waveform Coding via Alphabet Partitioning", IEEE Int. Symposium on Information Theory, Whistler, B.C., Canada, p. 373, Sep. 1995.
  • A. Said, W.A. Pearlman, "A New, Fast, and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees", IEEE Trans. on Circuits and Systems for Video Technology, vol. 6, No. 3, pp. 243-250, Jun. 1996.
  • A. Said, W.A. Pearlman, "An Image Multiresolution Representation for Lossless and Lossy Compression", IEEE Trans. on Image Processing, vol. 5, No. 9, pp. 1303-1310, Sep. 1996.
Patent History
Patent number: 5959560
Type: Grant
Filed: Feb 7, 1997
Date of Patent: Sep 28, 1999
Inventors: Amir Said (Cupertino, CA), William A. Pearlman (Niskayuna, NY)
Primary Examiner: Brian Young
Law Firm: Heslin & Rothenberg, P.C.
Application Number: 8/796,961
Classifications
Current U.S. Class: To Or From Code Based On Probability (341/107)
International Classification: H03M 7/00