Sonic Boom: System For Reducing The Digital Footprint Of Data Streams Through Lossless Scalable Binary Substitution
Because all digital data streams are composed of randomly-distributed zeros (0s) and ones (1s) called bits, it can be posited that all arbitrary-length binary data sets having a finite magnitude can be distilled into numerically-precise integers that accurately represent the value of every individual bit within the set. Mathematically, once a data stream's bit structure has been analyzed, the exact combination of its uniquely-assembled bits, its “digital footprint”, can be perfectly replicated simply by calculating the numerical value of each consecutive bit to produce a decimal sum equal to the value of the entire stream. This universal data compression technique is called “SCALABLE BINARY SUBSTITUTION” because the functional objective of the scheme is to analyze the digital footprint of a source data stream, regardless of its magnitude, and substitute the entirety of its encoded information for a simple math expression: Absolutely lossless data compression through mathematically-precise substitution.
This Non-Provisional Patent Application claims priority to and the benefit of the filing date of U.S. Provisional Patent Application No. 62/495,056, filed on Sep. 1, 2016, entitled “SONIC BOOM: SYSTEM FOR REDUCING THE DIGITAL FOOTPRINT OF DATA STREAMS THROUGH LOSSLESS SCALABLE BINARY SUBSTITUTION”, which is hereby incorporated by reference in its entirety.
BACKGROUND OF THE INVENTION

(1) Field of the Invention

This patent specification relates to novel data compression schemes and various algorithmic frameworks for reducing the size of digital source data. More particularly, this non-provisional patent specification describes a universally-scalable platform/protocol/format-agnostic data substitution scheme that is capable of compressing any mode of digitized source data by output ratios that are approximate to the inverse magnitude of the source data being converted while maintaining 100.000% lossless data retention.
(2) Background Art

Since the advent of data compression science, there have been countless technologies, strategies, and novel techniques for compressing data that, regardless of their intended source application, can be categorized as either “lossless” or “lossy” techniques. LOSSY data compression techniques utilize many different strategies to create smaller output files by discarding (losing) a quantifiable amount of material information contained in the original source data. Conversely, LOSSLESS data compression is a class of data compression techniques and specific algorithms that allow all the original source data to be perfectly reconstructed from the compressed data. Lossless compression is used in cases where it is imperative that the original source data and the decompressed output data be identical. Typical examples of source data where maintaining perfect data integrity would be preferable are software programs, text documents, and machine-executable source code.
Most forms of lossless compression techniques are based on establishing some manner of statistical model for the input (source) data. This type of modeling is used to map input data to specific binary bit sequences in a manner that identifies the instance of “frequently encountered” data. These binary sequence frequency maps are then used as a template for the construction of a symbolic data substitution scheme so that a smaller amount of output data can be produced to represent the original source material.
- GENERAL PURPOSE examples of lossless statistical modeling algorithms for text or other text-like binary data such as machine-readable executable programs include:
- Run-Length Encoding (RLE)—a simple scheme that compresses data containing numerous instances of the same bit sequences (a minimal sketch of this scheme follows these lists)
- Burrows-Wheeler transform—block sorting preprocessing that increases compression efficiency
- Lempel-Ziv-Welch (LZW)—used by the Graphics Interchange Format (.GIF); the related LZ77-based DEFLATE algorithm is used by the Portable Network Graphics (.PNG) format
- Statistical Lempel-Ziv algorithm—a combination of statistical method and dictionary/library-based method of data compression
- AUDIO compression algorithms include:
- Free Lossless Audio Codec (FLAC)
- Apple Lossless Audio Codec (ALAC)
- Windows Media Audio Lossless (WMA Lossless)
- Audio Lossless Coding Formats (MPEG-4 ALS/SLS)
- GRAPHICS and IMAGE compression algorithms include:
- JPEG-LS (Lossless/near-lossless graphic compression standard)
- JPEG-2000
- Portable Network Graphics (.PNG)
- Tagged Image File Format (.TIFF)
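As referenced in the first list above, Run-Length Encoding is the simplest of these techniques. The following is a minimal Python sketch of a byte-oriented RLE codec, offered for illustration only; the function names are hypothetical and do not correspond to any standardized implementation:

```python
def rle_encode(data: bytes) -> bytes:
    """Emit (count, value) byte pairs; runs are capped at 255 so the
    count always fits in a single byte."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Inverse of rle_encode: expand each (count, value) pair."""
    out = bytearray()
    for count, value in zip(encoded[::2], encoded[1::2]):
        out += bytes([value]) * count
    return bytes(out)


sample = b"\x00" * 40 + b"\xFF" * 10
assert rle_decode(rle_encode(sample)) == sample  # lossless round trip
```

As the round trip shows, RLE is lossless, but it only shrinks data that actually contains long runs; on run-free input it doubles the size, which is precisely the type-sensitivity limitation discussed below.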
Data compression methods may be categorized according to the type of source data they are designed to compress. While, in principle, any “general-purpose” lossless compression algorithm (general purpose meaning that it can accept any bit string) can be used on any type of data, many are unable to achieve significant compression on data that are not of the form they were designed to compress. To address this technical limitation, the inventor has innovated a lossless data compression method/algorithmic substitution framework that is specifically designed to be universally-applicable to any type of data that is constructed from binary data sets of arbitrary length. This type of universal data compression system is called SCALABLE BINARY SUBSTITUTION (or “SBS”) because any quantifiable measure of data compression that is produced as a result of its application to source data can be attributed to the robust data substitution function for which it was primarily designed. The SBS algorithm compresses data by manipulating the properties of binary data sets in a specific manner that is distinct from all other “traditional” data compression methods, be they lossy or lossless, as well as from other statistical modeling methods that substitute source data for other symbolic output data from a dictionary, library, index, data tree, or map.
It is well known that there are numerous other algorithms that manipulate the binary properties of source data, but these other systems primarily do so in order to affect the output attributes of the finished product. These types of systems can be categorized by the many algorithms that are used in technical processes such as Error Control, Error Correction, and Error Detection. These algorithms function by using sophisticated and highly-complex mathematical formulae to interpret, process, and convert the wave-like properties of sounds, sample frequencies, and transmission signals into output binary data streams for minute quality improvements. The SBS algorithm offers a significant and unique advantage compared to other algorithms that manipulate binary source bits because it achieves a level of material data compression by applying the fundamental properties of binary-to-decimal arithmetic as the structural basis for its dynamic data substitution function. By utilizing the structural stability of basic mathematical operations, the SBS algorithm is able to effect absolutely lossless data compression by directly calculating the decimal value of source bits relative to their positions within arbitrary-length data sets so that their sum can be elegantly substituted for simplified numerical expressions of equivalent value. This exploitation of binary-to-decimal arithmetic makes the SBS algorithm unique in that it elevates the interpretation of binary source bits beyond mere symbols to be manipulated via fundamental statistical substitution and the various techniques of symbolic (bit) compression, correction, deletion, and interpolation.
- Examples of some (“symbolic bit manipulation”) algorithms and techniques are:
- Forward Error Correction (FEC)—used to correct bit-level errors
- Cyclic Redundancy Check (CRC)
- Checksum—used to delete bit-level errors
- Context-based Adaptive Binary Arithmetic Coding (CABAC)
- Motion Picture Experts Group—Fine Granularity Scalability (MPEG-4 FGS)
- High-Performance Low-Complexity Bit-Plane Coding Scheme for MPEG-4 FGS
- Variable-Length Coding (VLC)
- Run-Length Coding
- Bit Framing
- Bit Stuffing
Two highly-specific examples of algorithm schemes that share functional similarities with the SBS algorithm are Context-Based Adaptive Binary Arithmetic Coding (CABAC) and the High-Performance Low-Complexity Bit-Plane Coding Scheme for MPEG-4 FGS. These two well-known algorithms have been specifically chosen as technical counter-examples because, like SBS, they both utilize basic mathematical operations and numerical calculations to accomplish their intended functions. These two algorithms interpret the individual bits of source data sets by calculating their binary values, but, most notably, their algorithmic processes are materially divergent because the numerical sums of those bits are represented in a purely binary format. Precisely stated, while these other two algorithms use very basic binary-to-binary arithmetic, only SBS exploits binary-to-decimal arithmetic to interpret the informational value of source data sets, and only SBS exclusively represents the calculated numerical sum of those source bits in decimal format. The greatest advantage of the SBS algorithm's use of decimal-based integers to represent the calculated sum of source bits is that only decimals allow extremely large magnitudes (i.e., many consecutive digits) to be easily substituted for elegant and compactly-written mathematical expressions. If any such sum-derived decimal expression were solely confined to binary symbols, it would be extremely inefficient to machine-interpret, or even to accurately represent, in any viable alternative symbolic substitution medium. While there may be other material functional similarities between these two algorithm counter-examples and SBS, the principal distinction clarified herein is the single most technically noteworthy divergence among their collective binary arithmetic utilization methodologies. It is this proprietary functional characteristic of the SBS algorithm scheme that presents a materially unique, intellectually defensible, and technically patentable innovation.
BRIEF SUMMARY OF THE INVENTION

In this non-provisional patent submission, we present an innovative data substitution scheme called SCALABLE BINARY SUBSTITUTION (or “SBS”) whose core function is to calculate and convert the numerical value of binary source bits into their equivalent decimal sum that can then be substituted for a numerically-precise, compactly-written, mathematical expression. It offers a significant advantage compared to other algorithms that manipulate the binary properties of data sets because the SBS algorithm functions by interpreting the value of source bits beyond mere symbols to be manipulated via traditional compression and statistical substitution techniques. Because the very nature of mathematics is absolutely limitless in its breadth and scope, the SBS algorithm is exclusively designed to leverage the deceptively-simple, yet infinitely-dynamic rules of binary-to-decimal arithmetic in order to transcend the entire paradigm in which digital data streams are used to encode information. Specifically, once the structure of a data stream has been analyzed, the exact combination of its uniquely-assembled bits, its “digital footprint”, can be perfectly replicated simply by calculating the numerical value of each consecutive bit relative to its given position to produce a decimal sum equal to the value of the entire stream. By exploiting the immutable properties of arithmetic to replicate and flawlessly compress the digital footprint of data streams, SBS presents a new methodology that redefines the fundamental logic of how binary data sets are interpreted to represent machine-readable data. In this non-provisional patent submission, we will clearly demonstrate the conceptual viability of the SBS scheme to realize a structurally-sound and perfectly-reversible technique of facilitating universally-scalable data compression that maintains 100.000% perfect lossless data retention.
The SBS scheme presents a mathematically-irrefutable means of efficiently managing vast amounts of binary data with levels of scalability, precision, and portability that can be useful to academic, technological, and global commercial channels. The technological paradigm shift towards advanced network virtualization, software-defined high-speed networks, and artificially-smart quantitative/predictive analytics creates the need for enhanced tools to enable the seamless integration and secure synchronization of disparate platforms. Furthermore, the mass proliferation of internet-connected appliances of every flavor has created sub-optimal levels of global network congestion that have inadvertently exposed critical design flaws in the internet's fabric unforeseen by the architects of the world's digital communication infrastructure. The primary focus of SBS is to present a universally-scalable algorithmic framework for compressing and storing binary data sets beyond all previous magnitudes to facilitate next-generation digital signal propagation and ultra-high-speed frictionless communication across the digital domains of the future. The proprietary concepts outlined in this patent submission are meant to vividly articulate the fundamental logic behind the scalable binary substitution (SBS) strategy that will help the digital world accomplish its most ambitious and important technological goals.
Some embodiments and mathematical formulae of the present invention are illustrated as examples of the working process by which the SBS system functions:
FIG. [1.0]: Sample binary string of bits illustrating their numerical (decimal) value in relation to their respective bit positions within the string.
FIG. [2.0]: “The SBS (Scalable Binary Substitution) Algorithm.”
FIG. [3.0]: Sample 10-byte (80-bit) source data set.
FIG. [3.1]: Mathematical calculation of 10-byte (80-bit) source data set as shown in [FIG. 3.0].
FIG. [3.2.1]: Sample 8-byte (64-bit) source data set.
In the modern digital world, millions and billions of source bits are assembled to create most commonly used data sets like software programs, multimedia files, games, and digital communication signals. To increase the utility of digital data, there have been many innovations in the art of data compression that are based upon as many different strategies, frameworks, and methodologies as there are hardware and software systems that utilize such data. Most data compression techniques are based upon condensing source data by deleting a material amount of information or by substituting source data for an alternative symbolic representation.
Compressing data streams by calculating the value of their consecutive bits produces sums that can often be millions of digits in length. This is because, according to the mathematical nature of adding the individual bits of an arbitrary-length binary data set, the numerical value of any given bit in a stream is exactly double the magnitude of the bit that directly precedes its position, and exactly one-half the magnitude of the bit that follows it.
To illustrate this point, the bits of a sample data set are examined in [FIG. 1.0] beginning with a random bit that is found in the Nth position of a hypothetical data stream. Additionally, the numerical value that is assigned to this Nth bit is thirty-two (32). When the individual bits shown in [FIG. 1.0] are calculated by directly adding the numerical value of each successive bit (or, conversely, subtracting the value of each bit from the mean deviation of the highest-order bit's value), their combined sum is 2,106 which, in the decimal system, requires 4 digits to represent that specific magnitude value.
Computers perform mathematical calculations by combining the logical operations performed by their logic gates to compute the necessary additions, subtractions, multiplications, etc., and arrive at a precise answer. The sequence of logical operations used to perform a particular calculation or specific predetermined function is called an algorithm. If computational resources are not a concern, calculating the numerical value of the assembled bits in a source data set and representing the combined sum as a whole decimal value is trivial from an algorithmic perspective. Successively adding a data stream's bit values to an accumulator that is initialized to zero (0), and incrementing it by the value of each successive bit up to the Nth bit (if any), computes the sequence of running totals {0, 1, . . . , N}, provided that the necessary computing functions do not exceed the limits of the available CPU hardware and the output decimal representation fits into an allocated memory source.
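A minimal Python sketch of this successive-addition process follows, assuming the bit-value convention described above (the first bit is worth one, and each subsequent bit is worth double the bit that precedes it); the function name is illustrative:

```python
def stream_to_decimal_sum(bits: str) -> int:
    """Add the positional value of each bit: bit 1 is worth 1, and every
    later bit is worth exactly double the bit before it. Python integers
    are arbitrary-precision, so the running total never overflows."""
    total = 0
    value = 1
    for bit in bits:
        if bit == "1":
            total += value
        value *= 2  # the next position is worth twice as much
    return total


print(stream_to_decimal_sum("101"))     # 1 + 0 + 4 = 5
print(stream_to_decimal_sum("1" * 12))  # 4,095 (i.e., 2**12 - 1)
```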
To explain how this process would apply to a real-world paradigm, we will examine one of the most commonly encountered binary data sets of the modern computer age: the digital music file. Given that the average 4-minute music file (an .MP3 song, for instance) is approximately 4.0 Megabytes (“MB”) in size, this means that there are 4,194,304 bytes in the file. A byte is defined as a unit of computer information or extensible data storage capacity that consists of a discrete group of 8 bits and that is used especially to represent an alphanumeric character (i.e.: letters, numbers, symbols, etc.). Because a byte is made up of 8 bits, this means that a 4.0 MB music file contains 33,554,432 individually-assembled bits. When these 33 million bits are consecutively added together, the resulting decimal sum can be up to approximately 10 million digits long.
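The digit count quoted above can be checked with elementary logarithms; a short sketch, assuming the maximum possible sum for a stream of this length:

```python
import math

bytes_in_file = 4 * 1024 * 1024    # 4.0 MB
bits_in_file = bytes_in_file * 8   # 33,554,432 bits
# The largest possible sum of these bits is 2**bits_in_file - 1, whose
# decimal representation has floor(bits * log10(2)) + 1 digits.
digits = math.floor(bits_in_file * math.log10(2)) + 1
print(f"{bits_in_file:,} bits -> up to {digits:,} decimal digits")
# 33,554,432 bits -> up to 10,100,891 decimal digits
```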
In the realm of computer science, when these metrics are considered in terms of data compression, consecutively adding a data stream's bits in order to calculate the numerical value of the entire stream does not, in itself, produce any compression of the original size of the stream. Statistically speaking, a zero net compression ratio (1:1) is produced as a result of this basic process. In fact, in certain instances, negative compression ratios can result from converting binary values into their equivalent decimal values. The fundamental logic of the SBS scheme is to realize superior and absolutely lossless levels of compression by using dynamic mathematical utilities to express a data stream's combined decimal sum in its most elegant, precise, and highly-abbreviated form. By using robust math tools such as square and cube roots, high-powered exponentials, factorials, and other algebraic and calculus functions, the information contained within entire data streams, indeed oceans of data, can be flawlessly substituted for extremely compact and mathematically-precise expressions called “Kinetic Data Primers” (or “KDPs”). A KDP is, essentially, a basic set of mathematical instructions that, upon algorithmic calculation, is designed to yield precise decimal sums that can be easily converted into a linear sequence of equivalent-value binary bits.
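The specification does not prescribe a particular search procedure for deriving such expressions. The following sketch shows one naive, illustrative way a decimal sum might be tested for an exact base-and-exponent form; the helper name find_power_kdp is hypothetical, and real sums would generally require the multi-primer handling described in Section 3.2:

```python
def find_power_kdp(n):
    """Try to express n exactly as base**exp with exp >= 2; return a
    compact 'base^exp' string, or None if no exact form exists."""
    for exp in range(n.bit_length(), 1, -1):
        # Integer exp-th root of n via binary search.
        lo, hi = 1, 1 << (n.bit_length() // exp + 1)
        while lo < hi:
            mid = (lo + hi) // 2
            if mid ** exp < n:
                lo = mid + 1
            else:
                hi = mid
        if lo ** exp == n:
            return f"{lo}^{exp}"
    return None


print(find_power_kdp(1_977_326_743))  # '7^11'
print(find_power_kdp(1_977_326_744))  # None (no exact power form)
```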
To illustrate how calculating a data set's bits can produce extremely large decimal numbers, and how such numbers can be simply expressed as mathematically-perfect KDPs, the following illustration is a graphical interpretation of a relatively small 64-byte data set. For perspective, given that the size of a common text message (i.e.: a “tweet” on the Twitter service) is limited to 140 characters, which would require 140 bytes of uncompressed data to represent those characters, 64 bytes is roughly half that size.
A 64-character text message produces a binary data set of this size:
- 1110101111010000111010111100001111010100010110011001101100 01000101001101011010110101010110101010110101011001101011101 01010101111011010101011010111110111001010011110111100101111 01101011101011000111010001010001101011010101010101011001101 01010101010101010111101010101011111101010101010010101010101 10011011101011010010110101101010101010101011001101010010101 01111010101010010110101010010111110110101010101010101010101 01010101010010101010101001010111110101101010101011101010101 0010101010101111111000001010101001010111
- when the numerical values of these bits are consecutively added together at their maximum possible magnitudes (i.e., as if every bit in the set were a binary 1), they produce the decimal sum:
- 13,407,807,929,942,597,099,574,024,998,205,846,127,479,365,820,592,393,377,723,561,443,721,764,030,073,546,976,801,874,298,166,903,427,690,031,858,186,486,050,853,753,882,811,946,569,946,433,649,006,084,095 [1]
[1] This specific integer represents the precise decimal sum produced by successively adding all 512 bits of a 64-byte binary data set, provided, of course, that each bit in the set yielded its maximum possible numerical value relative to its position within the set (i.e.: if every bit in the data set were calculated as a binary one (1)); that maximum sum is 2^512−1, one less than 2^512. Demonstrating the functionality of the SBS-KDP methodology by reducing a 155-digit integer into a numerically-equivalent (exponentially-powered) 7-character KDP is used herein only to show the maximum mathematically-achievable algorithmic efficiency of the SBS scheme by exploiting the structural stability of binary-to-decimal arithmetic to manipulate binary source data sets in proprietary ways.
- which, in SBS format, can be precisely expressed as a Kinetic Data Primer as elegant and compact as:
- “2^512−1”
- (Two to the five-hundred-twelfth power, minus one).
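This maximum-sum arithmetic can be reproduced in any arbitrary-precision environment; a brief Python check:

```python
# Sum the positional values of 512 bits that are all 1s:
# 1 + 2 + 4 + ... + 2**511, a geometric series equal to 2**512 - 1.
max_sum = sum(2 ** i for i in range(512))
assert max_sum == 2 ** 512 - 1
print(len(str(max_sum)))  # 155 decimal digits, as stated in footnote [1]
```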
In the case of the 4.0 MB music file mentioned herein, the 10 million-digit-long decimal number that is produced by successively adding its 33 million source bits can be profoundly reduced by expressing its numerical sum in a more elegant, yet mathematically-precise way. For example, the numerical value of a 10 million-digit-long decimal number can be accurately expressed as a KDP as compactly-written as:
- “1560000^1560000”
- (One million five hundred sixty thousand, to the one-million-five-hundred-sixty-thousandth power).
When a Kinetic Data Primer of this magnitude is calculated, it will produce a decimal sum approximately 10 million digits in length. This 10 million-digit-long decimal number can then, in turn, be converted back into its precise binary equivalent which, in the methodology of the SBS substitution scheme, would serve to perfectly reconstruct the digital footprint (i.e.: bit type and exact position) of all 33 million bits in the original 4.0 MB source data set.
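The reconstruction (decode) direction described in this paragraph can be sketched for a single-term exponential KDP as follows; the helper name is hypothetical, and the least-significant-bit-first ordering is an assumption consistent with the bit-value convention used throughout this specification:

```python
def kdp_to_bits(base: int, exp: int) -> str:
    """Decode a single-term exponential KDP back into a bit stream: the
    decimal sum base**exp is recomputed exactly, then rendered in binary
    and reversed so the first bit is the one worth +1."""
    decimal_sum = base ** exp          # exact, arbitrary-precision
    return bin(decimal_sum)[2:][::-1]  # strip '0b', reverse to LSB-first


# A small example: the KDP 2^16 decodes to a 17-bit stream whose only
# 1-bit sits in the 17th position (worth 65,536).
assert kdp_to_bits(2, 16) == "0" * 16 + "1"
```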
The ultimate utility of the SBS scheme can be found in the sheer economy of data used to substitute the exact numerical value of astronomically-large source-calculated sums: Encoding an arbitrary mathematical expression such as “1560000^1560000” into a machine-readable format would only require 50 bits of data (less than 7 bytes).[2] In general terms of data compression, encoding the binary information contained in a 4,194,304-byte (4.0 MB) source file into an SBS-KDP as infinitesimally compact as seven (7) bytes would mathematically indicate a baseline output compression ratio of 599,186:1, which is the net compression yield of 4,194,304 bytes reduced to 7 bytes (0.007 KB). For technical perspective, the current state-of-the-art in commercial-grade audio media compression techniques only produces average output compression ratios of less than 100:1. [2] The Kinetic Data Primer size variable of seven (7) bytes represents the 50 bits of data needed to encode the mathematical expression “1560000^1560000” into its KDP format. These 7 KDP bytes consist of the 21 bits of data needed to represent the base decimal magnitude (1,560,000), the 21 bits needed to represent its exponential power magnitude (1,560,000), and the 8 bits of data needed to represent the ASCII symbol (^) used to signify a base number's exponential value, for a total of 50 bits. The 50 bits of data needed to express the KDP “1560000^1560000”, for example, can be encoded within 7 bytes because, at 8-bits-per-byte, the maximum data capacity of 7 bytes is 56 bits. This 7-byte KDP size variable excludes any proprietary KDP file data including, for instance, any SBS-KDP file ID, KDP codec decimal library markers, alphanumeric hash tags (MD5, etc.), IP security/encryption codes, forensic authentication data (DMCA, etc.), KDP mantissa-correction codes, and any other dynamic KDP payload data. When these extrinsic SBS-KDP file data are embedded into a KDP in its perfect format, this could increase the KDP's output size from its 7-byte “Quantum Footprint” to a maximum scalable payload capacity of 32 bytes (0.03 KB). When a KDP is scaled to its maximum payload size format of 32 bytes, this will necessarily decrease its output compression ratio from 599,186:1 to 131,072:1, which are the net compression yields of 4,194,304 bytes reduced to 7 bytes (0.007 KB) and 32 bytes (0.03 KB), respectively.
2. The SBS Algorithm

The specific functions of the SBS algorithm scheme can be explained in their most simplified form in the following 5 steps shown in [FIG. 2.0].
3. The Proposed SBS Algorithm Scheme

To illustrate the (source bits-to-Kinetic Data Primer) substitution methodology of the SBS algorithm scheme, the following example of an actual binary source data set (“SDS”) is examined in [FIG. 3.0]. The SDS shown in [FIG. 3.0] contains 80 bits. Eighty bits (at 8 bits-per-byte) equals 10 bytes. Because a bit can only exist in one of two states, a zero (0) or a one (1), the bits in the SDS have been randomly arranged for the purposes of demonstrating the functionality of the SBS algorithm. The numerical value of any given bit in a data set will always be determined by its type (i.e.: 0 or 1) and its exact position within the set. When calculating the numerical value of consecutive bits in any finite-length data set, it is important to note that only binary 1s (“1-bits”) will produce any numerical value, and their equivalent decimal values will be determined by their exact positions within the set. Conversely, if any bit in a finite-length data set is a binary 0 (“0-bit”), it will not produce any numerical value and, therefore, its equivalent decimal value will always be zero (0) regardless of its position within the data set. Additionally, the first bit (bit-1) of any finite-length data set will produce a corresponding decimal value of one (+1) if, and only if, it is a 1-bit. All subsequent bits in the data set, if any, will produce a corresponding decimal value exactly double (2×) the value of the bit position that directly precedes it. The potential decimal values of the bits in any finite-length data set will be determined as follows:
- {1, 1×2, 1×4, 1×8, 1×16, 1×32, 1×64, 1×128, 1×256, 1×512, 1×1,024, 1×2,048, . . . N}
To illustrate in detail how the bits in the [FIG. 3.0] SDS were calculated, all 80 source bits and their equivalent decimal values are listed in [FIG. 3.1]. Consecutively adding each bit in the [FIG. 3.1] SDS produces a decimal sum of 8.140274939e22, which has 23 digits in the output number. When this sum is expressed as a whole number, its precise decimal value is: “81,402,749,386,839,761,113,321.” To realize a material level of data compression, this 23-digit decimal sum can be synthesized into an alternate mathematical expression such as 121^11 (or “one hundred twenty-one to the eleventh power”). This alternate numerical expression can then be coded into a machine-readable KDP written as: “121^11”
The data needed to encode the mathematical expression “121^11” is only 24 bits (3 bytes). Specifically, the decimal values (121) and (11) can each be encoded within a single 8-bit group because, in the binary system, the total range of decimal values that can be represented in each group is 0 through 255. The ASCII symbol (^) can also be encoded using 1 byte of data.
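This 3-byte encoding can be made concrete with a short packing sketch; the field layout below (base byte, ASCII operator byte, exponent byte) is an assumption for illustration, as the specification does not fix a wire format:

```python
import struct

# One byte for the base (121), one for the ASCII '^' operator, and one
# for the exponent (11): 3 bytes, or 24 bits, in total.
kdp = struct.pack("<BcB", 121, b"^", 11)
assert len(kdp) == 3

base, op, exp = struct.unpack("<BcB", kdp)
assert op == b"^"
assert base ** exp == 81_402_749_386_839_761_113_321  # the 23-digit sum
```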
In addition to the methodology of successively adding the value of individual source bits into a combined output decimal sum, the dynamic functionality of the SBS supersubstitution framework also allows the digital footprint of any arbitrary-length binary data set to be perfectly replicated by calculating the sum of its source bits independently and exclusively by the spatial values derived from their exact positions within the set. Accordingly: For an 8-bit data set composed of all binary ones (“1-bits”), the maximum spatial-bit value that can be obtained is 36. This decimal value is calculated by successively adding the base value of each individual bit respective of its positional value within the set, or {1+2+3+4+5+6+7+8}=36. Conversely, for material clarification, if an 8-bit data set were composed of seven binary zeros (“0-bits”) and a single binary one (“1-bit”) found in the 8th-bit position, the total calculated spatial-bit value of the set would be 8. Calculating the collective bit positions of individually-interpreted source bits as a spatially-oriented data set presents an alternative, viable method of expressing a data set's combined numerical sum in a more economical fashion. This form of spatially-oriented and vector-based calculation necessarily involves an expanded algorithmic process to identify and correct numerical redundancies and to produce a perfect output KDP with a collateral mantissa, if any, as compact as the primary SDS-to-KDP method detailed herein.
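A brief sketch of this alternative spatial (positional) calculation, reproducing the two 8-bit examples above:

```python
def spatial_bit_sum(bits: str) -> int:
    """Sum the 1-based positions of the 1-bits; 0-bits contribute nothing."""
    return sum(pos for pos, bit in enumerate(bits, start=1) if bit == "1")


assert spatial_bit_sum("11111111") == 36  # 1+2+3+4+5+6+7+8
assert spatial_bit_sum("00000001") == 8   # lone 1-bit in the 8th position
```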
3.2 SBS Algorithm Scheme with a Multi-Kinetic Data Primer Number of a Source Data Set

The Source Data Set shown in [FIG. 3.0] demonstrates the methodology by which the bits of an SDS can be calculated into an equivalent decimal value and further synthesized into an alternate numerical expression which, in the final stage of the SBS scheme, is used as the input data for a source's KDP. In the above SDS-to-KDP demonstration, the decimal sum that resulted from calculating the SDS's bits was precise enough to be synthesized into a single exponential expression (121^11) without any collateral decimal remainder. Because there are an infinite number of equivalent numerical values that can be calculated from the analysis of binary data sets, it is a mathematical certainty that not every sum will be free of a collateral decimal remainder resulting from such calculation. Therefore, in the following SDS-to-KDP demonstration, we will show how an SDS with an “imperfect” decimal sum can be synthesized into a “perfect” KDP using multiple primers. This SDS is shown in [FIG. 3.2.1].
When the individual bits shown in [FIG. 3.2.1] are consecutively added together, the decimal sum that is produced is: “2,432,902,008,176,640,000.” When this decimal sum is initially calculated to determine if it can be synthesized into a “neat” high-powered exponential expression of equivalent value, or, in other words, an expression without any collateral decimal remainder, it is found to be numerically “imperfect.” Whenever an imperfect source sum is produced, the simplest method of calculating its most-approximate base primer is to subtract a binary magnitude variable that is found to be the closest numerical approximation to the output decimal sum of the SDS. In other words, since the output sum of the SDS is (2.432902008e18), the closest equivalent decimal value that can be expressed as a binary magnitude variable would be (2^61), which, when calculated, produces a decimal value of (2.305843009e18). In order to calculate the next viable (2nd-order) sub-primer, the numerical disparity between the SDS sum and the newly-obtained base primer value must first be ascertained. When these two numbers are calculated by subtracting the base primer value from the sum of the SDS, the remaining decimal value is (1.27058999e17). When this decimal remainder is calculated to determine whether it can be synthesized into a “neat” equivalent expression, its most-approximate equivalent sum is found to produce a mantissa (collateral decimals to the right of the decimal point).
Whenever any sub-primer is found to have a mantissa, the simplest method of determining whether it can be used as a viable output sub-primer is to calculate the closest square/cube root of the number to find the most-approximate non-negative integer with the smallest mantissa (i.e., the lowest number of collateral decimals). In the case of the decimal remainder (1.27058999e17), the most viable sub-primer variable is found by calculating its cube root, which produces a decimal value of (502,730.3947). This sub-primer output variable of (502,730.3947^3) can be used as a viable 2nd-order KDP number because, when it is calculated into its whole decimal form and compared for accuracy against its source variable, it doesn't produce any collateral decimals. Therefore, the two KDP numbers that can be integrated to produce a perfect output KDP number are detailed as follows:
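The arithmetic of this decomposition can be retraced directly; a sketch, assuming the base-primer rule described above (the largest power of two not exceeding the source sum) and rounding the cube root to four decimal places as in the example:

```python
sds_sum = 2_432_902_008_176_640_000  # decimal sum of the [FIG. 3.2.1] SDS

# Base primer: the largest power of two that does not exceed the sum.
k = sds_sum.bit_length() - 1         # 61
base_primer = 2 ** k                 # 2,305,843,009,213,693,952

remainder = sds_sum - base_primer    # 127,058,998,962,946,048
cube_root = remainder ** (1 / 3)     # ~502,730.3947 (float approximation)
print(f"2^{k} + {cube_root:.4f}^3")  # 2^61 + 502730.3947^3
```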
In the final analysis, the perfect multi-variable output KDP number is:
- “2^61+502730.3947^3”
When this multi-variable output KDP number is calculated into a single whole number, it produces a decimal value of (2,432,902,008,176,640,000), which is precisely equivalent to the calculated decimal sum of the SDS. The data needed to encode the mathematical expression “2^61+502730.3947^3” as a perfect KDP number is 73 bits (less than 10 bytes).
The 73 total bits of data needed to express the above perfect KDP can be encoded within 10 bytes because, at 8-bits-per-byte, the maximum data capacity of 10 bytes is 80 bits. In terms of data compression, encoding the binary information contained in an 8-byte SDS into a 10-byte multi-variable KDP number would mathematically indicate a negative net output compression ratio of 0.80:1, which is the net compression yield of 8 bytes increased to 10 bytes (0.0097 KB).
This particular example of a multi-primer KDP is being included herein to demonstrate that it is, in fact, mathematically-possible to produce a negative net compression yield from the application of the SBS scheme to an arbitrary-length SDS.
Although it is highly unlikely that an SDS as small as 8 bytes would have any viable human utility beyond machine-readable-only command prompts and predetermined programming functions, an 8-byte SDS was specifically chosen because it approximates the algorithmic/substitution threshold that determines whether a positive or negative output compression yield is produced by the application of the SBS scheme. It is important to emphasize that, as the prior algorithm examples demonstrate, the SBS scheme uses multi-input data fields to encode an SDS into an output KDP, and the range of unique numerical input data available to those fields is virtually limitless. Whenever the application of the SBS scheme produces a negative net compression yield, it is mathematically-possible to synthesize other multi-primer alternative variables that can produce more precise decimal sums which, upon further calculation, can have a material effect on whether the final KDP synthesis yields a positive or negative net compression ratio.
4. Experimental Results and Discussions

The algorithm structure of the SBS-KDP scheme uses dual binary input data fields to encode up to 64 bits (8 bytes) of scalable KDP source information per field. The precise range of numerical values that can be encoded within each 64-bit “number field” is 0 through 18,446,744,073,709,551,615 (18 quintillion, or 2^64−1), which is used to represent the corresponding range of decimal values produced by calculating the bits of an SDS. The two number fields are functionally partitioned by a third input data “character field” used to represent dynamic mathematical functions such as exponential powers (x^y), square and cube roots (√x), factorials (x!), or any other math operation (+, −, ÷, ×, etc.), for instance.
When both input number fields are coded to represent the maximum decimal value of their 64-bit data capacities used in tandem with the input character field to express a dynamic mathematical operation, a high-powered exponential value, for example, the combined tri-field input would be:
- “18446744073709551615^18446744073709551615”
The data needed to represent this specific maximum-value KDP number is only 136 bits (17 bytes), whereas the amount of source data that can be encoded is 2.3 sextillion bytes (2.3 Zettabytes, or “ZB”) with 100.000% lossless data retention efficiency. If no other extrinsic SBS-KDP file data are needed to produce a perfect KDP source number, then these 17-byte-scheme metrics would mathematically indicate an output compression ratio of 138 EB:0.017 KB, which is the net compression yield of a 2.3 ZB SDS reduced to 17 bytes (0.017 KB).[3]
[3] As previously explained, whenever any extrinsic SBS-KDP file data are embedded into a perfect KDP source number, the output size of the KDP could increase from its 17-byte “Quantum Footprint” to its maximum scalable payload capacity of 32 bytes (0.032 KB). Including any such extrinsic KDP file data would necessarily decrease the output compression ratio from 138 EB:0.017 KB to 73 EB:0.032 KB, which are the net compression yields of a 2.3 ZB SDS reduced to 17 bytes and 32 bytes, respectively.
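The 136-bit (17-byte) tri-field layout described above can be sketched directly; the field order and the one-byte ASCII operator encoding are assumptions for illustration:

```python
import struct

MAX64 = 2 ** 64 - 1  # 18,446,744,073,709,551,615

# Two unsigned 64-bit number fields partitioned by a one-byte character
# field holding the ASCII math operator: 8 + 1 + 8 = 17 bytes (136 bits).
kdp = struct.pack("<QcQ", MAX64, b"^", MAX64)
assert len(kdp) == 17

left, op, right = struct.unpack("<QcQ", kdp)
assert (left, op, right) == (MAX64, b"^", MAX64)
```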
In this non-provisional patent submission, the inventor hereby makes the following CLAIM(S) to substantiate, support, and corroborate the uniquely defensible nature of the preceding Summary of the Invention entitled: “SONIC BOOM: System for Reducing the Digital Footprint of Data Streams Through Lossless Scalable Binary Substitution.”
Claims
1. The invention claimed functions as a data substitution system by applying binary-to-decimal arithmetic to directly calculate the decimal value of each consecutive bit in any arbitrary-length binary source data set in order to produce an output decimal sum whose precise numerical value is substituted for the entire string of binary source bits.
2. A data substitution system as claimed in claim 1, in which the individual bits of a binary source data set are interpreted and/or directly calculated in a manner that produces an output decimal sum independently and exclusively respective of each bit's spatial values derived from their exact positions within the set.
3. The invention claimed achieves a level of material data compression by converting the output decimal sum of the consecutively added bits of a binary source data set, as produced in claim 1 or claim 2, into an interchangeable mathematical expression of equivalent numerical value specifically encoded, prearranged, or designed to produce a materially reduced magnitude.
Type: Application
Filed: Aug 7, 2017
Publication Date: May 24, 2018
Applicant: (NEW MARKET, MD)
Inventor: ANTHONY BEN BENAVIDES (NEW MARKET, MD)
Application Number: 15/731,813