This is the second assignment from my Masters Applied Digital Information Theory Course which has never been published anywhere and I, as the author and copyright holder, license this assignment customized CC-BY-SA where anyone can share, copy, republish, and sell on condition to state my name as the author and notify that the original and open version available here.
On the previous assignment we generate a chaotic sequence based on skew tent maps, and this time we generate a binary sequence (either '0' or '1'). On this occasion we generate the chaotic binary sequence based on the threshold function.
bin = 0 if (x < c)
bin = 1 if (x ≥ c)
When a sequence value 'x' is less than critical point 'c' the binary value will be '0' and when a sequence value 'x' is equal or greater to critical point 'c' the binary value will be '1'. For example the critical value is set as 'c' = 0.4 and chaotic sequence 'x' = [0.750000 0.416667 0.972222 0.046296 0.115741 0.289352 0.723380], the binary sequence will be 'bin' = [1 1 1 0 0 0 1] as Figure 1.
2. Probability of binary sequence
Using the chaotic sequence based on skew tent map to generate binary sequence we compute the probability the sequence value will be '0' or '1' as 'P(0) = c' and 'P(1) = 1-c'. Using the skew tent map we can compute 'P(00) = c x c', 'P(01) = c(1-c)', 'P(10) = (1-c)c', and 'P(11) = (1-c)(1-c)'. On Markov's chain the probability that '0' will remain '0' is 'P(0|0) = c', '0' will become '1' is 'P(1|0) = 1-c', and so-on 'P(0|1) = c', 'P(1|1) = 1-c'.
We continue the assignment using the previous assignment to generate the binary sequence and calculate the probability of '0' and '1' theoretically and the actual number of '0' and '1' on the binary sequence generated. We used the initial value 'x(1) = 0.3' and trials of 'c = 0.1, 0.2, 0.301, 0.4, 0.6, 0.7, 0.8, 0.9'. The total sequence generated is 2000000 values (2 Millions). The result is on Table 1.
Table 1. Result of theoritical calculation and actual
While the theoretical probability was calculated based on the formula, we count how much '0's and '1's in the sequence consist of 2 Millions binary values. For example using 'c = 0.4' we got 799594 '0's and 1200406 '1's in the 2 Millions sequence. So the actual probability of '0' is '799594/2000000 = 0.399797' and '1' is '1200406/2000000 = 0.600203'. For '00's is 319635, '01's is 479958, '10' is 479958, and '11' is 720449. Everything else is on Table 1, and actual data matched the theory. For Entropy and compression ratio is discussed on the next section.
3. Entropy vs Compression Ratio
On this binary sequence we can calculate the entropy which is how much important information contained. The higher the entropy (more important information) the lower the compression ratio (the least it can be compressed). Based on Table 1 we also calculated entropy of the sequence with the following formula.
Entropy = -P(1)log2P(1) - P(0)log2P(0)
On this experiment the generated 2 million binary sequence is saved into a '.dat' file. The raw file for all 'c' value will equal to 2MB. Here we chose to compress the '.dat' file into '.tar.bz2' format which is a common compression method in Linux. Finally we can plot the entropy to the compression ratio on Figure 2 which is as stated in the theory that the compression value increases with lower entropy.
4. When t ≠ c is not memoryless
When generating the binary sequence uses the threshold function bin = 0 (x < t) @ 1 ( x ≥ t), on above section it's automatically adjust that t = c to produce a memoryless binary sequence. One of the ways to show that the binary sequence is memoryless is that it fulfills Markov's chain on Figure 3, other than through the Bernoulli trials and independent and identically distributed (iid). We see that on table 2 that the probability is not balance, the probability of turning and remains '0' should be the same, same goes for the probability of turning to 1.
Table 2. Data of t ≠ c
Using the chaotic sequence generated by skew tent map we can generate the random binary sequence with the probability of '0's and '1's computable. The result above shows that the theoretical probability of '0's, '1's, '00's, '01's, '10's, '11's, and the Markov's chain where the probability of '0' will remain '0', will turn to '1', '1' will remain '1', or turn to '0' matches from the actual binary sequence generated. The theory is true to the actual. On the second case where we compare the entropy to capability of the compression, the greater the entropy the less the compression ratio. We found that the sequence is not memoryless when t ≠ c because it does not fulfill the Markov's Chain.
6. Source Code
The script is created in Ruby language, use Ruby to run.