Diffusion Characteristics for Simultaneous Source Coding & Encryption [Encompression]

Rohit Pandharkar

1. Introduction

Claude Shannon, in his classic paper “Communication Theory of Secrecy Systems” [1], treated information theory and encryption together and delineated the areas where information-theoretic concepts intersect with those of encryption. In that paper, Shannon introduced the concepts of confusion and diffusion. It is the concept of diffusion that is of interest for encompression.

Shannon’s Concept of Diffusion: In diffusion, the statistical structure of M (message plaintext) which leads to its redundancy is “dissipated” into long range statistics, i.e., into statistical structure involving long combinations of letters in the cryptogram. The effect here is that the enemy must intercept a tremendous amount of material to tie down this structure, since the structure is evident only in blocks of very small individual probability [Refer Appendix 1].

William Stallings [2] describes diffusion in terms of the number of bits that flip in the output stream: how far the encrypted stream deviates from the original source-coded signal, or how far it deviates from a previously encrypted signal when a small change is made to the key. The definition by Stallings is broader in the sense that it lets us measure diffusion by the number of bits flipped (which is easier to compute) rather than by redundancy dissipation.

Chung-E Wang [3][4] has worked extensively on cryptography in data compression, and has explored the possibility of exploiting the freedom in source coding algorithms for the sake of encryption.

Combining Source Coding and Encryption: Encompression

Having considered the relations established by Shannon between source coding and encryption, and looking at the possibility of exploiting the freedom in source coding algorithms, we work towards a comprehensive theory of simultaneous source coding and encryption.

2. Algorithm for ENCOMPRESSION & results for diffusion

While looking at Encompression, the characteristic of secrecy systems that interests us most is the “Diffusion Measure”. Because a source coding algorithm fixes the rules by which every source symbol is represented, a small change in those rules can produce a large change in the output text.

For example, consider a crude code in which the letters a to z are represented by the numbers 1 to 26, and define the encryption rule as

$$C_i = m_i + 1 \pmod{26}.$$

However, if the rule is changed to

$$C_j = \sum_{i=1}^{s} m_{j+i} \pmod{26},$$

almost all the output bits will be changed. So we conclude that a small change in the rules of coding can produce a large change in the statistics of the output symbols. Based on this, we claim that “Diffusion characteristics of cipher text resulting from Simultaneous Source Coding and Encryption are better than those of cipher text generated from stepwise source coding followed by encryption”.

We now consider Huffman coding to obtain initial results supporting our claim. We define our methods as:

Method 1: Stepwise: normal Huffman coding, then encryption with key K.
Method 2: Simultaneous Huffman coding and encryption, by shuffling the Huffman tree using the same key K.

We compare the diffusion of bits in two cases:

1) Bitwise deviation of the source-coded-and-encrypted sequence from the original source-coded-only sequence, using Method 1 and Method 2.
2) Bitwise deviation between the source-coded-and-encrypted sequences obtained with key K1 and key K1’ (which differs from K1 in a few key bits), for each method separately.

CASE 1: Consider the following tree.

For ENCOMPRESSION, we use a key whose length equals the number of parent (interior) nodes of the tree.

The essential idea of concealing information in the process of Huffman coding is to use an encryption key to shuffle the Huffman tree before the encoding process. Without the encryption key, the Huffman tree cannot be shuffled in the same way, so decompression cannot be done properly and the original information cannot be revealed.

To shuffle a Huffman tree, the interior nodes, i.e. the nodes with two children, are first numbered. There are many ways of numbering these interior nodes. For example, by performing a breadth-first (queue-based) traversal of the Huffman tree, the interior nodes can be numbered in top-down, left-to-right fashion. In Fig. 1, the labels of the interior nodes show this top-down, left-to-right numbering.

Afterwards, the bits of the encryption key are associated with the interior nodes according to the numbering: interior node 1 is associated with the first bit of the encryption key, interior node 2 with the second bit, and so on. Finally, for each interior node whose corresponding key bit is 1, the left child is swapped with the right child. In Fig. 2, the encryption key used is “101101”. Thus, the two children of interior nodes 1, 3, 4, and 6 are swapped. After the shuffling, the codewords of the permissible characters are changed dramatically and cannot be decoded without an identically shuffled Huffman tree.

Note: This algorithm does not alter any of the normal characteristics of Huffman coding, namely the compression ratio, the average code word length and the entropies. We only change the branching of nodes at the same level of priority (i.e. a code word of k bits for a letter is changed into another code word of k bits by shuffling the Huffman tree). Hence this is in fact a differently labeled Huffman tree, obtained by using the freedom of labeling the 1s and 0s when forming the tree. It has already been shown that the Huffman tree is not unique [5]; the different trees nevertheless yield the same compression ratio and average code word length.
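The shuffling step described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the unshuffled code table is read off SEQ A (standing in for Fig. 1), the function names `interior_nodes` and `shuffle_code` are hypothetical, and the '0' branch is assumed to be the left child. Under these assumptions, swapping the children of an interior node is the same as flipping the bit emitted at that node.

```python
from collections import deque

# Unshuffled code table, read off SEQ A in the text (stands in for Fig. 1).
BASE_CODE = {"a": "01", "b": "1101", "c": "101", "d": "100",
             "e": "00", "f": "111", "g": "1100"}


def interior_nodes(code):
    """List interior nodes breadth-first (top-down, left-to-right).

    A node is named by the bit prefix that leads to it from the root.
    Assumption: the '0' branch is the left child.
    """
    leaves = set(code.values())
    order, queue = [], deque([""])
    while queue:
        prefix = queue.popleft()
        if prefix in leaves:
            continue  # leaves carry symbols and get no key bit
        order.append(prefix)
        queue.extend((prefix + "0", prefix + "1"))
    return order


def shuffle_code(code, key):
    """Return the code table after swapping children where the key bit is 1."""
    nodes = interior_nodes(code)
    if len(key) != len(nodes):
        raise ValueError("key must have one bit per interior node")
    swapped = {node for node, bit in zip(nodes, key) if bit == "1"}
    shuffled = {}
    for symbol, word in code.items():
        # Swapping the children of node word[:depth] flips the bit emitted there.
        shuffled[symbol] = "".join(
            str(int(bit) ^ (word[:depth] in swapped))
            for depth, bit in enumerate(word)
        )
    return shuffled


if __name__ == "__main__":
    for symbol, word in sorted(shuffle_code(BASE_CODE, "101101").items()):
        print(symbol, BASE_CODE[symbol], "->", word)
```

Every codeword keeps its length, which is why the compression ratio and average code word length are unaffected by the key.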

Now consider a 20-letter message text: gdabbcfgdecfgbcdefcd

Method 1:

Step 1: Source coding using the unshuffled Huffman tree. The source-coded sequence using Fig. 1 is:

1100 100 01 1101 1101 101 111 1100 100 00 101 111 1100 1101 101 100 00 111 101 100 [SEQ A]

Step 2: Encryption using key K. We now use the key K = 101101 repeatedly and XOR it with the source-coded sequence:

Source coded  : 1100 100 01 1101 1101 101 111 1100 100 00 101 111 1100 1101 101 100 00 111 101 100
Key           : 1011 011 01 1011 0110 110 110 1101 101 10 110 110 1101 1011 011 011 01 101 101 101
Final sequence: 0111 111 00 0110 1011 011 001 0001 001 10 011 001 0001 0110 110 111 01 010 000 001 [SEQ B]
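The two steps of Method 1 can be reproduced with a short Python sketch. The code table is read off SEQ A, and the helper names `encode` and `xor_with_key` are illustrative only; the sketch assumes the key is simply repeated over the whole bit stream, as the text describes.

```python
from itertools import cycle

MESSAGE = "gdabbcfgdecfgbcdefcd"
KEY = "101101"

# Unshuffled code table, read off SEQ A in the text (stands in for Fig. 1).
BASE_CODE = {"a": "01", "b": "1101", "c": "101", "d": "100",
             "e": "00", "f": "111", "g": "1100"}


def encode(message, code):
    """Step 1: concatenate the codeword of every letter."""
    return "".join(code[letter] for letter in message)


def xor_with_key(bits, key):
    """Step 2: XOR the bit string with the key, repeated as often as needed."""
    return "".join(str(int(b) ^ int(k)) for b, k in zip(bits, cycle(key)))


seq_a = encode(MESSAGE, BASE_CODE)
seq_b = xor_with_key(seq_a, KEY)
print(len(seq_a))  # length of the source-coded sequence in bits
print(seq_b)       # SEQ B without the grouping spaces
```

Applying `xor_with_key` to SEQ B with the same key returns SEQ A, which is how the receiver would undo Step 2 before Huffman decoding.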

Deviation (SEQ B ~ SEQ A) of the coded-and-encrypted sequence from the source-coded-only sequence = 42 bits/63 bits (one flipped bit for every 1 in the key stream).

Method 2: Simultaneous source coding and encryption using the shuffled Huffman tree from Fig. 2.

Original message: g d a b b c f g d e c f g b c d e f c d
Final sequence: 0001 011 11 0000 0000 010 001 0001 011 10 010 001 0001 0000 010 011 10 001 010 011 [SEQ C]

Deviation (SEQ C ~ SEQ A) of the coded-and-encrypted sequence from the source-coded-only sequence = 51 bits/63 bits

Comparison: Method 1 gives a fractional deviation of 0.667 and Method 2 gives 0.810, so Method 2 has about 21.4% better diffusion than Method 1.
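The bitwise deviations for CASE 1 can be checked by counting differing positions directly. The following Python sketch does this for both methods; the code tables are taken from SEQ A and from the codewords listed for key 101101 in CASE 2, and the helper names (`encode`, `xor_with_key`, `hamming`) are illustrative rather than part of the original method.

```python
from itertools import cycle

MESSAGE = "gdabbcfgdecfgbcdefcd"
KEY = "101101"

# Unshuffled code table, read off SEQ A (stands in for Fig. 1).
BASE_CODE = {"a": "01", "b": "1101", "c": "101", "d": "100",
             "e": "00", "f": "111", "g": "1100"}
# Codewords of the shuffled tree (Fig. 2), as listed for key 101101 in CASE 2.
SHUFFLED_CODE = {"a": "11", "b": "0000", "c": "010", "d": "011",
                 "e": "10", "f": "001", "g": "0001"}


def encode(message, code):
    return "".join(code[letter] for letter in message)


def xor_with_key(bits, key):
    return "".join(str(int(b) ^ int(k)) for b, k in zip(bits, cycle(key)))


def hamming(x, y):
    """Number of positions at which two equal-length bit strings differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(x, y))


seq_a = encode(MESSAGE, BASE_CODE)      # source-coded only
seq_b = xor_with_key(seq_a, KEY)        # Method 1
seq_c = encode(MESSAGE, SHUFFLED_CODE)  # Method 2

for name, seq in (("Method 1", seq_b), ("Method 2", seq_c)):
    d = hamming(seq_a, seq)
    print(f"{name}: {d}/{len(seq_a)} bits = {d / len(seq_a):.3f}")
```

Because the shuffle never changes codeword lengths, SEQ A and SEQ C are always the same length and can be compared position by position.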

CASE 2: Bitwise deviation between the source-coded-and-encrypted sequences obtained with key K1 and key K1’ (which differs from K1 in a few key bits), for each method separately.

Let us change only one bit (the first) of the key and observe how many output bits change as a result.

New key K1’ = 001101 (the previous key was 101101).

Method 1: The repeated key spans the 63-bit sequence, so its first bit occurs 11 times and the output changes in exactly 11 bits. The new final sequence is:

1111 110 00 0111 1011 001 001 0101 001 00 011 011 0001 1110 111 111 01 110 000 101 [SEQ D]

Deviation (SEQ D ~ SEQ B) = 11 bits/63 bits

Method 2: Here many more bits change, because the first key bit controls the root of the tree, so changing it alters the codeword of every letter:

Letter      a    b     c    d    e   f    g
Previous    11   0000  010  011  10  001  0001
With K1’    01   1000  110  111  00  101  1001

Hence there is a change of 1 bit per letter of the original plaintext [SEQ E], so the total number of changed bits is 20 (for the 20-letter sequence) and the deviation is 20 bits/63 bits.

Comparison:
Method 1 gives 17.5% diffusion in CASE 2.
Method 2 gives 31.7% diffusion in CASE 2.
Method 2 gives about 1.82 times better results than Method 1 in CASE 2.
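The key-sensitivity comparison of CASE 2 can be sketched in Python as below. As before, this is an illustration under stated assumptions: the code table is read off SEQ A, interior nodes are numbered top-down and left-to-right with '0' taken as the left branch, and all helper names are hypothetical.

```python
from itertools import cycle

MESSAGE = "gdabbcfgdecfgbcdefcd"
K1, K1_PRIME = "101101", "001101"  # the keys differ in the first bit only

# Unshuffled code table, read off SEQ A (stands in for Fig. 1).
BASE_CODE = {"a": "01", "b": "1101", "c": "101", "d": "100",
             "e": "00", "f": "111", "g": "1100"}


def encode(message, code):
    return "".join(code[letter] for letter in message)


def xor_with_key(bits, key):
    return "".join(str(int(b) ^ int(k)) for b, k in zip(bits, cycle(key)))


def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))


def shuffle_code(code, key):
    # Interior nodes are the proper prefixes of the codewords; sorting by
    # (depth, bits) numbers them top-down, left-to-right ('0' = left, assumed).
    nodes = sorted({w[:d] for w in code.values() for d in range(len(w))},
                   key=lambda p: (len(p), p))
    swapped = {n for n, bit in zip(nodes, key) if bit == "1"}
    # Swapping the children of node w[:d] flips the bit emitted at that node.
    return {s: "".join(str(int(b) ^ (w[:d] in swapped)) for d, b in enumerate(w))
            for s, w in code.items()}


seq_a = encode(MESSAGE, BASE_CODE)

# Method 1: same source-coded bits, XORed with the old and the new key.
method1 = hamming(xor_with_key(seq_a, K1), xor_with_key(seq_a, K1_PRIME))

# Method 2: message encoded with the tree shuffled by the old and the new key.
method2 = hamming(encode(MESSAGE, shuffle_code(BASE_CODE, K1)),
                  encode(MESSAGE, shuffle_code(BASE_CODE, K1_PRIME)))

print(method1, method2)  # 11 20
```

Changing a key bit that controls a deeper interior node would affect only the letters below that node, so the count for Method 2 depends on which key bit is changed.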

3. Conclusions

Our claim that simultaneous source coding and encryption gives better diffusion for a given key than stepwise source coding followed by encryption with the same key is supported by the worked example for the Huffman coding algorithm. It indicates that if the source coding rules are changed based on an encryption key, by exploiting the freedom in source coding algorithms, the overhead of encrypting the source-coded sequence separately can be eliminated. Moreover, these benefits are achieved without compromising the entropy, equivocation and coding efficiency of Huffman coding, as we only change the branching of nodes at the same level of priority (i.e. a code word of k bits for a letter is changed into another code word of k bits by shuffling the Huffman tree). Thus the average code word length, compression ratio, and average mutual entropies remain the same as they would be for normal Huffman coding.

Appendix 1

Two methods (other than recourse to ideal systems) suggest themselves for frustrating a statistical analysis. These we may call the methods of diffusion and confusion. In the method of diffusion the statistical structure of M (message plaintext) which leads to its redundancy is “dissipated” into long range statistics, i.e., into statistical structure involving long combinations of letters in the cryptogram. The effect here is that the enemy must intercept a tremendous amount of material to tie down this structure, since the structure is evident only in blocks of very small individual probability. Furthermore, even when he has sufficient material, the analytical work required is much greater since the redundancy has been diffused over a large number of individual statistics.

An example of diffusion of statistics is operating on a message $M = m_1, m_2, m_3, \ldots$ with an “averaging” operation, e.g.,

$$y_n = \sum_{i=1}^{s} m_{n+i} \pmod{26},$$

adding s successive letters of the message to get a letter $y_n$. One can show that the redundancy of the y sequence is the same as that of the m sequence, but the structure has been dissipated. Thus the letter frequencies in y will be more nearly equal than in m, the digram frequencies also more nearly equal, etc. Indeed any reversible operation which produces one letter out for each letter in and does not have an infinite “memory” has an output with the same redundancy as the input. The statistics can never be eliminated without compression, but they can be spread out.
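To make the averaging operation concrete, the following Python sketch places it next to the simple shift rule from Section 2 and counts how many output letters change when a single plaintext letter is altered. The function names are hypothetical, and the cyclic wrap-around at the end of the message is an assumption made here so that every output letter is defined; the source does not specify the boundary handling.

```python
def to_numbers(text):
    """Map a..z to 1..26, as in the crude code of Section 2."""
    return [ord(ch) - ord("a") + 1 for ch in text]


def shift_rule(m):
    """C_i = m_i + 1 (mod 26): each output depends on one input letter."""
    return [(x + 1) % 26 for x in m]


def averaging_rule(m, s):
    """y_n = sum_{i=1..s} m_{n+i} (mod 26).

    Assumption: indices wrap around cyclically at the end of the message.
    """
    size = len(m)
    return [sum(m[(n + i) % size] for i in range(1, s + 1)) % 26
            for n in range(size)]


def changed(u, v):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(u, v))


original = to_numbers("abcab")   # [1, 2, 3, 1, 2]
altered = list(original)
altered[2] = 10                  # change a single plaintext letter

print(changed(shift_rule(original), shift_rule(altered)))
print(changed(averaging_rule(original, 2), averaging_rule(altered, 2)))
```

With the shift rule a single changed input letter affects one output letter, whereas with the averaging rule it affects up to s output letters, which is the spreading effect the appendix describes.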

References

[1] C. Shannon, “Communication Theory of Secrecy Systems,” Bell System Technical Journal, 28(4), 656-715 (1949).
[2] William Stallings, “Cryptography and Network Security: Principles and Practices,” Third Edition, Prentice Hall, 2003.
[3] Chung-E Wang, “Cryptography in Data Compression,” CodeBreakers Journal, Vol. 2, No. 3 (2005).
[4] Chung-E Wang, “Simultaneous Data Compression and Encryption,” Security and Management 2003: 558-563.
[5] John G. Proakis, “Digital Communications,” McGraw-Hill Companies (2007).
[6] Mohamed Haleem, K. P. Subbalakshmi and R. Chandramouli, “Joint Encryption and Compression of Correlated Sources,” EURASIP Journal on Information Security (JIS), Special Issue on The Interplay between Compression and Security for Image and Video Communication and Adaptation over Networks (2007).
[7] Chetan Nanjunda Mathur, Karthik Narayan, and K. P. Subbalakshmi, “On the Design of Error Correcting Ciphers-High Diffusion codes,” EURASIP Journal on Wireless Communications and Networking, Special Issue on Wireless Network Security, Volume 2006, Article ID 42871, Pages 1–12, DOI 10.1155/WCN/2006/42871.
