Department of ECE and † Department of ISE, Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801 Email: {ghassam2,cullina,kiyavash}@illinois.edu

Abstract—In a symmetric-key cryptography system, it is often required to transmit a nonuniform message from a very large set. In this case, a computationally unbounded adversary can take advantage of the non-uniformity of the posterior to recover the message. Recently, an encryption scheme called Honey Encryption has been proposed to increase the information-theoretic security of the system, i.e., to guarantee a level of security regardless of the computational power of the adversary. In this paper, we present a technique called message partitioning which can be used to accomplish the same goal. We analyze the overall security of the combination of this technique with Honey Encryption, which uses a Distribution Transforming Encoder (DTE) block. We propose a new DTE which performs acceptably under a limited amount of available auxiliary randomness. Achievable bounds are presented for both cases which, under certain conditions, are close to the lower bounds on the adversary's probability of success.

Index Terms—Information Theoretic Security, Symmetric Encryption, Distribution Transformation.

I. INTRODUCTION

Consider a symmetric encryption setup in which a message M from a finite set M is to be transmitted securely. Assume a secret key drawn uniformly from a set K is used to encrypt the message to a ciphertext C. There are two trivial attacks available to an adversary who wants to recover the sent message: guess a random key, or guess the most probable message. Therefore, $P(\mathrm{Recov.}) \ge \max\{\frac{1}{|K|}, P_M(m_0)\}$, where m_0 is assumed to be the message with the largest probability. There are two general cases in which we can restrict the adversary's success probability to the lower bound above: (1) If the cardinality of M is larger than or equal to the cardinality of K and the distribution over the set M is uniform, the adversary's posterior is uniform and the same as the prior, and hence $P(\mathrm{Recov.}) = \frac{1}{|K|}$. (2) If |K| = |M|, since $\frac{1}{|K|} \le P_M(m_0)$, the best guess of the adversary is to choose the message with the highest probability, and hence $P(\mathrm{Recov.}) = P_M(m_0)$. This case is the well-known Vernam one-time pad encryption [1]. Shannon, in his seminal paper [2], proved that the one-time pad achieves perfect secrecy, i.e., the ciphertext reveals no information about the message without knowledge of the secret key. The security guarantee obtained in the above cases is information-theoretic security, i.e., it holds true irrespective of the computational power available to the adversary [3]. In real-world applications, usually neither of the conditions above is satisfied. It is often required to send a nonuniform message, and the size of the message set is much larger than

the size of the key set. In this case, the adversary can take advantage of the non-uniformity of the posterior to improve his chance of recovering the message. For example, suppose the message set is the set of all natural English texts containing fewer than 1000 characters and keys are of length 100 bits. In this case, the adversary can try all the keys and consider the deciphered texts. Since only a small number of the deciphered texts are plausible, the largest posterior on messages could be much larger than $2^{-100}$. This condition could increase the adversary's probability of success. For some special block ciphers, the probability of message recovery could be roughly as large as the sum of the probabilities of the $|M|/|K|$ most likely messages. Therefore, a natural question arises: in a setting where messages are non-uniform and $|M| \gg |K|$, can we still achieve the information-theoretic security guarantee which restricts the adversary's success probability to $\frac{1}{|K|}$? To solve this problem, a new encryption scheme called Honey Encryption was recently proposed [4]. Some of the applications of this scheme are presented in [5] and [6]. The main idea of this scheme is to first map the messages to a larger set whose elements are called seeds, such that the distribution of the seeds is close to uniform. The seeds then serve as the input of a standard symmetric encryptor. The motivation behind this scheme is that by uniformizing the distribution over the input of the encryptor, we push the probability of message recovery down to $\frac{1}{|K|}$. In [4], the block which performs the mapping from messages to seeds is called a Distribution Transforming Encoder (DTE). This block can be added to any standard symmetric encryption system. The authors further present a DTE scheme which may require a very large number of seeds. Because the DTE maps each message to multiple seeds, it must be a non-deterministic function, which requires auxiliary local randomness.
Note that in order to obtain information-theoretic security guarantees, the auxiliary randomness used for mapping messages to seeds must be truly random and local to the transmitter, and hence is expensive. Furthermore, a large number of seeds means the output of the transmitter will be long, which may be impractical for some applications. Given that unlimited perfect randomness is expensive if not impractical, in this paper we first consider the case in which no auxiliary randomness is available at the transmitter. We introduce a message partitioning technique for this

Fig. 1: System model.

case that increases the message protection. Next, we consider the case in which external randomness is available but limited. That is, the number of seeds that can be used in the DTE is fixed. We present an upper bound on the performance of the system regardless of the DTE scheme used. Furthermore, we propose a technique for the design of a general-purpose DTE, and we consider the overall performance of the combination of our message partitioning and the designed DTE scheme. The designed DTE is optimal in the sense that, for a given number of available seeds, it minimizes the total variation distance between the seed distribution and the uniform distribution. We show that both the message partitioning technique alone and its combination with our proposed DTE lead to close-to-optimal upper bounds on the probability of message recovery. Our bounds suggest that using only the message partitioning technique without a DTE might be more beneficial. However, our results are achievability bounds, and converse bounds are required for drawing a definite conclusion. The rest of the paper is organized as follows: We introduce our message partitioning technique in Section II and analyze its performance when no auxiliary randomness is available. In Section III, we consider the case of a non-zero but limited amount of auxiliary randomness at the transmitter. A special DTE for this case is proposed and analyzed. Our concluding remarks are presented in Section IV.

II. MESSAGE PARTITIONING TECHNIQUE

Let the finite set M be the set of messages. We sort the messages in decreasing order of their probability and partition them into $P \ge 1$ subsets of size N, $\{M_0, M_1, \cdots, M_{P-1}\}$, where $M_i$ contains the (i+1)-th group of N messages from the sorted set, i.e., $M_i = \{m_{iN}, \cdots, m_{(i+1)N-1}\}$. A secret key from the set K is chosen uniformly at random and is shared between the encryptor and the decryptor.
Given a message $m_j$, we define the index of the message in its partition as $j \bmod N$ and the partition number as $\lfloor j/N \rfloor$. The encryptor uses the index and the key to output the ciphertext C, using a one-time pad as the cipher, and sends the ciphertext and the partition number to the receiver. The receiver uses the shared key to recover the index from the received ciphertext and then, using the index and the partition number, recovers the original message. We assume the adversary knows the partition number, the ciphertext, and the encryption/decryption scheme in use, but not the secret key.

Lemma 1. If the messages are sorted in decreasing order of their probability and partitioned into P partitions of size N, we have

$$\sum_{p=0}^{P-1} P(m_{pN}) \le P(m_0) + \frac{1}{N},$$

where $m_0$ is the message with the largest probability.

Proof. Because of the sorting, we have $N P(m_{pN}) \le \sum_{j=(p-1)N+1}^{pN} P(m_j)$, for $1 \le p \le P-1$. Therefore,

$$N \sum_{p=0}^{P-1} P(m_{pN}) \le N P(m_0) + \sum_{i=1}^{|M|} P(m_i) \le N P(m_0) + 1.$$

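The encryption and decryption steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the modular one-time pad over $\mathbb{Z}_N$ stands in for a perfectly secret cipher on the in-partition index, and all variable names are ours.

```python
import secrets

def partition_encrypt(j, N, key):
    """Split message index j into a public partition number and a
    one-time-padded in-partition index (pad taken mod N)."""
    p, idx = divmod(j, N)        # partition number floor(j/N), index j mod N
    c = (idx + key) % N          # one-time pad over Z_N
    return p, c                  # p is sent in the clear, c is the ciphertext

def partition_decrypt(p, c, N, key):
    """Invert the pad and recombine with the public partition number."""
    idx = (c - key) % N
    return p * N + idx

N = 8                            # partition size; Theorem 1 takes N = |K|
key = secrets.randbelow(N)       # uniform shared secret key
p, c = partition_encrypt(42, N, key)
assert partition_decrypt(p, c, N, key) == 42
```

Because the pad is uniform over the partition, the ciphertext alone reveals nothing beyond the publicly transmitted partition number, matching the discussion following Lemma 1.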
It might seem that transmitting the partition number unencrypted is a significant leak of information. Because the adversary knows the partition number, the size of his uncertainty set (the set of candidate messages) is |K|. However, any symmetric encryption scheme with a key space of the same size also reduces the adversary's uncertainty set to that cardinality. The partitioning scheme has the benefit of leaking no additional information via the encrypted index.

Theorem 1. Using the message partitioning technique with partitions of size |K|, we have

$$P(\mathrm{Recov.}) \le P(m_0) + \frac{1}{|K|}, \quad (1)$$

where $m_0$ is the message with the largest probability.

Proof. Upon observing the ciphertext C and partition number p, the best guess of the adversary for the sent message is the message which maximizes the posterior probability $P_{M|C,p}(m|C,p)$. Therefore:

$$P(\mathrm{Recov.}|p) \le E_{C|p}\Big[\max_m \frac{P(m|p)\,P(C|m,p)}{P(C|p)}\Big] = \frac{1}{|K|}\sum_c \max_m P(m|p) = \max_m P(m|p) = \frac{P(m_{p|K|})}{P(p)},$$

where the second step uses the fact that the one-time pad gives $P(C|m,p) = \frac{1}{|K|}$ for each of the $|K|$ possible ciphertexts. We note that $P(\mathrm{Recov.}) = \sum_p P(p)\,P(\mathrm{Recov.}|p) \le \sum_p P(m_{p|K|})$. Hence, Lemma 1 concludes the desired result.

Theorem 1 implies that if $P(m_0) \le \frac{1}{|K|}$, then $P(\mathrm{Recov.}) \le \frac{2}{|K|}$. Recall that $P(\mathrm{Recov.})$ is lower bounded by $\frac{1}{|K|}$. Without the partitioning, it is easy to see that $P(\mathrm{Recov.}) \le \sum_{i=0}^{\lceil |M|/|K| \rceil} P(m_i)$. For some block ciphers, this upper bound is tight and could be much larger than $\frac{2}{|K|}$. Therefore, without any auxiliary randomness, the partitioning technique can significantly increase the security of the system.

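As a sanity check on Theorem 1, one can compute the optimal adversary's exact success probability for a concrete prior. The Zipf-like distribution below is our illustrative choice, not taken from the paper:

```python
M, N = 64, 8                            # |M| messages, partitions of size N = |K|
w = [1.0 / (i + 1) for i in range(M)]   # Zipf-like weights, already decreasing
tot = sum(w)
P = [v / tot for v in w]                # sorted message distribution

# Under the one-time pad the ciphertext is uninformative, so the optimal
# adversary guesses the most probable message of the leaked partition p,
# which after sorting is m_{pN}.  The exact recovery probability is then
# the sum of P(m_{pN}) over partitions, precisely the quantity in Lemma 1.
p_recov = sum(P[p * N] for p in range(M // N))

bound = P[0] + 1.0 / N                  # Theorem 1 with |K| = N
assert 0 < p_recov <= bound
```

For this prior the bound holds with room to spare; sorting before partitioning is what keeps the per-partition maxima, and hence `p_recov`, small.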
III. COMBINING MESSAGE PARTITIONING AND DTE

The Honey Encryption scheme aims to achieve a recovery probability of $\frac{1}{|K|} + \epsilon$ for negligible $\epsilon$ [4]. To do so, the authors present a DTE scheme called Inverse Sampling DTE (IS-DTE) and bound its performance for some special applications. The requirement of this encoder is that the greatest common divisor of the fractions in the image of the cdf of the message distribution should be larger than $2^{-l}$, where l is the number of bits used for representing the seeds. For some message distributions, this condition requires a very large number of seeds. In this section, we present an analysis that allows us to determine the effect of the availability of only a limited number of seeds on the recovery probability of a system using a DTE. Further, we present and analyze a DTE scheme which, for a limited amount of available randomness, minimizes the variational distance between the seed distribution and the uniform distribution.

Figure 1 depicts the system model with the DTE and partitioning blocks. On the transmitter side, a message M is chosen from a finite message set. The partition number of the message is revealed publicly and the index of the message in its partition is given to the DTE block. The DTE encodes the index to a seed S. The encryptor encrypts the seed with a secret key and outputs the ciphertext C. On the receiver side, the decryptor first recovers the sent seed using the received ciphertext and the shared key; then the distribution transforming decoder decodes the seed and recovers the message index. Finally, using this index and the partition number, the message is recovered. The adversary knows the DTE and encryption schemes, but not the key.

A. Algorithm

As explained in Section II, we sort the messages and divide them into $P \ge 1$ partitions, where each partition contains N messages.^1 Here we assume $N \le |S|$, where S is the set of seeds.
We use the same set of seeds across all partitions; thus, for each partition the seed set remains the same, but the DTE varies, as we shall see. Suppose message M from partition p is to be transmitted. The partition number will be revealed publicly. The message M (or, more precisely, its index) is mapped to a seed S chosen according to $P_{S|M}$. To do so, we first divide S into subsets $\{S_1, S_2, \cdots, S_N\}$ using Algorithm 1. Note that we need to run Algorithm 1 once for each partition, and hence the subsets $\{S_1, S_2, \cdots, S_N\}$ can be different for each partition. Since all messages should be transmittable, each message should be mapped to at least one seed, i.e., $|S_m| \ge 1$ for all $m \in M_p$. We denote $|S_m|$ by $x_m$ and the vector of $x_m$'s for messages in $M_p$ by $x_p$. We denote the probability distribution over the set of messages in partition p by $P_{M|p}$ and the induced probability distribution over the set of seeds by $P_{S|p}$. After getting the

^1 This time, unlike in Section II, N is not necessarily fixed to |K| and is a design parameter.

number of seeds assigned to each message m in the partition from Algorithm 1, we map the message m uniformly onto $S_m$; that is, $P_{S|m}$ is uniform on $S_m$. Note that the induced distribution over seeds in each partition depends on the probabilities of the messages in that partition and hence on the partition number.

Algorithm 1
  for $m \in M_p$ do
    $x_m \leftarrow 1$, $Y_m \leftarrow P_{M|p}(m)|S| - x_m$
  end for
  $B \leftarrow |S| - N$
  while $B > 0$ do
    find $m^* = \arg\max_{m \in M_p} Y_m$
    $x_{m^*} \leftarrow x_{m^*} + 1$
    $Y_{m^*} \leftarrow Y_{m^*} - 1$
    $B \leftarrow B - 1$
  end while
  return $x_p$

Once the message (in reality, its index) is mapped to a seed, the seed is encrypted to a ciphertext using the secret key K, which is chosen uniformly from a set K. The encryptor can use any kind of block cipher to produce the ciphertext. The adversary knows the sets $M_p$, S, and K and their corresponding distributions, but not the secret key. Throughout, we consider $|S| = |C|$, where C is the set of ciphertexts; thus, for each fixed key, the encryption function is bijective, which makes decryption possible.

B. Evaluation

The rest of the paper is devoted to the evaluation of the proposed scheme. We emphasize that our results here also include the case in which no partitioning is done, i.e., $N = |M|$. The main result of this evaluation is the following:

Theorem 2. If $\max_m P_{M|p}(m) \le \frac{1}{6|K|} - \frac{1}{|S|}$, then

$$P(\mathrm{Recov.}) \le \frac{1}{|K|} + \frac{N + 3|K|}{|S|} + \frac{3|K|}{N} + 3|K|\,P(m_0).$$

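Algorithm 1 and the uniform mapping onto $S_m$ can be implemented directly. The sketch below is ours: the heap is merely an implementation convenience for the arg max, and realizing the subsets $S_m$ as consecutive blocks of seed indices is our choice, not mandated by the paper.

```python
import heapq
import random

def allocate_seeds(probs, S):
    """Algorithm 1: start with one seed per message, then repeatedly give
    the next seed to the message with the largest deficit
    Y_m = P_{M|p}(m)*|S| - x_m, until all |S| seeds are assigned."""
    N = len(probs)
    assert S >= N, "need at least one seed per message"
    x = [1] * N
    heap = [(-(p * S - 1), m) for m, p in enumerate(probs)]  # max-heap on Y_m
    heapq.heapify(heap)
    for _ in range(S - N):                   # B = |S| - N seeds left to hand out
        negY, m = heapq.heappop(heap)        # m* = arg max Y_m
        x[m] += 1
        heapq.heappush(heap, (negY + 1, m))  # Y_{m*} decreases by one
    return x

def seed_subsets(x):
    """Realize the subsets S_m as consecutive blocks of seed indices."""
    subsets, start = [], 0
    for xm in x:
        subsets.append(range(start, start + xm))
        start += xm
    return subsets

probs = [0.5, 0.25, 0.15, 0.10]   # a hypothetical 4-message partition
x = allocate_seeds(probs, 16)     # |S| = 16 seeds
assert sum(x) == 16 and min(x) >= 1

# DTE encode/decode: message m maps to a uniform seed in S_m and back.
subs = seed_subsets(x)
s = random.choice(subs[2])        # auxiliary local randomness
assert next(m for m, r in enumerate(subs) if s in r) == 2
```

The greedy loop only ever increments a message whose deficit is still the maximum, which is what yields the property $x_m - P_{M|p}(m)|S| \le 1$ used later in inequality (4).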
Note that for suitable values of $P(m_0)$ and $|S|$, and an appropriately chosen value of N, we can achieve a recovery bound close to $\frac{1}{|K|}$. Also, Theorem 2 shows that if we reduce the size of the partitions, N, fewer seeds are required to achieve the same bound. Comparing the bound in Theorem 2 with the one in Theorem 1, we see that the bound in Theorem 1 is always better. This suggests that using only the message partitioning technique without a DTE can be more beneficial. In other words, just by implementing the message partitioning technique we might be able to achieve an information-theoretic security guarantee better than the one attained by using a DTE. However, since these bounds are derived using achievability arguments, no conclusive deductions can be made without converses. In the following, we prove Theorem 2.

Lemma 2. For the combination of the DTE and the partitioning technique, the probability of message recovery given the partition is bounded as follows:

$$P(\mathrm{Recov.}|p) \le \sum_c \max_{m \in M_p} \sum_{s \in S_m} \Big[P(s|p) - \frac{1}{|S|}\Big] P(c|s) \quad \mathrm{(I)}$$
$$\qquad\qquad + \sum_c \max_{m \in M_p} \sum_{s \in S_m} \frac{1}{|S|} P(c|s). \quad \mathrm{(II)}$$

Term (II) represents the performance of the system if the DTE had worked perfectly and produced a uniform distribution over the seeds, and term (I) represents the penalty of the true seed distribution being distant from the uniform distribution.

Proof. As in the case with no auxiliary randomness, upon observing ciphertext C, the best guess of the adversary for the sent message is $\arg\max_m P_{M|C,p}(m|c,p)$. Therefore, we have the following:

$$P(\mathrm{Recov.}|p) \le E_{C|p}[\max_m P(m|C,p)] = \sum_c \max_m P(m,c|p) = \sum_c \max_m \sum_s P(m,s,c|p)$$
$$= \sum_c \max_m \sum_s P(s|p) P(m|s,p) P(c|s) = \sum_c \max_{m \in M_p} \sum_{s \in S_m} P(s|p) P(c|s).$$

Adding and subtracting $\frac{1}{|S|}$ and splitting the maximization completes the proof.

In the following, we find upper bounds on terms (I) and (II).

1) Upper bound on term (I):

Lemma 3.
$$\sum_c \max_{m \in M_p} \sum_{s \in S_m} \Big[P(s|p) - \frac{1}{|S|}\Big] P(c|s) \le \sum_{s \in S} \Big[P_{S|p}(s) - \frac{1}{|S|}\Big]^+$$

Proof.
$$\sum_c \max_{m \in M_p} \sum_{s \in S_m} \Big[P_{S|p}(s) - \frac{1}{|S|}\Big] P_{C|S}(c|s) \le \sum_c \max_{m \in M_p} \sum_{s \in S_m} \Big[P_{S|p}(s) - \frac{1}{|S|}\Big]^+ P_{C|S}(c|s)$$
$$\le \sum_c \sum_{s \in S} \Big[P_{S|p}(s) - \frac{1}{|S|}\Big]^+ P_{C|S}(c|s) = \sum_{s \in S} \Big[P_{S|p}(s) - \frac{1}{|S|}\Big]^+.$$

Note that the last expression in the proof is equal to $\frac{1}{2}\sum_{s \in S} |P_{S|p}(s) - \frac{1}{|S|}|$, which is the total variation distance between the distribution $P_{S|p}$ and the uniform distribution on the seeds. Therefore, we are interested in the following problem:

$$\min_{P_{S|p}} \sum_{s \in S} \Big[P_{S|p}(s) - \frac{1}{|S|}\Big]^+, \quad (2)$$

subject to $\sum_{s \in S_m} P_{S|p}(s) = P_{M|p}(m)$ for all $m \in M_p$. We first claim that each message should be mapped uniformly onto $S_m$; the proof of this claim is omitted due to the space constraint. Therefore,

$$\min_{P_{S|p}} \sum_{s \in S} \Big[P_{S|p}(s) - \frac{1}{|S|}\Big]^+ = \min_{x_p} \sum_{m \in M_p} \sum_{s \in S_m} \Big[\frac{P_{M|p}(m)}{x_m} - \frac{1}{|S|}\Big]^+$$
$$= \min_{x_p} \sum_{m \in M_p} \Big[P_{M|p}(m) - \frac{x_m}{|S|}\Big]^+ = \min_{x_p} \frac{1}{|S|} \sum_{m \in M_p} \big[x_m - P_{M|p}(m)|S|\big]^+,$$

where $x_p$ is the vector of $x_m$'s for messages in $M_p$; the last equality holds because both $P_{M|p}(\cdot)$ and $x_{(\cdot)}/|S|$ sum to one over the partition, so the two one-sided sums are equal. Therefore, we seek $x_p$ such that:

$$x_p = \arg\min_{x_p} \sum_{m \in M_p} \big[x_m - P_{M|p}(m)|S|\big]^+. \quad (3)$$

As mentioned earlier, each message is mapped to at least one seed, as the scheme is not lossy. Hence, we should have $x_m \ge 1$ for all $m \in M_p$. It can be shown that Algorithm 1 finds a solution to (3), i.e., an $x_p$ which satisfies this constraint; the proof is omitted due to space constraints. The output of Algorithm 1 satisfies $x_m - P_{M|p}(m)|S| \le 1$ for all m. Therefore,

$$\frac{1}{|S|} \sum_{m \in M_p} \big[x_m - P_{M|p}(m)|S|\big]^+ \le \frac{N}{|S|}. \quad (4)$$

We state this result formally in the following lemma:

Lemma 4. For all probability distributions $P_{M|p}$ over the set $\{1, \cdots, N\}$ and all integers $|S| \ge N$, Algorithm 1 outputs a vector $x_p \in \mathbb{N}^N$ such that the variational distance between $x_p/|S|$ and $P_{M|p}$ is less than or equal to $\frac{N}{|S|}$.

This result implies that for seeds of length l bits, in order to bound term (I) by L, the size of the partition should satisfy $N \le L 2^l$.

2) Upper bound on term (II): We note that $P_{C|S}(c|s) = P_K(k_{s,c})$, where $k_{s,c}$ is the key that maps the seed s into the ciphertext c. Therefore,

$$\sum_c \max_{m \in M_p} \sum_{s \in S_m} \frac{1}{|S|} P_{C|S}(c|s) = \frac{1}{|C|} \sum_c \max_{m \in M_p} \sum_{s \in S_m} P_{C|S}(c|s) = E_{C \sim \mathrm{unif}(|C|)}\Big[\max_{m \in M_p} \sum_{s \in S_m} P_K(k_{s,C})\Big].$$

The quantity $\max_{m \in M_p} \sum_{s \in S_m} P_K(k_{s,c})$ can be interpreted as the maximum load of a bin, for a particular outcome c, in a balls-into-bins game with $|K|$ balls of weight $\frac{1}{|K|}$ and N bins of weight $Q_{M|p}(m) = \frac{x_m}{|S|}$, $m \in M_p$. We are interested in the expected value of the maximum load of a bin, which we denote by $E[L_{Q_{M|p}}]$. The idea of interpreting term (II) via a balls-into-bins game is also present in [4], but the analysis presented here is improved.

We model the block cipher as a random function such that for each key k, the function $f_k : s \to c$ is a random bijection. Since all keys have equal probability, the load of a bin m is $\frac{1}{|K|}$ times the number of balls in bin m. Note that in the best-case scenario, if the block cipher is designed such that each key maps a ciphertext back to a different message (i.e., there is at most one ball in each bin), then

$$E[L_{Q_{M|p}}] = E\Big[\frac{1}{|K|} \times 1\Big] = \frac{1}{|K|}.$$

Therefore, $\frac{1}{|K|}$ is the best bound on (II), and hence on $P(\mathrm{Recov.}|p)$, that one can hope to achieve. The following lemma gives an upper bound on the described balls-into-bins problem. We omit the proof due to space constraints.

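The random-bijection model lends itself to a quick Monte-Carlo check of term (II). The sketch below is our own, with an arbitrary seed allocation; it uses the observation that an independent uniform bijection evaluated at one fixed ciphertext yields a uniform seed, so we can sample seeds directly instead of whole permutations. The estimate can never fall below the ideal $\frac{1}{|K|}$.

```python
import random

def expected_max_load(x, K, trials=2000, seed=1):
    """Estimate E[L_Q] under the random-bijection cipher model: for a fixed
    ciphertext c, each key's decryption f_k^{-1}(c) is a uniform seed, and
    the key drops a ball of weight 1/|K| into the bin (message) that owns
    that seed."""
    rng = random.Random(seed)
    S = sum(x)                                 # |S| = |C|
    owner = [m for m, xm in enumerate(x) for _ in range(xm)]  # seed -> message
    total = 0.0
    for _ in range(trials):
        load = [0] * len(x)
        for _ in range(K):      # independent uniform bijections, evaluated at c
            load[owner[rng.randrange(S)]] += 1
        total += max(load) / K  # maximum bin load; each ball weighs 1/|K|
    return total / trials

est = expected_max_load([8, 4, 2, 2], K=4)    # hypothetical allocation, |S| = 16
assert 1.0 / 4 <= est <= 1.0                  # never below the ideal 1/|K|
```

Shrinking the largest bin weight (i.e., reducing $\gamma$) drives the estimate toward $\frac{1}{|K|}$, in line with Lemma 5.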
Lemma 5. Let a be the number of balls, each of weight $\frac{1}{|K|}$, and let b denote the number of bins, which have capacities $Q_{M|p}(m)$, $m \in M_p$ ($a \ll b$). Define $\gamma = \max_m P_{M|p}(m) + \frac{1}{|S|}$. If $0 \le \gamma \le 3 - \sqrt{5}$, then, using Algorithm 1, the expected maximum load of a bin is upper bounded by

$$E[L_{Q_{M|p}}] \le \Big(1 + \frac{\hat{a}^2}{2(\hat{b} - \hat{a} + 2)}\Big) \frac{1}{|K|}, \quad (5)$$

where $\hat{a} = 3a$ and $\hat{b} = \lfloor \frac{2}{\gamma} \rfloor$.

Proof of Theorem 2. From Lemma 5,

$$\sum_c \max_{m \in M_p} \frac{1}{|S|} \sum_{s \in S_m} P(c|s) \le \Big(1 + \frac{9|K|^2}{2(\lfloor \frac{2}{\gamma} \rfloor - 3|K| + 2)}\Big) \frac{1}{|K|}. \quad (6)$$

Abbreviate this bound as $(1 + \epsilon_p)\frac{1}{|K|}$. From Lemma 2 and expressions (4) and (6), we have

$$P(\mathrm{Recov.}) = \sum_p P(p)\,P(\mathrm{Recov.}|p) \le \sum_p P(p)\Big(\frac{N}{|S|} + (1 + \epsilon_p)\frac{1}{|K|}\Big) = \frac{1}{|K|} + \frac{N}{|S|} + \frac{1}{|K|}\sum_p P(p)\,\epsilon_p.$$

If $\max_m P_{M|p}(m) \le \frac{1}{6|K|} - \frac{1}{|S|}$, then $6|K| \le \frac{1}{\gamma}$, and hence $\lfloor \frac{2}{\gamma} \rfloor - 3|K| + 2 \ge \frac{3}{2\gamma}$. Therefore,

$$\epsilon_p \le \frac{9|K|^2}{2(\lfloor \frac{2}{\gamma} \rfloor - 3|K| + 2)} \le 3\gamma|K|^2 = \frac{3|K|^2}{|S|} + 3|K|^2 \max_m P_{M|p}(m).$$

Therefore,

$$P(\mathrm{Recov.}) \le \frac{1}{|K|} + \frac{N + 3|K|}{|S|} + 3|K| \sum_p P(p) \max_m P_{M|p}(m).$$

Applying Lemma 1, the result is immediate.

Remark 1. As mentioned in Subsection III-A, we sort the messages before partitioning. This is because, as evidenced by expression (6), the bound on $P(\mathrm{Recov.}|p)$ grows with $\gamma$. Hence, in order to get a better bound, we would like to minimize $\max_m P_{M|p}(m)$. But $\max_m P_{M|p}(m) \ge \frac{1}{N}$, with equality when the messages are uniform. Sorting the messages and then partitioning them brings the distribution in each partition close to uniform.

Remark 2. Note the tradeoff between terms (I) and (II): if we were to increase the number of messages in the partitions, the upper bound in (4) would increase, but $\gamma$ would generally decrease, resulting in a decrease of the upper bound in (6).

Remark 3. If $\max_m P_{M|p}(m) \le \frac{1}{c|K|^2}$, then from the last expression in the proof of Theorem 2,

$$P(\mathrm{Recov.}) \le \frac{1}{|K|} + \frac{N + 3|K|}{|S|} + \frac{3}{c|K|}.$$

Thus, for large $|S|$ and c, an upper bound close to $\frac{1}{|K|}$ can be obtained.

IV. CONCLUSION

We presented a message partitioning technique which can be used in a symmetric-key cryptosystem to increase its information-theoretic security. We considered a newly proposed encryption scheme called Honey Encryption, which addresses the loss in security due to the non-uniformity of the messages using a DTE block. This block uses local private randomness to map the nonuniform messages to a larger set of messages, called seeds, that are nearly uniform. We proposed a DTE scheme which performs well under limited available auxiliary randomness and combined it with our message partitioner. We derived achievable bounds on the security of message partitioning alone and of its use in conjunction with our proposed DTE scheme. Under certain conditions, the schemes nearly achieve the information-theoretic converse on the probability of success of the adversary. Adding a partitioning stage to the Honey Encryption setup can reduce the resources required to achieve a particular performance level. It is not clear whether a hybrid scheme can match the performance of partitioning alone. Since all of our results are achievability results, they do not establish the precise performance of either scheme. Converse bounds are needed for comparison, which remain future work.

V. ACKNOWLEDGMENT

This work was supported in part by MURI grant ARMY W911NF-15-1-0479 and NSF grant CCF 10-54937-CAREER.

REFERENCES

[1] Vernam, G. S., "Cipher printing telegraph systems for secret wire and radio telegraphic communications," Journal of the American Institute of Electrical Engineers, 45: 109-115, 1926.
[2] Shannon, C. E., "Communication theory of secrecy systems," Bell System Technical Journal, 28(4): 656-715, 1949.
[3] Katz, J. and Lindell, Y., "Introduction to Modern Cryptography," CRC Press, 2014.
[4] Juels, A. and Ristenpart, T., "Honey encryption: Security beyond the brute-force bound," Advances in Cryptology - EUROCRYPT 2014, Springer Berlin Heidelberg, 2014, pp. 293-310.
[5] Huang, Z., Ayday, E., Fellay, J., Hubaux, J. P. and Juels, A., "GenoGuard: Protecting genomic data against brute-force attacks," 2015 IEEE Symposium on Security and Privacy (SP), 2015.
[6] Tyagi, N., Wang, J., Wen, K. and Zuo, D., "Honey Encryption Applications," Network Security, 2015.