
Multiterminal Secret Key Agreement

Chung Chan and Lizhong Zheng

Chung Chan ([email protected]) was with the Research Laboratory of Electronics at MIT, Massachusetts Institute of Technology. He is currently with the Institute of Network Coding at the Chinese University of Hong Kong. This work is part of his Doctoral Thesis [2] at MIT, available online at [1]. Lizhong Zheng is with the Research Laboratory of Electronics at MIT, Massachusetts Institute of Technology.

Abstract—The problem of secret key agreement by public discussion is studied under a general multiterminal network where each user can both send and receive over a private channel. For the maximum achievable key rate, single-letter upper and lower bounds are derived, together with a necessary condition as well as several different sufficient conditions for tightness. The bounds are shown to match for a large class of private channels. A counter-example is also discovered, showing not only that the bounds do not match in general but also that a better cooperative scheme can narrow the gap.

I. INTRODUCTION

The idea of secret key agreement is as follows. A group of users want to communicate messages among themselves securely. They want to do so by first agreeing on a common secret key and then using the key to encrypt the messages, for instance, by the well-known one-time pad, which is provably secure in the information-theoretic sense if the key rate is no smaller than the message rate [3]. Done this way, the problem of secure communication effectively turns into the problem of agreeing on a secret key.

To obtain a secret key, the users first generate some correlated observations using private resources such as a wiretap channel or a distributed random source. In the case of a wiretap channel between two users and a wiretapper, one may apply coding as in [4, 5] to send the desired key from one user to the other securely in the presence of a wiretapper who observes a noisier channel output than the other user. If the private resource is available as a distributed source instead, the correlated source components of the users may not be readily available in the form of a secret key, because they may not be identical for different users. The solution is to allow the users to discuss further over a separate public channel so that they can agree on a common randomness by encoding their sources. This process is called information reconciliation. The resulting common randomness still may not be used as the key, however, since it is observed partly by the wiretapper, who listens to the entire public discussion. The users can instead compute the desired key as a function of the common randomness, carefully chosen to be the portion that is not leaked to the wiretapper. This is called privacy amplification.

The basic concepts of secret key agreement were first introduced in the study of quantum key distribution [6]. A general source model was given in [7, 8] involving two users who observe correlated sources. The public discussion channel is usually assumed to be authenticated, noiseless and unlimited in rate, but revealed completely to the wiretapper.

The discussion may be interactive as in [7], in the sense that the users can take turns to speak and reply to each other. It can also be restricted to be one-way as in [8], where only one user can speak. The source model was further extended to the multiterminal case in [9], where multiple users observe different components of a private distributed source and then engage in an interactive public discussion. It was shown in [9] that the maximum achievable key rate, called the secrecy capacity, can be characterized by a simple linear program when the wiretapper observes only the public discussion and possibly the observations of a subset of the users. The capacity has an intuitive interpretation as the multivariate correlation of the distributed source, and was shown to carry other operational meanings as the combinatorial notion of partition connectivity and the capacity of undirected networks [10–12].

Indeed, public discussion not only makes secret key agreement possible under a source model, but it also improves the secret key rate for the channel model, as pointed out in [13]. In particular, when the wiretapper has a less noisy observation than the legitimate user, it may not be possible to send a key both reliably and securely at a positive rate, because the wiretapper can decode whenever the user can. The users can instead generate some random sequence as the channel input, turning the channel effectively into a distributed source with the input and output sequences as the correlated but possibly different source components. Then, they can publicly discuss to reconcile their information into a common randomness and extract the secure portion as the secret key. Unless the channel output to the user is a physically degraded version of the observation of the wiretapper, the common randomness may not be entirely observed by the wiretapper, and so a secret key of strictly positive rate is possible, as illustrated in [13]. This process of turning the channel into a source is called source emulation. Compared to the source model, the additional step of source emulation makes the problem more interesting because the users can control the channel input to enhance the correlation of the observations.

The channel model involving two users was given in [7, 8]. It was extended later in [14] to a general broadcast channel with one sender and any number of receivers. Our goal is to extend the model further to a large private network, where it is more realistic to expect multiple channels relating different subsets of the users, instead of a single broadcast channel covering all users. It should cover, for instance, the usual graphical model for network coding [15], where every user in the network may both transmit and receive packets over the network. With more than one sender in the network, there may potentially be some form of cooperation in the choice of the channel inputs that can enhance the correlation beyond a pure source emulation approach of generating independent channel inputs.


In the sequel, we will extend the broadcast channel model to a general multiterminal private channel model in §II, where every user in the network may both transmit and receive over the channel. Single-letter upper and lower bounds for the secrecy capacity, referred to as the secrecy upper and lower bounds, are derived in §III in the form of certain minimax expressions. The lower bound is achieved by a new cooperative scheme called mixed source emulation. It employs the well-known idea of mixed strategy [16] by considering the minimax characterization as the payoff of a virtual zero-sum game. In §IV, we give a simple example for which mixed source emulation strictly outperforms the conventional pure source emulation approach. To understand the optimality of mixed source emulation, a necessary condition for tightness of the secrecy bounds is derived in §V, and some general sufficient conditions are derived in §VI. The secrecy upper and lower bounds match for many classes of channels but not in general. In particular, there is an example in §VII for which not only is there a gap between the bounds but the mixed source emulation approach is also shown to be suboptimal: users can adapt their channel inputs to their channel outputs to further enhance the correlation among their private observations.

After submitting this work, we were informed of a related work by Csiszár and Narayan [17] on a channel model with multiple senders. We would like to clarify below some similarities and differences.

1) For the private channel model, [17] considers the special case where each user can either transmit or receive but not both. We consider here a general multiterminal network where each user can both transmit and receive. As a result, the secrecy capacity is characterized for a wider class of channels than was previously known. In particular, Theorems 1 and 2 of [17] are special cases of the secrecy lower and upper bounds derived here.

2) [17] also considers the capacity region when different subsets of users want to share different keys. This is a meaningful extension we have not considered. We focus on the problem of generating a common secret key. In particular, Theorem 3 of [17] is not covered anywhere in our work.

3) Examples 1 and 4 in [17] are also considered here. However, we have extended the optimality of the source emulation approach to arbitrary finite linear networks and the more general homomorphic channels defined in §VI. The network coding approach to secret key agreement is also studied further in [2, 18, 19]. The result is also useful in the study of undirected networks [12] and secure network coding [20, 21].

4) The idea of mixed source emulation is also considered in [17]. While it was claimed to be suboptimal using Example 3 in [17], it is indeed optimal for that example.¹ We have an example under the special model in [17] for which the secrecy lower and upper bounds do not match. This does not directly imply that mixed source emulation is suboptimal, however, since it can be the secrecy upper bound that is loose instead of the lower bound. We have a way to modify the example under the more general channel model proposed here such that mixed source emulation can be proved suboptimal.

¹ Choose T uniformly random over {1, 2}, X_T uniformly random over {0, 1} but X_{2−T} = 0. Then, using the result of [10], the lower bound in [17, Theorem 1] can be written as min_P (1/(|P|−1)) D(P_{X_V|T} ‖ ∏_{C∈P} P_{X_C|T} | P_T), where P is a set partition of V into at least two non-empty parts. It is easy to check that the minimum is 0.5 bits, which is the secrecy capacity.

This is done by exploiting the fact that a user can both transmit and receive over the private channel, and so can adapt his channel input to his output. For the special model in [17], however, this is not possible, and so it remains open whether mixed source emulation is optimal in that case.

II. GENERAL MULTITERMINAL NETWORK

As in [9], the model consists of a wiretapper and a set V := [m] = {1, . . . , m} of m users, which are further categorized as active users, untrusted users and helpers as follows.

Active users: A ⊆ V with |A| ≥ 2 denotes the set of at least two active users who want to share a secret key.

Untrusted users: D ⊆ A^c := V \ A denotes the (possibly empty) set of untrusted users. They help the active users agree on a secret key but cannot keep any secret from the wiretapper.

Helpers: V \ (A ∪ D) corresponds to the remaining trusted users who help generate the secret key.

The users have access to a private discrete memoryless multiterminal channel (DMMC), characterized by the transition probability P_{Y_V|X_V}, where X_V := {X_i : i ∈ V} and Y_V denote the collections of channel inputs and outputs respectively, with the i-th elements X_i and Y_i for user i ∈ V. It covers the source model in [9] as a special case when the users may receive but not transmit. It also covers the multiterminal broadcast channel model in [14] as a special case when there is at most one transmitting terminal, which is not allowed to receive. The more general DMMC raises the following new questions: (a) how should the users coordinate their channel inputs to enhance the correlation of their private observations? (b) how should the users adapt their inputs to their outputs?

We will also consider continuous channels, which arise in practical scenarios such as wireless communication. Every channel input and output symbol is assumed to be a mixture of discrete and continuous random variables, as in the framework for the generalized entropy in [22] described in ‡A in the appendices. The channel input sequences may also be subject to additional constraints such as the usual average power constraint.

In addition to the private DMMC, each user can also broadcast messages to everyone at any time over an authenticated public channel, noiselessly and at unlimited rate, each message being finitely valued. The wiretapper observes all the messages transmitted over the public channel as well as the private knowledge of the untrusted users, including their inputs and outputs of the private DMMC. After n uses of the private DMMC, the active users want to recover a common secret key that is uniformly distributed and independent of the wiretapped information asymptotically as the block length n increases. The detailed protocol is as follows.

As shown in Fig. 1, the secret key agreement protocol is divided into three main phases: 1) randomization, 2) transmission, and 3) key generation. The objective of the randomization phase is to create the randomness for the channel inputs and public messages later in the transmission phase.


[Fig. 1. Timeline for the secret key agreement protocol with A = [2], D = {4} and V = [4]: the randomization phase at t = 0 (public randomization U_0, private randomizations U_1, . . . , U_4), the transmission phase for t = 1 to n (private uses of the DMMC P_{Y_V|X_V} producing the inputs X_{it} and outputs Y_{it}, interleaved with r rounds of public discussion F_{t1}, . . . , F_{tr}), and the key generation phase at t = n + 1 (keys K_1, K_2). User T_i acts on the accumulated knowledge (U_0, U_i, Y_i^{t−1}, F^{t−1}). Entries marked with ∗ (in red) are observed by the wiretapper.]

[Fig. 2. Randomization: T_i denotes user i ∈ V. U_0 ∼ P_{U_0} is public, while each U_i ∼ P_{U_i|U_0} is private to T_i.]

[Fig. 3. Private channel use: T_i generates the channel input X_{it} at time t using his accumulated knowledge (U_0, U_i, Y_i^{t−1}, F^{t−1}).]

Randomization phase:

Public randomization: At time t = 0, the users publicly randomize by agreeing on a public continuous random variable U_0, known also to the wiretapper. For definiteness, we can have user 1 generate U_0 without loss of generality.

Private randomization: Every user i ∈ V then generates privately a continuous random variable U_i such that the U_i's are conditionally independent given the public randomization U_0, i.e.

    P_{U_0 U_V} = P_{U_0} ∏_{i∈V} P_{U_i|U_0}.    (2.1)
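The factorization (2.1) is easy to mirror in simulation. The following minimal sketch (in Python; the Gaussian choices are illustrative assumptions, since the model only requires U_0 and the U_i's to be continuous random variables) samples a public U_0 and private U_i's that are conditionally independent given U_0.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4                                   # number of users, V = {1, ..., m} (example value)

# Public randomization: user 1 draws U0 and announces it, so the wiretapper sees it too.
U0 = rng.normal()                       # any continuous P_U0 works; Gaussian is an assumption

# Private randomization: each user i draws U_i from P_{U_i|U_0} using private randomness,
# so the joint law factorizes as P_{U_0 U_V} = P_{U_0} * prod_i P_{U_i|U_0}, as in (2.1).
U = {i: rng.normal(loc=U0) for i in range(1, m + 1)}
print(U0, U)
```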

The randomization phase is illustrated in Fig. 2. In the transmission phase, the users use the private DMMC n times and publicly discuss after every channel use for an arbitrary finite number r of rounds, so that they can decide on the next channel input or generate the secret key in the final key generation phase.

Transmission phase – private channel use:

1) As illustrated in Fig. 3, user i ∈ V chooses at every time t ∈ [n] := {1, . . . , n} a valid input X_{it} as a function, denoted X_{it}(· · · ), of his accumulated knowledge, i.e.

    X_{it} = X_{it}(U_0, U_i, Y_i^{t−1}, F^{t−1}) ∈ 𝒳_i    (2.2)

where Y_i^{t−1} := (Y_{iτ} : τ ∈ [t − 1]) and F^{t−1} := (F_τ : τ ∈ [t − 1]) are the vectors of accumulated observations and public messages respectively. The input need not be subject to the finite-alphabet constraint that 𝒳_i is a finite set. It may be infinitely-valued, having discrete components with unbounded support or continuous components with absolutely continuous probability measures. It may also be subject to the sample average constraint in Definition 2.1.

2) After the channel input X_{Vt} is completely specified, the channel generates Y_{Vt} according to P_{Y_V|X_V} and returns Y_{it} privately to each user i ∈ V.

Definition 2.1 The sample average constraint is characterized by a finite set of functions ϕ_i : 𝒳_i → ℝ^ℓ indexed by i ∈ V. It is satisfied by the channel input sequence X_V^n if, for every n,

    Pr{ (1/n) ∑_{t∈[n]} ϕ_i(X_{it}) ≤ δ_n · 1, ∀i ∈ V } = 1    (2.3)

for some δ_n → 0, where 1 is a vector of ℓ ones and ≤ denotes element-wise inequality. This specializes to the usual average power constraint for ϕ_V quadratic in the real-valued components of the channel input. □
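To illustrate Definition 2.1, the following sketch checks the sample average form of the power constraint for an iid input; the power budget P and the Gaussian input law are assumptions for the example, not part of the definition.

```python
import numpy as np

rng = np.random.default_rng(1)
n, P = 10_000, 2.0                      # block length and power budget (example values)
phi = lambda x: x**2 - P                # quadratic phi: the average power constraint

X = rng.normal(scale=np.sqrt(P), size=n)    # iid input with E[phi(X)] = 0
sample_avg = phi(X).mean()                  # (1/n) * sum_{t in [n]} phi(X_t) as in (2.3)
print(sample_avg)                           # concentrates near 0, hence <= delta_n for some delta_n -> 0
```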

Transmission phase – public discussion: Right after the private channel use at time t ∈ [n], the users engage in an interactive authenticated public discussion. Let r be the total number of rounds, and i_j be the user that speaks at the j-th round. Then, at the j-th round of time t, user i_j generates and broadcasts the following public message noiselessly as a finitely-valued function of his accumulated knowledge:

    F_{tj} := F_{tj}(U_0, U_{i_j}, Y_{i_j}^t, F^{t−1}, F_{t[j−1]})    (2.4)

where F^{t−1} := (F_τ : τ ∈ [t − 1]) and F_t := F_{t[r]} is the vector of public messages at time t after the t-th private channel use. The public discussion is allowed to be interactive in the sense that F_{tj} can be a function of the previous messages F_{t[j−1]}. It is authenticated in the sense that every user knows the sender of every public message and the wiretapper cannot tamper with the messages.

In the key generation phase, each active user generates an individual secret key based on the accumulated knowledge of his public and private observations. The goal is to maximize the key rate (bits per private channel use) under two constraints: 1) the recoverability condition that the individual keys are the same with high probability, and 2) the secrecy condition that the keys are close to uniformly random even given the accumulated knowledge of the wiretapper.

Key generation phase: At time n + 1, every active user i ∈ A generates a finitely-valued individual key K_i ∈ 𝒦 as

    K_i := K_i(U_0, U_i, Y_i^n, F^n) ∈ 𝒦    (2.5)

such that the K_i's are asymptotically the same and secure. More precisely, there exists a random variable K ∈ 𝒦 satisfying

    Pr{∃i ∈ A, K_i ≠ K} ≤ ϵ_n → 0    (recoverability)    (2.6)
    s_div ≤ δ_n → 0    (secrecy)    (2.7)

for some non-negative sequences ϵ_n and δ_n, where s_div is the secret leakage rate defined as

    s_div := (1/n) D(P_{K|F^n Y_D^n U_D U_0} ‖ U_𝒦)    (2.8a)
           = (1/n) [log|𝒦| − H(K|F^n Y_D^n U_D U_0)]    (2.8b)

where U_𝒦 denotes the uniform distribution over 𝒦, D is the information divergence and H is the entropy [23].
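To make the secrecy condition concrete, the following toy computation evaluates (2.8b) for n = 1, with an invented joint distribution of the key K and a one-symbol wiretapper observation W standing in for (F^n, Y_D^n, U_D, U_0).

```python
import numpy as np

# Invented joint pmf of (K, W): K is a 1-bit key, W a 1-symbol wiretapper view.
P = np.array([[0.30, 0.20],
              [0.20, 0.30]])            # P[k, w]

P_W = P.sum(axis=0)
H_K_given_W = -sum(P[k, w] * np.log2(P[k, w] / P_W[w])
                   for k in range(2) for w in range(2))
s_div = 1.0 - H_K_given_W               # (1/n)[log|K| - H(K|W)] with n = 1 and log|K| = 1 bit
print(s_div)                            # ~0.029 > 0: this key leaks; a valid scheme drives s_div to 0
```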

n.b. the complete knowledge of user i ∈ V is

    (U_0, U_i, X_i^n, Y_i^n, F^n, K_i)

and the complete knowledge of the wiretapper is

    (U_0, U_D, X_D^n, Y_D^n, F^n).

We can remove the channel input X_i^n and the individual key K_i without loss of generality, since they can be determined from the remaining knowledge of each entity according to the causal relations (2.2) and (2.5).

Definition 2.2 (Secrecy capacity) We use the term secrecy scheme to refer to the choice, as a sequence in n, of

1) the distributions P_{U_0} and (P_{U_i|U_0}, i ∈ D^c) for the public and private randomizations respectively,
2) the private channel input functions (X_{it} : i ∈ V, t ∈ [n]),
3) the public discussion functions (F_{tj} : t ∈ [n], j ∈ [r]), including the choice of r and the order (i_j ∈ V : j ∈ [r]) of discussion, and
4) the key functions (K_i : i ∈ A), including the choice of the set 𝒦 of possible keys.

For notational simplicity, we have made the dependence on the block length n implicit; 𝒦, for instance, grows exponentially in n for the case of interest. Any non-negative rate R is said to be achievable if R ≤ lim inf_{n→∞} (1/n) log|𝒦| for a secrecy scheme satisfying the recoverability (2.6) and secrecy (2.7) constraints. It is said to be strongly achievable if ϵ_n, δ_n → 0 exponentially in n. It is said to be perfectly achievable if ϵ_n, δ_n = 0 for sufficiently large n. The largest achievable key rate is called the secrecy capacity,

    C_s := sup lim inf_{n→∞} (1/n) log|𝒦|.    (2.9)

Upper and lower bounds on the secrecy capacity will be referred to as the secrecy upper and lower bounds respectively. □

For simplicity, we will not consider the achievable convergence rates of the error ϵ_n and the leaked information δ_n. However, this is possible as in [24, 25] by extending the privacy amplification theorem in [26, 27] to the multiterminal case. Perfect secrecy capacity is also studied in [18, 19, 28] but will not be considered here. It is also possible to consider a more general wiretap model, such as the statistical model in [9] with an additional output of the private DMMC to the wiretapper, or the universal model like [20] with the wiretapper choosing one out of a given class of functions of the transmitted data to observe. One may also consider an active intruder who can control an input of the private DMMC, tamper with, or inject fraudulent messages into the transmitted data as in [29]. For simplicity, however, we will not consider such generalizations here because, even under the simplest two-user source model in [7], the secrecy capacity is unknown under the statistical wiretap model. Readers may refer to [30, 31] for techniques of bounding the capacity under the statistical wiretap model.

III. SECRECY CAPACITY

We will derive upper and lower bounds on the secrecy capacity systematically under the following special models.

Finitely-valued model: The channel input and output alphabets are all finitely-valued.

Infinitely-valued source model: The users can receive infinitely-valued output but cannot send any channel input.

Channel model with finite-input alphabet only: The channel input alphabets are all finite but the output alphabets are not necessarily finite.

The solutions to these special cases and the finitely-valued source model in [9] will be used to compose the solution to the general infinitely-valued channel model with the sample average constraint.

A. Secrecy upper bound

We first derive single-letter upper bounds on the secrecy capacity. To simplify the discussion here, some of the mathematical tools are introduced in the appendices. In particular, we will use the information measures defined in ‡A, the Shearer-type lemma in ‡C and the secrecy expressions in ‡F.


From the secrecy condition (2.7) and (2.8b), we have

    (1/n) log|𝒦| ≤ (1/n) H(K | F^n Y_D^n U_D U_0) + δ_n.

The objective is to single-letterize this upper bound by replacing K, F^n, U_0 and U_D with the channel statistics P_{Y_V|X_V}. We first eliminate the dependence on K as follows:

    H(K | F^n Y_D^n U_D U_0) = H(K | F^n Y_V^n U_V U_0) + I(K ∧ Y_{D^c}^n U_{D^c} | F^n Y_D^n U_D U_0).

The first entropy term on the R.H.S. is negligible (sublinear in n) by Fano's inequality [23].

Fano's inequality:

    H(K | F^n Y_i^n U_i U_0) ≤ h(ϵ_n) + ϵ_n log|𝒦|    ∀i ∈ A    (3.1)

where h(p) is the binary entropy function,

    h(p) := −(1 − p) log(1 − p) − p log p.    (3.2)

This is because the conditioning (F^n, Y_i^n, U_i, U_0) determines the individual key K_i for every active user i ∈ A by (2.5), which in turn determines K almost surely by (2.6). The remaining mutual information I(K ∧ Y_{D^c}^n U_{D^c} | F^n Y_D^n U_D U_0) equals

    H(U_{D^c} Y_{D^c}^n | F^n Y_D^n U_D U_0) − H(U_{D^c} Y_{D^c}^n | K F^n Y_D^n U_D U_0)

by the definition of mutual information. The last entropy term can be bounded by the weak form of the Shearer-type lower bound (C.3a) as

    H(U_{D^c} Y_{D^c}^n | K F^n Y_D^n U_D U_0) ≥ ∑_{B∈H_{A|D}} λ_B H(U_B Y_B^n | K F^n Y_{B^c}^n U_{B^c} U_0)

where λ ∈ Λ_{A|D} is any fractional partition defined in (C.2) over the hypergraph H_{A|D} defined in (C.1). While the bound holds for fractional partitions of any hypergraph, the reason we restrict to H_{A|D} is to ensure that B^c in the resulting expression intersects A. This in turn ensures that the conditioning in each conditional entropy expression determines at least one individual key by (2.5), so that Fano's inequality (3.1) applies. More precisely,

    ∑_{B∈H_{A|D}} λ_B H(U_B Y_B^n | K F^n Y_{B^c}^n U_{B^c} U_0)
      = ∑_B λ_B [H(U_B Y_B^n | F^n Y_{B^c}^n U_{B^c} U_0) − I(K ∧ U_B Y_B^n | F^n Y_{B^c}^n U_{B^c} U_0)]
      ≥ ∑_B λ_B H(U_B Y_B^n | F^n Y_{B^c}^n U_{B^c} U_0) − |D^c| max_B H(K | F^n Y_{B^c}^n U_{B^c} U_0)

where the last inequality is obtained by rewriting the mutual information I(K ∧ · | ∗) = H(K | ∗) − H(K | · ∗), with · and ∗ denoting (U_B, Y_B^n) and (F^n, Y_{B^c}^n, U_{B^c}, U_0) respectively. We have I(K ∧ · | ∗) ≤ H(K | ∗) because K is discrete and so H(K | · ∗) ≥ 0 by (A.5). Moreover, ∑_B λ_B H(K | ∗) ≤ |D^c| max_B H(K | ∗) because

    ∑_B λ_B ≤ ∑_{i∈D^c} ∑_{B∋i} λ_B = |D^c|

by the constraint (C.2) on fractional partitions that ∑_{B∋i} λ_B = 1. The last entropy term H(K | ∗) is negligible by Fano's inequality (3.1). Thus, we have eliminated the dependence on K as desired, i.e. for some δ_n′ → 0,

    (1/n) log|𝒦| ≤ (1/n) [H(U_{D^c} Y_{D^c}^n | F^n Y_D^n U_D U_0) − ∑_{B∈H_{A|D}} λ_B H(U_B Y_B^n | F^n Y_{B^c}^n U_{B^c} U_0)] + δ_n′.

It remains to eliminate the dependence on the randomizations and public messages. Rewrite the last inequality as

    (1/n) log|𝒦| ≤ δ_n′ + (1/n) [∑_B λ_B H(U_{B^c} Y_{B^c}^n F^n | U_0) − H(U_D Y_D^n F^n | U_0) − (∑_B λ_B − 1) H(U_V Y_V^n | U_0)]    (3.3)

using the expansion

    H(U_B Y_B^n | F^n Y_{B^c}^n U_{B^c} U_0) = H(U_V Y_V^n | U_0) − H(U_{B^c} Y_{B^c}^n F^n | U_0)

and the same expression with B replaced by D^c and B^c replaced by D. This equation follows from the chain rule and H(U_V Y_V^n F^n | U_0) = H(U_V Y_V^n | U_0), since (U_0, U_V, Y_V^n) completely determines F^n by an inductive argument on (2.4). To single-letterize the bound, the entropy terms in (3.3) are further expanded causally as follows.

Causal expansion:

    H(U_{B^c} Y_{B^c}^n F^n | U_0) = H(U_{B^c} | U_0) + ∑_{t∈[n]} [H(Y_{B^c t} | F^{t−1} Y_{B^c}^{t−1} U_{B^c} U_0) + H(F_t | F^{t−1} Y_{B^c}^t U_{B^c} U_0)]    (a)
    H(U_D Y_D^n F^n | U_0) = H(U_D | U_0) + ∑_{t∈[n]} [H(Y_{Dt} | F^{t−1} Y_D^{t−1} U_D U_0) + H(F_t | F^{t−1} Y_D^t U_D U_0)]    (b)
    H(U_V Y_V^n F^n | U_0) = H(U_V | U_0) + ∑_{t∈[n]} H(Y_{Vt} | F^{t−1} Y_V^{t−1} U_V U_0)    (c)

(a) by the chain rule expansion in the causal order illustrated in Fig. 1.
(b) same as (a) with B^c replaced by D.
(c) same as (a) with B^c replaced by V. We have also used (2.4), namely that (U_0, U_V, Y_V^t) completely determines F_t, which implies that H(F_t | F^{t−1} Y_V^t U_V U_0) = 0.


After applying these causal expansions to (3.3), we can regroup similar terms and simplify them using the Shearer-type lemma as follows.

Applying the Shearer-type lemma:

    ∑_B λ_B H(U_{B^c} | U_0) − H(U_D | U_0) − (∑_B λ_B − 1) H(U_V | U_0) =(a) H(U_{D^c} | U_0) − ∑_B λ_B H(U_B | U_{B^c} U_0) =(b) 0
    ∑_B λ_B H(F_t | F^{t−1} Y_{B^c}^t U_{B^c} U_0) − H(F_t | F^{t−1} Y_D^t U_D U_0) ≤(c) 0

(a) by the conditional independence (2.1) of the U_i's given U_0.
(b) by the equality case (C.3b) of the Shearer-type lemma.
(c) by (C.3c) of the Shearer-type lemma for the causal relation (2.4), where F_{[r]} denotes the r-round public discussion.

Putting these together, (3.3) becomes

    (1/n) log|𝒦| ≤ (1/n) ∑_{t∈[n]} [∑_B λ_B H(Y_{B^c t} | F^{t−1} Y_{B^c}^{t−1} U_{B^c} U_0) − H(Y_{Dt} | F^{t−1} Y_D^{t−1} U_D U_0) − (∑_B λ_B − 1) H(Y_{Vt} | F^{t−1} Y_V^{t−1} U_V U_0)] + δ_n′.    (3.4)

We can now eliminate the randomizations and public messages in favor of the channel inputs as follows.

Inserting the channel input: Define

    Q_t := (F^{t−1}, Y_D^{t−1}, U_D, U_0).    (3.5)

Then, the entropy terms in (3.4) become

    H(Y_{B^c t} | F^{t−1} Y_{B^c}^{t−1} U_{B^c} U_0) =(a) H(Y_{B^c t} | X_{B^c t} F^{t−1} Y_{B^c}^{t−1} U_{B^c} U_0) ≤(b) H(Y_{B^c t} | X_{B^c t} Q_t)
    H(Y_{Dt} | F^{t−1} Y_D^{t−1} U_D U_0) =(c) H(Y_{Dt} | X_{Dt} Q_t)
    H(Y_{Vt} | F^{t−1} Y_V^{t−1} U_V U_0) =(d) H(Y_{Vt} | X_{Vt} F^{t−1} Y_V^{t−1} U_V U_0) =(e) H(Y_{Vt} | X_{Vt})

(a) by the causal relation (2.2).
(b) by the fact that conditioning reduces entropy (A.4).
(c) same as (a) with B^c replaced by D.
(d) same as (a) with B^c replaced by V.
(e) by the memorylessness assumption of the DMMC.

Substituting these into (3.4), we have

    (1/n) log|𝒦| ≤ (1/n) ∑_{t∈[n]} [∑_B λ_B H(Y_{B^c t} | X_{B^c t} Q_t) − H(Y_{Dt} | X_{Dt} Q_t) − (∑_B λ_B − 1) H(Y_{Vt} | X_{Vt})] + δ_n′.

Tightening the bound by minimizing over λ ∈ Λ_{A|D} and then optimizing over all achievable schemes, we have

    sup (1/n) log|𝒦| ≤ sup inf_λ (1/n) ∑_{t∈[n]} [∑_B λ_B H(Y_{B^c t} | X_{B^c t} Q_t) − H(Y_{Dt} | X_{Dt} Q_t) − (∑_B λ_B − 1) H(Y_{Vt} | X_{Vt})] + δ_n′

where the supremum on the R.H.S. is over P_{X_V^n|Q^n} and P_{Q^n}, with P_{X_V^n|Q^n} restricted to a collection of valid input distributions, and Q^n relaxed from (3.5) to any mixture of discrete and continuous random variables that satisfies the Markov chain

    Q_t ↔ X_{Vt} ↔ Y_{Vt}    ∀t ∈ [n].    (3.6)

This Markov chain comes from the memorylessness property of the DMMC with the original definition (3.5) of Q_t. Since exchanging the sup and inf can only increase the bound,

    sup (1/n) log|𝒦| ≤ inf_λ sup_{P_{Q^n}, P_{X_V^n|Q^n}} (1/n) ∑_{t∈[n]} E[α(λ, P_{X_{Vt}|Q^n}(·|Q^t))] + δ_n′

by the definition of α in (F.1a) and the Markov property (3.6). It is optimal to choose Q^n deterministic, by the trivial fact that the supremum of α over Q^n is always no less than any averaging over Q^n. In summary, we have

    sup (1/n) log|𝒦| ≤ inf_λ sup_{P_{X_V^n}} (1/n) ∑_{t∈[n]} α(λ, P_{X_{Vt}}) + δ_n′.    (3.7)

Theorem 3.1 (Finite-input-alphabet constraint) If the input to the private channel is subject to the finite-input-alphabet constraint only,² then the secrecy capacity (2.9) satisfies

    C_s ≤ min_{λ∈Λ_{A|D}} max_{P_{X_V}∈P(𝒳_V)} α(λ, P_{X_V})    (3.8)

where α is defined in (F.1), and the fractional partition λ ∈ Λ_{A|D} is defined in (C.2) over the hypergraph H_{A|D} in (C.1). P(𝒳_V) denotes the set of all distributions over 𝒳_V. □

² The output can be continuous even though the input has to be finitely valued.

PROOF We specialize (3.7) to the case where the channel input symbols are subject to the finite-alphabet constraint only:

    sup (1/n) log|𝒦| ≤(a) inf_λ (1/n) ∑_{t∈[n]} sup_{P_{X_{Vt}}} α(λ, P_{X_{Vt}}) + δ_n′
      =(b) inf_λ sup_{P_{X_V}} α(λ, P_{X_V}) + δ_n′
      =(c) min_λ max_{P_{X_V}} α(λ, P_{X_V}) + δ_n′

(a) We can push the supremum inside the summation in (3.7) because
- the t-th summand α(λ, P_{X_{Vt}}) depends on P_{X_V^n} only through P_{X_{Vt}}, and
- the finite-alphabet constraint on the support of P_{X_V^n} is separable into independent finite-alphabet constraints on the supports of the P_{X_{Vt}}.


(b) By symmetry, α(λ, P_{X_{Vt}}) has the same supremum independent of t.
(c) The infimum and supremum can be replaced by the minimum and maximum, since α is a continuous function over the compact set Λ_{A|D} × P(𝒳_V). See [32] for a detailed derivation of the continuity of information measures.

Finally, taking the limit as n → ∞ gives the desired bound (3.8). ∎

Example 3.1 Consider V = [3], A = [2], D = {3}, and the private DMMC P_{Y|X_1 X_2}. The active users 1 and 2 control the finitely-valued channel inputs X_1 and X_2 respectively, and the untrusted user 3 observes Y. By (3.8), the secrecy upper bound simplifies to

    max_{P_{X_1 X_2} ∈ P(𝒳_1 × 𝒳_2)} [I(X_1 ∧ X_2 | Y) − I(X_1 ∧ X_2)].    (3.9)

n.b. the minimization in (3.8) is trivial since Λ_{A|D} is a singleton, containing only one fractional partition λ, namely the one with λ_{{1}} = λ_{{2}} = 1. Consider, in particular, the binary multiple access channel Y = X_1 ⊕ X_2. Since I(X_1 ∧ X_2 | Y) ≤ H(X_1) ≤ 1 in (3.9), the secrecy upper bound is 1 bit, with the optimal distribution P_{X_1 X_2}(x_1, x_2) = Bern_{1/2}(x_1) Bern_{1/2}(x_2), where Bern_p denotes the Bernoulli distribution with probability p. □
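The 1-bit upper bound for the binary multiple access channel can be checked numerically. The sketch below (Python) evaluates the objective of (3.9) at the uniform independent input distribution.

```python
import numpy as np
from itertools import product

def H(probs):
    p = np.array([q for q in probs if q > 0])
    return -(p * np.log2(p)).sum()      # entropy in bits

# Joint pmf of (X1, X2, Y) for Y = X1 xor X2 with independent uniform inputs.
pmf = {(x1, x2, x1 ^ x2): 0.25 for x1, x2 in product([0, 1], repeat=2)}

def Hsub(keep):                         # entropy of a sub-collection of the three variables
    out = {}
    for k, v in pmf.items():
        kk = tuple(k[i] for i in keep)
        out[kk] = out.get(kk, 0) + v
    return H(out.values())

I_cond = Hsub([0, 2]) + Hsub([1, 2]) - Hsub([2]) - Hsub([0, 1, 2])  # I(X1 ; X2 | Y)
I_marg = Hsub([0]) + Hsub([1]) - Hsub([0, 1])                       # I(X1 ; X2) = 0 here
print(I_cond - I_marg)                  # 1.0 bit, attaining the upper bound (3.9)
```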

For the more general case when the input can be a mixture of continuous and discrete random variables subject to the sample average constraint in (2.3), we will derive a weaker secrecy upper bound that uses the following moment constraint, which can be regarded as the single-letterization of the sample average constraint.

Definition 3.1 (Moment constraint) X_V satisfies the moment constraint ϕ_i : 𝒳_i → ℝ^ℓ for i ∈ V if

    E[ϕ_i(X_i)] ≤ 0    ∀i ∈ V.    (3.10)

It specializes to the power constraint if ϕ_V is quadratic in the real-valued components of X_V. □

Theorem 3.2 (Sample average constraint) With the sample average constraint in (2.3) but not necessarily the finite-input-alphabet constraint, i.e. allowing the input to be continuous, we have the following secrecy upper bound,

    C_s ≤ inf_{λ∈Λ_{A|D}} sup_{P_{X_V}} min_{i∈D^c} α_i(λ, P_{X_V})    (3.11)

where α_i is defined in (F.4) and the input distribution is subject to the moment constraint in (3.10). □

PROOF By (F.8b), α(λ, P_{X_V}) ≤ min_{i∈D^c} α_i(λ, P_{X_V}), and so (3.7) can be weakened to

    sup (1/n) log|𝒦| ≤ inf_λ sup_{P_{X_V^n}} (1/n) ∑_{t∈[n]} min_{i∈D^c} α_i(λ, P_{X_{Vt}}) + δ_n′    (3.12)

where the t-th summand defines f(λ, P_{X_{Vt}}) := min_{i∈D^c} α_i(λ, P_{X_{Vt}}). The function f so defined is concave in the input distribution, because α_i is concave according to Corollary F.1, and the minimum and average of concave functions are concave. By Jensen's inequality, (1/n) ∑_{t∈[n]} f(λ, P_{X_{Vt}}) ≤ f(λ, (1/n) ∑_{t∈[n]} P_{X_{Vt}}). With P̄_{X_V} := (1/n) ∑_{t∈[n]} P_{X_{Vt}}, (3.12) becomes

    sup (1/n) log|𝒦| ≤ inf_λ sup min_{i∈D^c} α_i(λ, P̄_{X_V}) + δ_n′ ≤ inf_λ sup_{P_{X_V}} min_{i∈D^c} α_i(λ, P_{X_V}) + δ_n′

where P_{X_V} in the last supremum is subject to the moment constraint (3.10), because P̄_{X_V} satisfies the moment constraint. To see this, consider X_i distributed as P̄_{X_i} and X_{it} distributed as P_{X_{it}} for t ∈ [n]. Then,

    E[ϕ_i(X_i)] = E[(1/n) ∑_{t∈[n]} ϕ_i(X_{it})] ≤ δ_n · 1

where the last inequality follows directly from the sample average constraint (2.3). Finally, taking the limit as n → ∞ gives the desired bound (3.11). We can also rewrite the infimum in λ as a minimum, since sup_{P_{X_V}} min_i α_i(λ, P_{X_V}) is continuous in λ over the compact set Λ_{A|D}. ∎

Example 3.2 Consider, as in Example 3.1, the case V = [3], A = [2], D = {3} and the DMMC P_{Y|X_1 X_2}, where the active terminals 1 and 2 control X_1 and X_2 respectively under the sample average constraint ϕ_{[2]} in (2.3), and the untrusted terminal 3 observes Y. By (3.11), the secrecy upper bound simplifies to

    max_{P_{X_1 X_2} ∈ P(𝒳_1 × 𝒳_2)} min{I(X_1 ∧ Y | X_2), I(X_2 ∧ Y | X_1)}    (3.13)

where P_{X_1 X_2} is subject to the moment constraints E[ϕ_1(X_1)] ≤ 0 and E[ϕ_2(X_2)] ≤ 0.

Consider, in particular, the Gaussian multiple access channel Y = X_1 + X_2 + N, where N ∼ N_{0,1} is a zero-mean unit-variance Gaussian channel noise. The input X_i is subject to the average power constraint ϕ_i(x_i) = x_i² − P_i for some given P_i > 0. The secrecy upper bound becomes ½ log(1 + min{P_1, P_2}), with the optimal distribution P_{X_1 X_2} = N_{0,P_1} N_{0,P_2}. □

Problem: Can the secrecy upper bound for the general case be improved to (3.8) with the input distribution subject to the moment constraint in (3.10)? n.b. this holds if one could prove quasiconcavity of α in the input distribution. The improvement, if possible, is strict, since there exist examples for which the weakening from α to α_i in the current proof is strict. E.g., consider the DMMC

    Y_3 = (X_1, X_2) ∈ {0, 1}²

with active users 1 and 2 being the only transmitters, and untrusted user 3 being the only receiver. The secrecy upper bound (3.8) gives 0 but (3.11) gives 1.
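The claimed gap can be verified numerically. The following sketch evaluates both simplified bounds, (3.9) for (3.8) and (3.13) for (3.11), at uniform independent inputs for the channel Y_3 = (X_1, X_2); since I(X_1 ∧ X_2 | Y_3) = 0 for every input distribution, the value 0 computed below is in fact the maximum of (3.9).

```python
import numpy as np
from itertools import product

def H(probs):
    p = np.array([q for q in probs if q > 0])
    return -(p * np.log2(p)).sum()

# Uniform independent inputs; the output is Y3 = (X1, X2) itself.
pmf = {(x1, x2, (x1, x2)): 0.25 for x1, x2 in product([0, 1], repeat=2)}

def Hsub(keep):
    out = {}
    for k, v in pmf.items():
        kk = tuple(k[i] for i in keep)
        out[kk] = out.get(kk, 0) + v
    return H(out.values())

b39 = (Hsub([0, 2]) + Hsub([1, 2]) - Hsub([2]) - Hsub([0, 1, 2])) \
    - (Hsub([0]) + Hsub([1]) - Hsub([0, 1]))                             # objective of (3.9)
b313 = min(Hsub([1, 2]) - Hsub([1]) - (Hsub([0, 1, 2]) - Hsub([0, 1])),
           Hsub([0, 2]) - Hsub([0]) - (Hsub([0, 1, 2]) - Hsub([0, 1])))  # objective of (3.13)
print(b39, b313)                        # 0.0 and 1.0: the weakening from alpha to alpha_i is strict
```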


B. Secrecy lower bound

We now derive single-letter lower bounds on the secrecy capacity using a secrecy scheme called mixed source emulation, which is motivated partly by the pure source emulation approach in [14], and partly by the idea of mixed strategy [16] in zero-sum games. To simplify the discussion here, we will again introduce the mathematical tools separately in the appendices, namely the uniform quantization in ‡B, the minimax-type lemma in ‡D and the support lemma in ‡E. We will first consider the finitely-valued model, where all the input and output symbols of the private DMMC are subject to the finite-alphabet constraint only. The result will then be extended to the more general infinitely-valued model with the sample average constraint (2.3) by the usual quantization trick.

Definition 3.2 (Source emulation for finitely-valued model) The mixed source emulation approach for the finitely-valued model is the following specialization of the secrecy scheme in Definition 2.2.

Mixed source emulation:

Public randomization: User 1 publicly randomizes U_0 = (Q^n, X_D^n), where each symbol (Q_t, X_{Dt}) is iid as P_{QX_D} over t ∈ [n]. Q^n is called an auxiliary iid source component, taking values from an arbitrary finite alphabet.

Private randomization: Every trusted user i ∈ D^c privately generates X_i^n such that its symbol X_{it} is iid as P_{X_i|X_D Q}(·|X_{Dt} Q_t) over t ∈ [n].

Transmission phase: X_{it} is sent as the input to the private DMMC at time t ∈ [n] by user i ∈ V. Since the channel input does not adapt to the accumulated knowledge by definition, it is unnecessary to perform any public discussion before the last private channel use at time n.

The pure source emulation approach is the special case of the mixed source emulation approach with Q deterministic. □

This is called source emulation since the channel input is chosen to be memoryless, which effectively turns the DMMC into a discrete memoryless multiple source (DMMS). The auxiliary random variable Q acts as a mixing random variable that mixes different conditional input distributions P_{X_V|Q} in time. It can also be regarded as an auxiliary component source of a dummy untrusted user, since Q^n is known in public. It gives an additional correlation among the input sequences privately generated by the trusted users.

Definition 3.3 (Public input adaptation) If we allow the channel input to adapt to any public information, we have the public input adaptation approach.

Public input adaptation: Set the private randomization as U_i := (U_{it} : t ∈ [n]) such that the U_{it}'s are independent over i ∈ V and t ∈ [n]. Then, the input is chosen as

    X_{it} = X_{it}(U_0, U_{it}, F^{t−1}).    (3.14)

Given the accumulated public information (U_0, F^{t−1}), the remaining randomness of X_{it} comes from U_{it} and is therefore conditionally independent over i ∈ D^c. Mixed source emulation is the special case of public input adaptation without the dependence on F^{t−1} in (3.14). □

Theorem 3.3 (Finitely-valued) For the finitely-valued case where all channel inputs and outputs are subject to the finite-alphabet constraint only, we have the secrecy lower bound expressed in terms of β̃ in (F.2),

    C_s ≥ max_{P_{QX_V} = P_{QX_D} × ∏_{i∈D^c} P_{X_i|X_D Q}} min_{λ∈Λ_{A|D}} E_Q[β̃(λ, P_{X_V|Q}(·|Q))]    (3.15a)
       = min_{λ∈Λ_{A|D}} max_{P_{X_V} = P_{X_D} × ∏_{i∈D^c} P_{X_i|X_D}} β̃(λ, P_{X_V})    (mixed)    (3.15b)
       ≥ max_{P_{X_V} = P_{X_D} × ∏_{i∈D^c} P_{X_i|X_D}} min_{λ∈Λ_{A|D}} β̃(λ, P_{X_V}).    (pure)    (3.15c)

The bound (3.15b) is the largest (strongly) achievable key rate for the mixed source emulation approach in Definition 3.2 and, more generally, for any public input adaptation scheme in Definition 3.3. Furthermore, it is admissible to have the alphabet set 𝒬 of the auxiliary source component Q satisfy the cardinality bound in (E.1a). The weakened bound (3.15c) is the largest (strongly) achievable key rate for the pure source emulation where Q is chosen to be deterministic. □

PROOF The mixed source emulation effectively turns the DMMC into a DMMS where user i ∈ V observes the source (X_i, Y_i) and a dummy untrusted user observes the auxiliary source Q. By [9, Theorem 2],³ the secrecy capacity for this specialized model is given by (3.15a), which is also strongly achievable. This can be used as a lower bound for the secrecy capacity of the general model. Since β̃(λ, P_{X_V}) is linear and continuous in λ over the compact set Λ_{A|D}, we can apply the Minimax-type Lemma D.1 to obtain the secrecy lower bound

    sup lim inf_{n→∞} (1/n) log|𝒦| ≥ min_{λ∈Λ_{A|D}} sup_{P_{X_V} = P_{X_D} × ∏_{i∈D^c} P_{X_i|X_D}} β̃(λ, P_{X_V}).

³ Alternatively, apply [14, Theorem 4.1] to the source model. See also [11] for different characterizations of the secrecy capacity.

Since β̃(λ, P_{X_V}) is continuous in P_{X_V} over the compact set P(𝒳_V), due to the finitely-valued model assumption, we can replace the sup by a max to obtain (3.15b) as desired. With the additional fact that the set of admissible P_{X_V} is connected, the range

    {β̃(λ, P_{X_V}) : P_{X_V} ∈ P(𝒳_V)}

is also connected, and so, by the Support Lemma E.1, it is admissible to bound the cardinality of 𝒬 as in (E.1). It remains to show that the lower bound is the maximum key rate achievable by a public input adaptation scheme. We do so by showing that the secrecy upper bound (3.8) matches the lower bound under the additional constraint (3.14).


First, it does not lose optimality to reveal (X_D^n, Y_D^n) in public, since they are known to the wiretapper. Thus, we can assume F^{t−1} determines (U_D, Y_D^{t−1}), and redefine Q_t in (3.5) as Q_t := (F^{t−1}, U_0). By (3.14), the X_{it}'s are conditionally independent given Q_t. This allows us to impose the additional (conditional) independence condition (F.10) in the secrecy upper bound (3.8), which then matches the lower bound as desired by the equivalence relation (a) in Proposition F.4. ∎

Example 3.3 Consider the same model defined in Example 3.1 for two transmitting active users and one receiving untrusted user. Assume in addition that Y is finitely-valued. Then, the secrecy lower bound (3.15b) simplifies to

    max_{P_{X_1 X_2} = P_{X_1} P_{X_2}} I(X_1 ∧ X_2 | Y).

n.b. this is similar to the secrecy upper bound in Example 3.1, except that we require the inputs to be independent instead of subtracting I(X_1 ∧ X_2) from the conditional mutual information. For the binary multiple access channel Y = X_1 ⊕ X_2, the optimal input distribution is P_{X_1 X_2} = Bern_{1/2} Bern_{1/2}. The secrecy lower bound is 1 bit, which matches the upper bound. □
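The optimality of the uniform input distribution in Example 3.3 can be confirmed by brute force. The following sketch (Python) grids over independent Bernoulli inputs Bern(p_1), Bern(p_2) and maximizes I(X_1 ∧ X_2 | Y) for Y = X_1 ⊕ X_2.

```python
import numpy as np

def I_cond(p1, p2):
    """I(X1; X2 | Y) in bits for Y = X1 xor X2, X1 ~ Bern(p1), X2 ~ Bern(p2) independent."""
    joint = np.zeros((2, 2, 2))                     # indexed by (x1, x2, y)
    for x1 in (0, 1):
        for x2 in (0, 1):
            joint[x1, x2, x1 ^ x2] = (p1 if x1 else 1 - p1) * (p2 if x2 else 1 - p2)
    def H(a):
        a = a[a > 0]
        return -(a * np.log2(a)).sum()
    return (H(joint.sum(axis=1).ravel()) + H(joint.sum(axis=0).ravel())
            - H(joint.sum(axis=(0, 1))) - H(joint.ravel()))

grid = np.linspace(0.01, 0.99, 99)
best = max((I_cond(p1, p2), p1, p2) for p1 in grid for p2 in grid)
print(best)                                         # ~(1.0, 0.5, 0.5): uniform inputs are optimal
```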

The proof of Theorem 3.3 does not immediately extend to the case when some of the channel inputs and outputs can be infinitely-valued. This is because the secrecy capacity for the source model is achieved in [9] by first attaining omniscience, i.e. the active users recover all the source components asymptotically losslessly. This cannot be done for the continuous-valued components without an appropriate fidelity criterion [33]. The method-of-types arguments [5] for the strong achievability result also rely on the finite-alphabet constraint and do not apply directly to infinitely-valued random variables. Fortunately, it is not essential to attain omniscience for the purpose of generating a secret key. We will simply convert the infinitely-valued model to a finitely-valued model by the uniform quantization described in ‡B. We will first extend the secrecy capacity of the finitely-valued source model in [9] to the infinitely-valued source model. This will be used to compose the solution to the infinitely-valued channel model with the sample average constraint (2.3).

Theorem 3.4 (Infinitely-valued source model) For the source model where the users observe the private DMMS Y_V, the secrecy capacity is

    C_s = min_{λ∈Λ_{A|D}} β(λ, P_{Y_V})    (3.16)

and is strongly achievable, where β is defined in (F.3) and Y_V can be a mixture of discrete and continuous random variables as described in ‡A such that the entropy measure is well-defined with the constraint (A.2). □

PROOF The converse follows immediately from Theorem 3.1, because having the private DMMS Y_V is equivalent to having the private DMMC with P_{Y_V|X_V} = P_{Y_V}, which implies by (F.1) and (F.3) that

    α(λ, P_{X_V}) = β(λ, P_{Y_V}).

Substituting this into (3.8) gives (3.16) as desired. To show that (3.16) is achievable, consider the following secrecy scheme using the uniform quantization in ‡B.

- Each user i ∈ V quantizes his private component source Y_i by f_{∆,b} in (B.1) to Z_i, i.e. Z_i := f_{∆,b}(Y_i).
- User i ∈ V broadcasts the indicator χ{Z_i ≠ 0} in public. This is equivalent to having a dummy untrusted user 0 that observes the vector of indicators Z_0 := (χ{Z_i ≠ 0} : i ∈ V).

Effectively, we have a finitely-valued source model with a dummy untrusted user observing Z_0. Thus, the achievability scheme in [9] can be used. Making ∆ arbitrarily close to 0 and then b arbitrarily large, the achievable secret key rate is

    min_λ lim_{b→∞} lim_{∆→0} E_{Z_0}[β(λ, P_{Z_V|Z_0}(·|Z_0))]
    ≥ min_λ lim_{b→∞} lim_{∆→0} P_{Z_0}(1) E_{Z_0}[β(λ, P_{Z_V|Z_0}(·|Z_0 = 1))].

The inequality is because β is non-negative by (C.3a) of the Shearer-type lemma. n.b. P_{Z_0}(1) → 1 in the above limit by (A.2). (See the proof of Lemma B.1 for a detailed derivation.) It suffices now to show that the conditional expectation converges to β(λ, P_{Y_V}). Let ℓ(C) for C ⊆ V be the number of continuous components in Y_C. Since ℓ(·) is a (sub)modular function, the equality case (C.3b) of the Shearer-type lemma applies, i.e.

    0 = ∑_B λ_B ℓ(B^c) − ℓ(D) − (∑_B λ_B − 1) ℓ(V).

Together with the definition of β in (F.3), we have

    E_{Z_0}[β(λ, P_{Z_V|Z_0}(·|Z_0 = 1))]
      = ∑_B λ_B [H(Z_{B^c}|Z_0 = 1) + ℓ(B^c) log ∆]
      − [H(Z_D|Z_0 = 1) + ℓ(D) log ∆]
      − (∑_B λ_B − 1) [H(Z_V|Z_0 = 1) + ℓ(V) log ∆].

By Corollary B.1,

    lim_{b→∞} lim_{∆→0} [H(Z_C|Z_0 = 1) + ℓ(C) log ∆] = H(Y_C).

Applying this to the previous expression gives β(λ, P_{Y_V}). ∎
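The convergence in Corollary B.1 can be illustrated numerically for a single Gaussian component. The sketch below assumes a particular form of the quantizer f_{∆,b}, namely clipping to [−b, b] followed by uniform quantization with step ∆ (the precise definition (B.1) is in ‡B and is not reproduced here), and ignores the negligible clipped tail that the conditioning on Z_0 = 1 removes.

```python
import numpy as np
from math import erf, sqrt, log2

def Phi(x):                              # standard normal CDF
    return 0.5 * (1 + erf(x / sqrt(2)))

Delta, b = 0.01, 8.0                     # quantization step and clipping range (assumed f_{Delta,b})
edges = np.arange(-b, b + Delta, Delta)
probs = np.array([Phi(edges[j + 1]) - Phi(edges[j]) for j in range(len(edges) - 1)])
probs = probs[probs > 0]

H_Z = -(probs * np.log2(probs)).sum()    # entropy of the quantized Y ~ N(0, 1), in bits
print(H_Z + log2(Delta))                 # ~0.5 * log2(2*pi*e) ~ 2.047: the differential entropy H(Y)
```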



We now incorporate the sample average constraint (2.3) into the input distribution. To do so, we consider a modified mixed source emulation approach.

Definition 3.4 The modified mixed source emulation approach for the sample average constraint is the same as the mixed source emulation approach in Definition 3.2 but with the following modifications.

Modifications to mixed source emulation:
i) P_{X_V|Q}(·|q) is chosen to satisfy the moment constraint (3.10) for all q ∈ 𝒬.


ii) Before the transmission phase, if user i ∈ V finds that the sample average constraint is violated for his input sequence X_i^n, he declares an outage.
iii) If any user declares an outage, the users skip the transmission phase entirely. In this case, the active users simply generate the individual keys K_i for i ∈ A independently and uniformly at random. □

Theorem 3.5 (Sample average constraint) With the sample average constraint (2.3) and the infinitely-valued model, we have the following secrecy lower bound,

    C_s ≥ sup_{P_{QX_V} = P_{QX_D} × ∏_{i∈D^c} P_{X_i|X_D Q}} min_{λ∈Λ_{A|D}} E_Q[β̃(λ, P_{X_V|Q}(·|Q))]    (3.17a)
       = min_{λ∈Λ_{A|D}} sup_{P_{X_V} = P_{X_D} × ∏_{i∈D^c} P_{X_i|X_D}} β̃(λ, P_{X_V})    (mixed)    (3.17b)
       ≥ sup_{P_{X_V} = P_{X_D} × ∏_{i∈D^c} P_{X_i|X_D}} min_{λ∈Λ_{A|D}} β̃(λ, P_{X_V})    (pure)    (3.17c)

where the input distribution is also subject to the moment constraint (3.10). (3.17b) is the largest achievable key rate for the modified mixed source emulation approach in Definition 3.4. It is admissible to have the alphabet set 𝒬 of the auxiliary source component satisfy the cardinality bound in (E.1b). (3.17c) is the largest achievable key rate for the corresponding pure source emulation where Q is chosen to be deterministic. □

PROOF Consider the modifications i, ii and iii in Definition 3.4. The idea is that imposing the moment constraint in i ensures that the sample average constraint is satisfied with high probability, so that the outage event in ii almost surely does not occur. For the purpose of the proof, the users do not take further action in case of an outage, and simply generate individual random keys in iii that are independent of everything else, including the wiretapper's knowledge.⁴ In other words, the modifications do not affect the secrecy condition (2.7), in the sense that if any scheme satisfies the secrecy condition without modifications ii and iii, it must also satisfy the condition with the modifications.

Next, we will show that the recoverability condition is unaffected, so that we can ignore modifications ii and iii for the purpose of computing the largest achievable key rate. Let E_r and Ẽ_r be the events that the active users fail to agree on the secret key with and, respectively, without the modifications. Let E_p be the outage event in ii. Then,

    Pr(Ẽ_r) ≥ Pr(Ẽ_r|E_p^c) Pr(E_p^c) = Pr(E_r|E_p^c) Pr(E_p^c)    (3.18)

because Pr(Ẽ_r|E_p^c) = Pr(E_r|E_p^c), due to the fact that the modifications are ineffective if there is no outage. Furthermore,

    Pr(E_r) ≤ Pr(E_p) + Pr(E_r ∩ E_p^c) ≤ Pr(E_p) + Pr(Ẽ_r)

by (3.18). If Pr(Ẽ_r) → 0 exponentially for any scheme without the modifications, then Pr(E_r) → 0 exponentially, since Pr(E_p) decays to zero exponentially by applying the Chernoff bound [23] to the iid input sequence. Thus, the recoverability condition is unaffected as desired. We can therefore ignore modifications ii and iii for the purpose of computing the largest achievable key rate. Applying Theorem 3.4, the maximum key rate strongly achievable is (3.17b) as desired, by maximizing over the input distribution satisfying the corresponding moment constraint. The admissibility condition on |𝒬| follows from the Support Lemma E.1. ∎

⁴ In practice, the users can regenerate the input repeatedly until the sample average constraint is satisfied.

Example 3.4 Consider the same model defined in Example 3.2 for two transmitting active users and one receiving untrusted user. Then, by (3.17b), the secrecy lower bound is

    sup_{P_{X_1 X_2} = P_{X_1} P_{X_2}} I(X_1 ∧ X_2 | Y)

where the input may be subject to a moment constraint that corresponds to the sample average constraint. Consider, in particular, the Gaussian multiple access channel Y = X_1 + X_2 + N, with channel noise N ∼ N_{0,1} and average power constraints ϕ_i(x_i) = x_i² − P_i for i = 1, 2. We can set the channel input distribution to be Gaussian, P_{X_1 X_2} = N_{0,P_1} N_{0,P_2}, which satisfies the power constraints. This gives the following secrecy lower bound,

    I(X_1 ∧ X_2 | Y) = H(Y|X_1) + H(Y|X_2) − H(Y) − H(Y|X_1 X_2)
      = H(X_2 + N) + H(X_1 + N) − H(Y) − H(N)
      = ½ log(1 + P_1 P_2 / (P_1 + P_2 + 1)).

As P_2/P_1 or P_1/P_2 → ∞, this approaches the secrecy upper bound ½ log(1 + min{P_1, P_2}) derived in Example 3.2. □
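The comparison at the end of Example 3.4 can be checked numerically. The sketch below evaluates the Gaussian-input lower bound via differential entropies and compares it with the upper bound of Example 3.2 (both in bits, hence the factor 1/2).

```python
import numpy as np

h = lambda v: 0.5 * np.log2(2 * np.pi * np.e * v)      # differential entropy of N(0, v), in bits

def lower_bound(P1, P2):                               # I(X1; X2 | Y) at independent Gaussian inputs
    return h(P2 + 1) + h(P1 + 1) - h(P1 + P2 + 1) - h(1)

def upper_bound(P1, P2):                               # from Example 3.2
    return 0.5 * np.log2(1 + min(P1, P2))

for P1, P2 in [(1, 1), (1, 100), (1, 10_000)]:
    print(P1, P2, lower_bound(P1, P2), upper_bound(P1, P2))
# The lower bound approaches the upper bound as P2/P1 grows, as claimed in the example.
```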

Problem: For the Gaussian multiple access channel considered in Example 3.4, is the Gaussian input distribution globally optimal for the maximization in (3.17b)? It can be shown that Gaussian is a local maximum. To see this, note that X_1 independent of X_2 implies the following after some algebra:

    I(X_1 ∧ X_2 | Y) = H(X_1 | X_1 + X_2 + N) − H(N | X_1 + N)
      = H(X_2 | X_2 + X_1 + N) − H(N | X_2 + N).

From the first equality, it is optimal to have P_{X_2} = N_{0,P_2} given P_{X_1} = N_{0,P_1}. A similar statement follows from the second equality, which gives the desired local maximality.

IV. MIXED VS PURE SOURCE EMULATIONS

We will give an example for which the secrecy lower bound obtained from mixed source emulation in Definition 3.2 is

1) strictly larger than that from pure source emulation, and
2) strictly smaller than the secrecy upper bound.

The second point implies the following possibilities which will be investigated further in the subsequent sections:

- The secrecy lower bound may be loose, in which case one can somehow improve the key rate by private input adaptation (adapting the channel input to the accumulated observations over time) or by interactive public discussion (adapting the public messages to the previous public messages).
- The secrecy upper bound may be loose and can perhaps be improved by information-theoretic tools.

Consider three users consisting of two active users and one trusted helper, i.e. A = [2] ⊆ D^c = V = [3], with the DMMC defined below.

Coupling channel:

    user    input           output
    1       X_1 ∈ {0, 1}    Y_1 ∈ {0, 1}
    2       X_2 ∈ {0, 1}    Y_2 ∈ {0, 1}
    3       —               Y_3 ∈ {0, 1}

The output bits are defined as follows,

    Y_3 := N_3                                          (4.1a)
    Y_1 := N_3 if X_1 = X_2 = 0, and N_1 otherwise      (4.1b)
    Y_2 := N_3 if X_1 = X_2 = 1, and N_2 otherwise      (4.1c)

where N_1, N_2, N_3 are uniform random bits mutually independent of each other and of the channel input bits X_1, X_2. As illustrated in Fig. 4, the active users jointly control the coupling of the observations: Y_1 couples with Y_3 if the input bits are both 0; Y_2 couples with Y_3 if the input bits are both 1; the output bits are all independent if the input bits disagree. It is therefore beneficial for the active users to coordinate their inputs to enhance their correlation.

[Fig. 4. Coupling channel: each user T_1, T_2, T_3 observes one of the independent random bits N_1, N_2, N_3 depending on the channel inputs. (a) X_1 = X_2 = 0: Y_1 = Y_3 = N_3 and Y_2 = N_2. (b) X_1 = X_2 = 1: Y_2 = Y_3 = N_3 and Y_1 = N_1. (c) X_1 ≠ X_2: Y_1 = N_1, Y_2 = N_2 and Y_3 = N_3.]

Table I summarizes the achievable key rates for the different secrecy schemes described below. The detailed computations can be found in ‡G in the appendices.

    TABLE I. SECRET KEY RATES FOR THE COUPLING CHANNEL
    scheme                                       key rate (bits)
    optimal pure source emulation (p.s.e.)       R_pse ≈ 0.41
    variant of source emulation                  R_var = 0.5
    optimal mixed source emulation (m.s.e.)      R_mse ≈ 0.54
    secrecy upper bound                          C_su ≈ 0.60

Optimal pure source emulation: The active users generate the channel input sequence iid according to

    P_{X_1 X_2}(x_1, x_2) = Bern_p(x_1) Bern_{1−p}(x_2), with p ≈ 0.44,

i.e. X_1 and X_2 are independent Bernoulli random variables.

Optimal mixed source emulation: The auxiliary source and channel input sequences are generated iid according to

    P_Q = Bern_{1/2}
    P_{X_1 X_2|Q}(x_1, x_2 | q) = Bern_0(x_1) Bern_{2/17}(x_2) if q = 0, and Bern_{15/17}(x_1) Bern_1(x_2) if q = 1,

i.e. X_1 and X_2 are conditionally independent Bernoulli random variables given the uniformly random bit Q.

Unfortunately, explicit choices of the public discussion and key functions that attain the corresponding secrecy lower bounds in Theorem 3.3 are not known even for this particular example; only random ensembles are used in the proof of achievability. To help us understand more concretely why the optimal mixed source emulation approach outperforms the pure source emulation approach, we consider the following variant scheme, for which the public discussion and key functions are completely specified.

Variant scheme: The active users set their channel inputs equal to the parity of the time t, i.e.

    X_{1,t} = X_{2,t} = 0 if t is odd, and 1 if t is even.

This can be considered a trivial public input adaptation approach as defined in Definition 3.3, since the users adapt the input only to trivial public information, namely t. During the public discussion, user 3 broadcasts the XOR bits

    F = (Y_{3,t} ⊕ Y_{3,t+1} : t odd).

The key is chosen to be

    K = (Y_{3,t} : t odd),

which is uniformly distributed and independent of F as desired. Furthermore, K is observed by user 1 through Y_{1,t} for odd t, and is perfectly recoverable by user 2 using the bitwise XOR

    F ⊕ (Y_{2,t+1} : t odd) = K
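The variant scheme can be simulated directly from the channel description (4.1). The following sketch verifies that user 1 observes the key and that user 2 recovers it exactly from the public XOR bits; the block length n is an arbitrary even example value.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000                               # even block length (example value)

def coupling_channel(x1, x2):
    """One use of the coupling channel (4.1): returns (Y1, Y2, Y3)."""
    n1, n2, n3 = rng.integers(0, 2, size=3)
    y1 = n3 if (x1 == 0 and x2 == 0) else n1
    y2 = n3 if (x1 == 1 and x2 == 1) else n2
    return y1, y2, n3

# Variant scheme: both inputs follow the parity of t (0 at odd t, 1 at even t).
Y1, Y2, Y3 = (np.empty(n, int) for _ in range(3))
for t in range(1, n + 1):
    x = 0 if t % 2 == 1 else 1
    Y1[t - 1], Y2[t - 1], Y3[t - 1] = coupling_channel(x, x)

odd, even = np.arange(0, n, 2), np.arange(1, n, 2)   # 0-based indices of odd/even t
F = Y3[odd] ^ Y3[even]                   # public XOR bits broadcast by user 3
K = Y3[odd]                              # the key: n/2 bits, i.e. rate 0.5 bits per use

assert np.array_equal(K, Y1[odd])        # user 1 sees K directly, since Y1 = Y3 at odd t
assert np.array_equal(K, F ^ Y2[even])   # user 2 recovers K, since Y2 = Y3 at even t
print("key agreed; K is uniform and independent of F by construction")
```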


because Y_{2,t+1} = Y_{3,t+1} for odd t. The key rate achieved is therefore 0.5 bits, as shown in Table I.

We now relate this variant scheme to each source emulation approach, to show that mixed source emulation outperforms pure source emulation by the additional coordination through the auxiliary component source. Consider the 2-extended coupling channel defined as

    P_{Ȳ_V|X̄_V}((y_V^odd, y_V^even) | (x_V^odd, x_V^even)) = P_{Y_V|X_V}(y_V^odd | x_V^odd) P_{Y_V|X_V}(y_V^even | x_V^even),

i.e. each channel use corresponds to two simultaneous uses of the original coupling channel P_{Y_V|X_V}. The variant scheme can then be considered as a pure source emulation scheme with n/2 uses of the 2-extended coupling channel and the trivial iid input X̄_i = (0, 1) for i ∈ V. The improvement over the original pure source emulation approach comes from the additional coordination through the 2-block memory.

To show that the same coordination can come from the auxiliary component source instead, consider the following mixed source emulation: X_1 = X_2 = Q ∼ Bern_{1/2}. By large deviation theory, the fraction of time where the input bits are 0 is arbitrarily close to 1/2 with probability exponentially converging to 1. The same holds for the case with input bits equal to 1. Let Y^{(q)} for q ∈ {0, 1} be vectors of n(1/2 − δ_n) output bits Y_{3,t} at disjoint times t, with as many of them satisfying X_{1,t} = X_{2,t} = q as possible, and δ_n → 0 at a sufficiently slow rate, say 1/√n. User 3 reveals the following element-wise XOR bits in public,

    F = Y^{(0)} ⊕ Y^{(1)}.

The key is chosen to be K = Y^{(0)}, which is independent of F and Q^n. By large deviation theory, Y^{(q)} almost surely consists only of bits at times t where X_{1,t} = X_{2,t} = q. Thus, Y^{(0)} is almost surely observed by user 1 through Y_{1,t} at times t where X_{1,t} = X_{2,t} = 0, and recoverable by user 2 from the public message and his private observations Y_{2,t} at times t where X_{1,t} = X_{2,t} = 1. This mixed source emulation is therefore almost surely the same as the variant scheme under a reordering of the time index. It achieves the same coordination that improves on the pure source emulation approach, but with the auxiliary component source instead of the 2-block memory.

Problem: Is the maximum key rate achievable by pure source emulation with block memory the same as that achievable by mixed source emulation?

V. NECESSARY CONDITION FOR TIGHTNESS

We first introduce a weaker notion of optimality, called optimality of the single-letter form, without which the secrecy bounds cannot be tight. Roughly speaking, we say that a single-letter bound is single-letter optimal if multi-letterizing it does not improve the bound.

Multi-letterization by channel extension: Given a function f of the DMMC P_{Y_V|X_V} and a positive integer k ∈ ℙ, the k-letter form of f is defined as

    f^{(k)}(P_{Y_V|X_V}) := (1/k) f(P^k_{Y_V|X_V})    (5.1)

where P^k_{Y_V|X_V} is the k-extension of P_{Y_V|X_V} defined as

    P^k_{Y_V|X_V}(y_V^k | x_V^k) := ∏_{τ∈[k]} P_{Y_V|X_V}(y_{Vτ} | x_{Vτ})    (5.2)

with any additional constraints on the input X_V, such as the moment constraint (3.10), translated directly to constraints on X_{V[k]}.

Definition 5.1 (Single-letter optimality) A function f of the DMMC P_{Y_V|X_V} is called single-letter maximal or a single-letter optimal lower bound (resp. single-letter minimal or a single-letter optimal upper bound) if f is no smaller (resp. no larger) than its k-letter form f^{(k)} defined in (5.1) for all k ∈ ℙ. □

Any of the secrecy expressions α, β̃, α_i and γ defined in ‡F maximized over any set of input distributions is single-letter minimal, because the k-letter form is equal to the single-letter form when we impose an additional memorylessness constraint on the k-letter input distribution P_{X_{V[k]}}, namely

    P_{X_{V[k]}} = ∏_{τ∈[k]} P_{X_{Vτ}}.    (5.3)
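The k-extension (5.2) is just a tensor product of transition matrices. The following sketch builds it for a single-input, single-output toy channel (a binary symmetric channel chosen for illustration); for the multiterminal channel, the same construction applies with the rows indexed by the tuples x_V.

```python
import numpy as np

def k_extension(W, k):
    """k-extension (5.2) of a transition matrix W[x, y] = P(y|x): the memoryless
    product channel, with rows/columns indexed by k-tuples in lexicographic order."""
    Wk = np.ones((1, 1))
    for _ in range(k):
        Wk = np.kron(Wk, W)
    return Wk

W = np.array([[0.9, 0.1],                # a binary symmetric channel as a toy P_{Y|X}
              [0.1, 0.9]])
W3 = k_extension(W, 3)
print(W3.shape, W3.sum(axis=1))          # (8, 8) and all-ones rows: a valid 3-letter channel
```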

Thus, the single-letter secrecy upper bounds characterized by α and α_i are no larger than their multi-letter forms, i.e. the secrecy upper bounds are single-letter optimal, and are therefore potentially tight.

Theorem 5.1 (Single-letter minimality) The secrecy upper bounds in Theorems 3.1 and 3.2 are single-letter optimal. □

PROOF With the additional memorylessness constraint (5.3) on the k-letter input distribution, we have for all λ ∈ Λ_{A|D} and i ∈ D^c that

    (1/k) sup_{P_{X_{V[k]}}} α_i(λ, P_{X_{V[k]}})|_{P^k_{Y_V|X_V}} =(a) sup_{P_{X_{V[k]}}} (1/k) ∑_{τ∈[k]} α_i(λ, P_{X_{Vτ}})
      =(b) (1/k) ∑_{τ∈[k]} sup_{P_{X_{Vτ}}} α_i(λ, P_{X_{Vτ}})
      = sup_{P_{X_V}} α_i(λ, P_{X_V})

where (a) follows from the definitions (F.4) and (5.2), and (b) follows from the fact that the moment constraint (3.10) is imposed on each marginal input distribution P_{X_{Vτ}}. Taking the infimum on both sides over λ and i gives the desired equivalence of the k-letter form on the left and the single-letter form on the right for the secrecy upper bound in Theorem 3.2. Applying similar arguments for α instead of α_i, the secrecy upper bound in Theorem 3.1 is also single-letter optimal. ∎

Theorem 5.2 (Single-letter maximality) The secrecy lower bounds in Theorems 3.3 and 3.5 are single-letter optimal if the

13

DMMC PYV |XV satisfies the single-leakage condition (F.11) that PYD |XV = PYD |XD∪{s} for some s ∈ Dc . 2 P ROOF The input distribution PXV [k] for the k-letter form (5.1) of the secrecy lower bounds (3.15b) and (3.17b) satisfies the conditional independence condition (F.10) ∏ PXV [k] = PXD[k] PXi[k] |XD[k] . i∈D c

Since the k-extension of the DMMC also satisfies the singleleakage condition, we have, by Proposition F.4, that 1 ˜ 1 β(λ, PXV [k] ) k = γ(λ, PXV [k] ) P k k k PY |X YV |XV V V ∑ 1 (a) γ(λ, PXV τ ) = k τ ∈[k]

(b)

=

1 ∑ ˜ β(λ, PXV τ ) k τ ∈[k]

where (a) is by Proposition F.5, (b) is again by Proposition F.4. Maximizing over PXV [k] as a function of λ and minimizing over λ on both sides give the desired equivalence of the kletter form on the left and single-letter form on the right, as in the proof of Theorem 5.1.  Problem: Construct an example, if any, for which the secrecy lower bounds are not single-letter optimal.

We will show that pure source emulation achieves the secrecy capacity of 1 bit. First, λ = 1 (i.e. λ{1} = λ{2} = 1) is the only fractional partition in ΛA|D , and so we have equality for (3.15c). That means pure source emulation is optimal if mixed source emulation is. Next, we prove the sufficient condition (6.1) as follows. By definition (F.1b), =0

z }| { α(1, PX1 X2 ) = H(Y|X1 ) + H(Y|X2 ) − H(Y) − H(Y|X1 X2 ) = H(Y|X1 ) − I(X2 ∧ Y) ≤ H(Y) ≤ 1. The inequalities are achievable with equalities by independently and uniformly distributed inputs PX1 X2 = Bern 12 (x1 ) Bern 12 (x2 ). By Theorem 6.1, we have the desired optimality of source emulation. Furthermore, there is a practical way to attain the secrecy capacity non-asymptotically with n = 1: have user 3 reveal Y in pubic and choose X1 as the key. User 2 can perfectly recover X1 = Y ⊕ X2 , which is perfectly secret since it is independent of Y. 2 For the infinitely-valued model with sample average constraints, we have the following sufficient condition instead. Theorem 6.2 (Sample average) The secrecy lower bound (3.17b) matches the upper bound (3.11) if the channel PYV |XV satisfies ∃ s ∈ Dc , ∀ λ ∈ ΛA|D ,

Theorem 6.1 (Finitely-valued) For finitely-valued private channel, the secrecy lower bound (3.15b) in Theorem 3.3 matches the secrecy upper bound (3.8) in Theorem 3.1 if the channel PYV |XV satisfies ∀λ ∈ ΛA|D ,

max α(λ, PXV ) = PXV

max

α(λ, PXV ). (6.1)

P =PXD × ∏XV × i∈Dc PXi |XD

P =PXD × ∏XV × i∈Dc PXi |XD

α(λ, PXV ) =

max

P =PXD × ∏XV × i∈Dc PXi |XD

˜ PX ). β(λ, V

After minimizing over λ ∈ ΛA|D , the R.H.S. gives the secrecy lower bound (3.15b), which equals the secrecy upper bound (3.8) given by the L.H.S. of (6.1).  Example 6.1 (Binary MAC) Consider two active transmitting users and one untrusted receiving user who observes Y = X1 ⊕ X2 . i.e. A = [2] = Dc ( V = [3].

γ(λ, PXV ) (6.2b)

where the input distribution can be subject to the sample average constraint (2.3). 2 P ROOF Consider bounding the supremum in the secrecy upper bound (3.11) as follows, (a)

sup min αi (λ, PXV ) = sup γ(λ, PXV )

c PXV i∈D

PXV (b)

=

(c)

=

PXV =PXD

P ROOF With conditionally independent channel input, α = β˜ by Proposition F.4 and so

sup

(6.2a)

P =PXD × ∏XV × i∈Dc PXi |XD

PXV =PXD

i.e. α is maximized by conditionally independent channel inputs given the inputs of the untrusted users. 2

max

sup γ(λ, PXV ) = PXV

VI. S UFFICIENT C ONDITIONS FOR T IGHTNESS In this section, we will first derive some general sufficient conditions for tightness of the secrecy bounds. In §VI-A and §VI-B, we will give more explicit classes of channels that satisfy the conditions.

PYD |XV = PYD |XD∪{s}

sup ∏ i∈D c

sup ∏ i∈D c

γ(λ, PXV ) PXi |XD

˜ PX ) β(λ, V PXi |XD

where (a) Under the single-leakage condition (6.2a), mini∈Dc αi ≤ αs γ by Proposition F.4. The reverse inequality follows from (F.9b). (b) This is by condition (6.2b). (c) With conditionally independent input, γ = β˜ again by Proposition F.4. Minimizing over λ ∈ ΛA|D gives the secrecy upper bound on the L.H.S. and the lower bound on the R.H.S. as desired.  Example 6.2 (Gaussian channel) Consider the following two-user gaussian channel PYV |XV with A = [2] = Dc = V , Y1 = h11 X1 + h12 X2 + N1

(6.3a)

Y2 = h21 X1 + h22 X2 + N2

(6.3b)

14

where all variables are real-valued, and N1 and N2 are arbitrary zero-mean jointly gaussian noises normalized to have unit variance, i.e. [ ] n PN1 N2 (n1 , n2 ) = N0,[ 1 ρ ] (n) ∀ n = 1 ∈ R2 (6.4) n2 ρ 1

where C1 and C2 are the capacities of the component channels from user 1 to user 2 and user 2 to user 1 respectively after removing the interference. Hence, the users can directly transmit independent secret key bits at the capacities of the respective channels.5 2

where Nµ,Σ denotes the jointly gaussian distribution [23] with mean µ and covariance matrix Σ, 1 − 12 (x−µ)T Σ−1 (x−µ) Nµ,Σ (x) := . (6.5) 1 e n (2π) 2 |Σ| 2

Note that Theorem 6.2 also applies to the finitely-valued model as a special case with or without sample average constraints. The sufficient condition is not as general as that in Theorem 6.1 because (6.2) implies (6.1) but the converse is not true.6 The additional single-leakage condition (6.2a) essentially turns condition (6.1) to (6.2b) by the equivalence relation (b) of (F.12) in Proposition F.4. (6.2b) is easier to work with, however, because of the concavity of γ in the input distribution by Proposition F.2. For instance, we can use this to derive the following tightness condition for simultaneous independent channels.

In addition, the channel input sequences Xn1 and Xn2 for user 1 and user 2 are subject to the average power constraints 1 ∑ 2 1 ∑ 2 X1t ≤ P1 and X2t ≤ P2 n n t∈[n]

t∈[n]

which translate to the following power constraints on the secrecy bounds in Theorem 3.2 and Theorem 3.5. E(X21 ) ≤ P1

and

E(X22 ) ≤ P2 .

(6.6)

We will show that pure source emulation is optimal and compute the secrecy capacity. Since D = ∅, the condition (6.2a) is satisfied trivially. Evaluating (F.5b) with the only fractional partition λ = 1,

Theorem 6.3 (Simultaneous independent channels) Suppose the channel consists of a finite set L := [ℓ] of simultaneous independent channels in (F.13), i.e. ∏

PYV |XV =

PYjV |XjV .

j∈L

γ(1, PXV ) = H(Y1 |X1 ) + H(Y2 |X2 ) − H(Y1 Y2 |X1 X2 ). Then, the secrecy lower bound (3.17b) matches the upper bound (3.11) if the channel PYV |XV satisfies

The last entropy term is H(Y1 Y2 |X1 X2 ) = H(N1 N2 ) = log 2πe(1 − ρ2 ). The remaining terms can be bounded as follows,

∃ s ∈ Dc ,

H(Y1 |X1 ) = H(h12 X2 + N1 |X1 ) ≤ H(h12 X2 + N1 ) 1 ≤ log 2πe(h212 P2 + 1) 2 by the fact that gaussian distribution maximizes entropy for a given variance [23]. Similarly, 1 H(Y2 |X2 ) ≤ log 2πe(h221 P1 + 1). 2 All the inequalities are satisfied with equality by the gaussian input distribution PX1 X2 (x1 , x2 ) = N0,P1 (x1 )N0,P2 (x2 ) ∀ x1 , x2 ∈ R

1 (h212 P2 + 1)(h221 P1 + 1) ln . 2 (1 − ρ2 )2

This is also equal to (3.17c) since the optimal input distribution is independent of λ. Pure source emulation turns out to be optimal. It is possible to attain this secrecy capacity without public discussion if ρ = 0. To argue this, note that the secrecy capacity can be rewritten as a sum of two channel capacities as follows, C2

CGC

C1

z }| { z }| { 1 1 2 2 = ln(h12 P2 + 1) + ln(h21 P1 + 1) 2 2

(6.8a)

∀ λ ∈ ΛA|D , j ∈ L, sup γ(λ, PXjV ) = PXjV

sup

γ(λ, PXjV )

PX =PXjD × ∏ jV × i∈Dc PXji |XjD

(6.8b)

where the input distribution is subject to the sample average constraint (2.3). Furthermore, the secrecy capacity can be achieved by the modified mixed source emulation in Definition 3.4 with conditionally independent inputs for different channels given an auxiliary source, i.e.

(6.7)

which therefore maximizes γ. Hence, Theorem 6.2 implies that the secrecy capacity is given by (3.17b), CGC := γ(1, N0,P1 N0,P2 ) =

PYD |XV = PYD |XD∪{s}

PXV |Q =



PXjV |Q

(6.9)

j∈L

where Q is the auxiliary source.

2

P ROOF We first show that it is optimal to have the following independence constraint P XV =



PXjV

(6.10)

j∈L

5 This may not belong to the source emulation approach since the channel inputs may not be iid over time. 6 e.g. Example 6.1 does not satisfy the single-leakage condition (6.2a).

15

directly transmitting the secret key from user 1 to 2 and relaying it from 2 to 3.7 2

for the maximization in the secrecy upper bound (3.11). (a)

sup min αi (λ, PXV ) = sup γ(λ, PXV )

c PXV i∈D

P XV

(b) ∑



j∈L

sup γ(λ, PXj V ) P

YjV |XjV

PXjV

˜ = β(λ,P XV )| (d)

(c) ∑ = j∈L

sup

×

(e)

PXV =PXD

sup ∏ i∈D c

{

YjV |XjV

PX =PXjD × ∏ jV i∈D c PXji |XjD



PY jV |XjV

z }| γ(λ, PXj V ) P

˜ PX ) β(λ, V PXi |XD

(a) Under the single-leakage condition (6.2a), mini∈Dc αi ≤ αs γ by Proposition F.4. The reverse inequality follows from (F.9b). (b) by (F.14b) in Proposition F.5. Equality is achievable by independent inputs (6.10). (c) by the sufficient condition (6.8b). (d) by (c) of (F.12). (e) because independent inputs (6.10) achieves ∑ ˜ PX ) = ˜ PX ) β(λ, β(λ, V V PYjV |XjV

j∈L

˜ Relaxing this independence by the definition (F.2) of β. gives the upper bound. Finally, minimizing over λ ∈ ΛA|D gives the desired tightness because the L.H.S. of (a) and the R.H.S. of (e) become the secrecy upper and lower bounds respectively. Furthermore, the ˜  equality for (e) implies (6.10) is optimal in maximizing β. Example 6.3 (Noise-free network) Consider the following finitely-valued noise-free network for three users that are all active. i.e. A = V = [3]. user input output

1 X1 ∈ X1

2 X 2 ∈ X2 Y2 = X1

3 Y3 = X2

There is a noiseless channel from user 1 to user 2, and an independent one from user 2 to user 3. D = ∅ implies (6.8a). Since each component channel has only one sender, conditional independent input trivially maximizes γ for each channel, i.e. (6.8b). Hence, source emulation is optimal by Theorem 6.3. By (3.15b), the secrecy capacity is ∑ CN := min max λB H(YB c |XB c ) λ∈ΛA|D PXV =PX1 PX2

= min

λ∈ΛA|D



B∈HA|D

λB log|XB c |

B∈HA|D

= min{log|X1 |, log|X2 |}. Since the optimal input distribution, namely the uniform distribution, is independent of λ, pure source emulation is optimal. There is also a practical scheme to achieve the capacity by

In general, there is a super-additive gain in secrecy capacity for simultaneous independent channels, i.e. the secrecy capacity of the composite channel is no less than that of each component channel. Example 6.3 for instance has 0 secrecy capacity for each component channel (since at least one user is isolated from the others in each case), but the secrecy capacity for the composite channel is positive. Note that we can also combine very different channels together, such as adding the continuous-valued channel in Example 6.2 to the finitely-valued channel in Example 6.3. We cannot add the channel from Example 6.1 however since it does not satisfy the single-leakage condition. A trivial condition for tightness is when the secrecy upper bound is 0. This happens under the following sufficient condition. Theorem 6.4 (Zero secrecy capacity) The secrecy capacity is zero if the channel PYV |XV is such that there exists a bipartition {C1 , C2 } of Dc through A, i.e. C1 , C2 ̸⊇ A,

C1 ∩ C2 = ∅,

C1 ∪ C2 = Dc

such that PYV |XV = PYC1 ∪D |XC1 ∪D PYC2 |XC2 ∪D .

(6.11)

In other words, we have the Markov chain, YC1 ∪D ↔ XC1 ∪D ↔ XC2 ∪D ↔ YC2 , regardless of the choice of PXV . 2 P ROOF Given (6.11) is satisfied, consider some i′ ∈ C1 and λ′ ∈ ΛA|D with { 1 if B = C1 or C2 ′ λB = 0 otherwise. By (3.11), the secrecy capacity is upper bounded by Csu = min sup minc αi (λ, PXV ) λ∈ΛA|D PX i∈D V

≤ sup αi′ (λ′ , PXV ) PX

V [ = sup H(YC2 |XC2 ∪D YD )

PXV

] + H(YC1 ∪D |XC1 ∪D ) − H(YV |XV ) where the last equality is by (F.4c). Applying (6.11) to the first two entropy terms, H(YC2 |XC2 ∪D YD ) = H(YC2 |XV YC1 ∪D ) H(YC1 ∪D |XC1 ∪D ) = H(YC1 ∪D |XV ). The sum of the entropies above is H(YV |XV ), and so the secrecy upper bound is at most zero as desired.  In the special case when we have a source model instead, i.e. no channel input, the condition in Theorem 6.4 is also necessary. 7 This does not belong to the source emulation approach since it involves memory in the input for relaying. However, it is also optimal to first convert it to a source by the result in §VI-B, and then attain the secrecy capacity perfectly by the network coding approach [18].

16

Theorem 6.5 (ZSC for source model) For the (possibly infinitely-valued) source model, secrecy capacity is zero if and only if the source PXV is such that there exists a bipartition {C1 , C2 } of Dc through A with

Proposition 6.1 (Single trusted sender) If the channel has at most one trusted sender, i.e. for some s ∈ Dc ,

PXDc |XD = PXC1 |XD PXC2 |XD

then pure source emulation is optimal and the secrecy capacity is given by (3.17c) in Theorem 3.5. 2

(6.12)

or equivalently the Markov chain XC1 ↔ XD ↔ XC2 .

2

P ROOF Sufficiency follows from Theorem 6.4 by treating XV as the output of a channel that does not admit any input. Consider proving the converse. Let λ∗ be the optimal solution to the minimization in the secrecy capacity (3.16) Cs = min β(λ, PXV ). λ∈ΛA|D

We may assume that there exists a bipartition {C1 , C2 } of Dc ∑ ∗ through A such that λ∗C1 > 0 since B∈HA|D λB ≥ 1 by (C.2). If the secrecy capacity is zero, i.e. β(λ, PXV ) = 0, then we have by (F.3) that ∑ H(XDc |XD ) = λ∗B H(XB |XB c ). B

Note that the L.H.S. is no smaller than the R.H.S. in general by the weak form (C.3a) of the Shearer-type lemma. From the proof of the lemma, the above equality implies that, for all B ∈ HA|D with λ∗B > 0, ∑ ∑ H(Xi |X[i−1] XD ) = H(Xi |X[i−1]∩B XB c ). i∈B

i∈B

This holds in particular for C1 that ∑ ∑ H(Xi |X[i−1] XD ) = H(Xi |X[i−1]∩C1 XC2 ∪D ). i∈C1

i∈C1

Without loss of generality, we can re-index the users such that C1 = [c] ⊆ Dc for some positive integer c. Then, we have [i−1] ⊆ C1 for all i ∈ C1 , and so the above equality becomes H(XC1 |XD ) = H(XC1 |XC2 ∪D ) which gives (6.12) as desired.



Problem: Give an example, if any, for which pure source emulation is strictly suboptimal even though one of the tightness conditions described above is satisfied.

A. Interference-free Channels In the following, we show that the secrecy bounds are tight for channels that are interference-free, i.e. without mixing of channel inputs from different senders. We first consider DMMC’s that have at most one trusted sender, who can also receive a channel output as an immediate feedback. Furthermore, the channel can be infinitely-valued with sample average constraints on the input. We will show that pure source emulation achieves the secrecy capacity in this case, extending the result of [14] to a more general class of channels.

PYV |XV = PYV |XD∪{s}

(6.13)

P ROOF This follows immediately from Theorem 6.2 that (3.17b) is the secrecy capacity since the single-trusted-sender condition (6.13) trivially implies both the single-leakage condition (6.2a) and the optimality of conditionally independent input distribution (6.2b). To prove the stronger result that pure source emulation is optimal, i.e. the secrecy capacity is (3.17c), it suffices to consider the finitely-valued model because the achieving scheme for Theorem 3.5 first convert the infinitelyvalued model to finitely-valued model by quantization (B.1). By the minimax theorem [34], min

˜ PX max β(λ, ) = max D∪{s}

λ∈ΛA|D PXD∪{s}

˜ PX min β(λ, ) D∪{s}

PXD∪{s} λ∈ΛA|D

since β˜ (or equivalently γ by Proposition F.4) is concave in the input distribution PXD∪{s} by Proposition F.2 and linear in λ over convex compact sets.8 The L.H.S. is the secrecy capacity by Theorem 6.2 while the R.H.S. is the secrecy lower bound (3.15c) achievable by pure source emulation.  Roughly speaking, we can think of the channel equivalently as a broadcast-type channel with immediate feedback to the sender, and with a channel state, namely XD , publicly controllable by the untrusted terminals. Since coordination is trivial with just one sender, even pure source emulation achieves the secrecy capacity. This is an immediate generalization of [14]. By Theorem 6.3, we can extend the result further to a strictly larger class of channels that can be decomposed into simultaneous independent broadcast-type channels as follows. Proposition 6.2 (Single trusted sender per channel) If every simultaneous independent channel has one trusted sender (not necessarily the same one), and at most one channel has output observable by the untrusted terminals, i.e. ∏ PYV |XV = PY1V |XD∪{s1 } PYjDc |Xj D∪{sj } (6.14) j∈L\{1}

with sj ∈ Dc for all j ∈ L, then pure source emulation is optimal and achieves the secrecy capacity (3.17c). n.b. Proposition 6.1 is a special case when |L| = 1. 2 P ROOF This follows immediately from Theorem 6.3 with the same argument as in the proof of Proposition 6.1. Since each simultaneous independent channel has at most one trusted sender, conditionally independent input is trivially optimal for each channel, giving (6.8b). (6.8a) follows from the definition (6.14) that YD = Y1 D which depends on the input PXV only through Xs1 . To prove that pure source emulation is optimal, it suffices to consider the finitely-valued model 8 The set of valid input distributions remains convex under the sample average constraints. If there were more than one trusted sender, however, the set of conditionally independent input would not be convex.

17

as argued in the proof of Proposition 6.1. Denote the vector (PXj D∪{sj } : j ∈ L) by (PXj D∪{sj } )j∈L , and define ˜ PX f (λ, (PXj D∪{sj } )j∈L ) := β(λ, ) 1 D∪{s1 } PY |X 1V 1 D∪{s1 } ∑ ˜ PX + β(λ, ) j D∪{sj } PYjDc |Xj D∪{s

j∈L\{1}

j}

which is linear in λ over the convex compact set ΛA|D and concave in (PXj D∪{sj } : j ∈ L) over the convex compact set of vectors of valid input distributions for the independent channels. By the minimax theorem [34], min

max

λ∈ΛA|D PXj D∪{s

=

j}

:j∈L

max

PXj D∪{s

j

f (λ, (PXj D∪{sj } )j∈L ) min f (λ, (PXj D∪{sj } )j∈L ).

:j∈L λ∈ΛA|D }

I) Consider the case when |B| = 2. For definiteness, suppose B = {1, 2}. If s3 = 3, we have H(Y3 |X3 ) = H(Y3 , N3 |X3 ) = H(N3 |X3 ) = H(N3 )

Proposition 6.3 (Three correlated channels) Consider the three-user case where A ⊆ Dc ⊆ V = [3]. One of the users can be a helper or an untrusted user. i.e. we may have A = [2] and D = ∅ or {3}. Suppose the channel PYV |XV satisfies Yi = fi (Xsi , Ni )

∀i ∈ V

(6.15)

where, for all i ∈ V , si ∈ V , the channel noise Ni ’s are independent of Xi ’s and recoverable from the corresponding channel input and output in the sense that there exists functions gi ’s with Ni = gi (Xsi , Yi )

∀i ∈ V.

(6.16)

Then, mixed source emulation is optimal and the secrecy capacity is given by (3.17b). If |YD | > 1, then pure source emulation is optimal. If |YD | ≤ 1 and the channel noises are independent of each other, then pure source emulation is optimal with input distribution PX1 X2 X2 = PX∗1 PX∗2 PX∗3 where PX∗s maximizes H(Yi ) under the corresponding moment coni straints. 2 P ROOF The single-leakage condition (6.2a) is satisfied trivially if D = ∅ or |YD | ≤ 1. If not, consider for definiteness that D = {3}. By (6.15), Y3 = f3 (Xs3 , N3 ) which is independent of XV given Xs3 since N3 is independent of XV . We have the desired Markov chain for (6.2a) that Y3 ↔ Xs3 ↔ XV . To show (6.2b), consider the case D = ∅ first. By (F.5b), ∑ γ(λ, PXV ) = λB H(YB c |XB c ) − H(YV |XV ). B

By (6.15) and (6.16), H(YV |XV ) = H(NV |XV ) = H(NV ) independent of PXV . Thus, it suffices to show as follows that H(YB c |XB c ) is maximized by independent inputs for every choice of B ∈ HA|D .

by (6.15) by independence

which is trivially maximized by independent inputs. If s3 ∈ {1, 2} instead, then H(Y3 |X3 ) ≤ H(Y3 ) with equality if Xs3 is independent of X3 by the data processing theorem and the Markov chain Y3 ↔ Xs3 ↔ X3 from (6.15) that I(Y3 ∧ X3 ) ≤ I(Xs3 ∧ X3 ) = 0. Thus, we have sup H(Y3 |X3 ) ≤ sup H(Y3 )

The L.H.S. is the secrecy capacity by Theorem 6.3, while the R.H.S. is the secrecy lower bound (3.15c).  In some cases, it is also possible to further allow correlation among different component channels. An example is the following three-user case with correlated noises for the component channels, which is not covered by Proposition 6.2.

by (6.16)

PX1 X2 X3

PXs

3

since PY3 depends on PXV only through PXs3 . Thus, it is optimal to choose PX∗s that maximizes H(Y3 ). 3 II) Consider the case |B| = 1. For definiteness, suppose B = {1}. a) Suppose s2 = s3 = 1, which gives Y2 Y3 ↔ X1 ↔ X2 X3 . Then, by the same argument as before, sup H(Y2 Y3 |X2 X3 ) = sup H(Y2 Y3 ) PX1 X2 X3

PX1

and so it is optimal to have independent inputs. b) Suppose s2 , s3 ∈ {2, 3} instead. Again by (6.15) and (6.16), H(Y2 Y3 |X2 X3 ) = H(Y2 Y3 N2 N3 |X2 X3 ) = H(N2 N3 |X2 X3 ) = H(N2 N3 ) which is trivially maximized by independent inputs. c) The remaining case has exactly one of s2 and s3 equal to 1. For definiteness, suppose s2 ∈ {2, 3} and s3 = 1. H(Y2 Y3 |X2 X3 ) = H(Y2 |X2 X3 ) + H(Y3 |X2 X3 Y2 ) = H(N2 ) + H(Y3 |X2 X3 N2 ) (6.17) again by (6.15) and (6.16). The second entropy can be bounded as follows, H(Y3 |X2 X3 N2 ) ≤ H(Y3 |N2 )

(6.18)

with equality if X1 is independent of (X2 , X3 ). To argue this, note that we have the Markov chain Y3 N2 ↔ X1 ↔ X2 X3 since (Y3 , N2 ) is a function of (X1 , N2 , N3 ) by (6.15). By the data processing theorem, I(Y3 ∧ X2 X3 |N2 ) ≤ I(Y3 N2 ∧ X2 X3 ) ≤ I(X1 ∧ X2 X3 ) = 0. Hence, applying (6.18) to (6.17) and maximizing over the input distribution, we have sup H(Y2 Y3 |X2 X3 ) = H(N2 ) + sup H(Y3 |N2 ). PX1 X2 X3

PX1

18

If N3 is independent of N2 in addition, then H(Y3 |N2 ) = H(Y3 ) and so it is optimal to choose PX∗1 that maximizes H(Y3 ). Thus, by Theorem 6.2, the secrecy capacity is given by (3.17b) and so mixed source emulation is optimal. If the channel noises are independent, it is optimal to choose PX1 X2 X3 = PX∗1 PX∗2 PX∗3 , which is independent of the choice of λ. (3.17c) is therefore satisfied with equality and so pure source emulation is optimal. Consider proving (6.2b) for the remaining case when |YD | > 1. Assume for definiteness that A = [2] and D = {3}. Since the only fractional partition in ΛA|D is λ = 1, we need only consider the following by (F.5b). γ(1, PXV ) = H(Y1 |X1 X3 Y3 ) + H(Y2 |X2 X3 Y3 ) − H(Y1 Y2 |X1 X2 X3 Y3 ). By (6.15) and (6.16), H(Y1 Y2 |X1 X2 X3 Y3 ) = H(N1 N2 |X1 X2 X3 N3 ) = H(N1 N2 |N3 ) which is trivially maximized by independent inputs. It suffices to show that H(Y1 |X1 X3 Y3 ) is maximized by independent inputs as follows. The same conclusion will apply to H(Y2 |X2 X3 Y3 ) by symmetry. 1) Consider the case s1 = s3 = 2. This gives the Markov chain Y1 Y3 ↔ X2 ↔ X1 X3 . Thus,

noises Ni ’s are correlated and jointly gaussian (6.5), and the channel inputs Xi ’s are subject to the power constraints Pi ’s, it can be shown that it is optimal to have independent gaussian input, i.e. PX1 X2 X3 = N0,diag(P1 ,P2 ,P3 ) . Thus, pure source emulation is optimal in this case even though the noises are not necessarily independent. 2 B. Finite homomorphic network In this section, we show that the secrecy upper and lower bounds can be tight for another class of channels that are not necessarily interference-free, such as the binary MAC in Example 6.1. In particular, we consider a class of channels the outputs of which are group homomorphisms of the channel inputs and noises. It can be viewed as a generalization of finite linear networks in [35] for network coding, where the outputs are linear combinations of the inputs over some finite fields. We will show that pure source emulation is optimal with uniform input distribution and give an explicit expression for the secrecy capacity. For finite linear networks, in particular, the secrecy capacity can be attained perfectly by network coding techniques [18]. Extension is also possible to the more general partly directed networks in [12]. Definition 6.1 (Finite homomorphic network) The channel output YV depends on the input XV as follows,

H(Y1 |X1 X3 Y3 ) ≤ H(Y1 |Y3 ) with equality if X2 is independent of (X1 , X3 ). 2) If s1 , s3 ∈ {1, 3} instead, then H(Y1 |X1 X3 Y3 ) = H(N1 |X1 X3 N3 ) = H(N1 |N3 ) which is trivially maximized by independent inputs. 3) If s1 = 2 and s3 ∈ {1, 3} instead, then H(Y1 |X1 X3 Y3 ) = H(Y1 |X1 X3 N3 ) ≤ H(Y1 |N3 ). We have equality by choosing X2 independent of (X1 , X3 ) because of the Markov chain Y1 N3 ↔ X2 ↔ X1 X3 . 4) If s3 = 2 and s1 ∈ {1, 3} instead, then

Example 6.4 Consider the channel defined as follows. Y2 = X1 + N2 Y3 = X2 + N3 Y1 = X3 + N1 where N1 , N2 and N3 are arbitrarily correlated noise independent of the channel inputs. The channel satisfies (6.15) with (s1 , s2 , s3 ) = (3, 1, 2), and (6.16) with gi (x, y) = y − x for all i ∈ [3]. Thus, by Proposition 6.3, mixed source emulation is optimal. If the channel noises are independent, pure source emulation is optimal. In the special case when the channel

(6.19)

with the following assumptions. n.b. in the special case when D = ∅ and N0 deterministic, Assumption 2, 4 and 5 below are automatically satisfied. Those assumptions are technicalities for the case when D ̸= ∅. 1) The channel input Xi and output Yi for user i ∈ V take values from the finite abelian group (Xi , +) and (Yi , +) respectively. The individual channel noise Ni for user i ∈ V takes values from Yi . 2) The common channel noise N0 is uniformly distributed over the finite abelian group (N0 , +), i.e. PN0 (n0 ) =

H(Y1 |X1 X3 Y3 ) = H(N1 |X1 X3 Y3 ) ≤ H(N1 |Y3 ). We have equality by choosing X2 independent of (X1 , X3 ) because of the Markov chain Y3 N1 ↔ X2 ↔ X1 X3 . Hence, mixed source emulation is optimal by Theorem 6.2. Indeed, pure source emulation is optimal since ΛA|D is a singleton and so (3.17b) equals (3.17c). 

∀i ∈ V

Yi = Mi (XV , N0 ) + Ni

1 |N0 |

∀n0 ∈ N0 .

(6.20)

3) Mi is a homomorphism for all i ∈ V . i.e. for all xV , x′V ∈ XV and n0 , n′0 ∈ N0 , Mi (xV + x′V , n0 + n′0 ) = Mi (xV , n0 ) + Mi (x′V , n′0 ). (6.21) 4) ND is determined by YD , i.e. for all nD , n′D ∈ YD such that nD ̸= n′D and PN (nD ), PN (n′D ) > 0, Mi (XV ,N0 ):=

n′D

− nD

z }| { ∈ / {Mi (xV , n0 ) : xV ∈ XV , n0 ∈ N0 } . (6.22)

In other words, the support of ND has at most one element from each coset of the subgroup MD (XV , N0 ) of YD . 5) There exists a special user s ∈ Dc such that PN0 ,NV |XV = PN0 · PNDc · PND |Ns .

(6.23)

19

Furthermore, uniform PXV implies I(YD ∧ XDc \{s} |XD ) = 0

(6.24)

Roughly speaking, uniformly distributed input for user s completely jams the channel from the trusted users to the untrusted users. 2 Theorem 6.6 For the finite homomorphic network defined above, pure source emulation with uniform PXV attains the secrecy capacity ∑ ] [ CFHN := min λB log MBBc (XB , 0) + H(SB ) λ∈ΛA|D

B∈HA|D

] c Dc (XD , 0) + H(SD ) − log MD ∑ − ( B λ − 1) H(MV (0, N0 ) + NV ) [

where (a) is because Y determines S, and (b) is because Y ∈ S. This gives (6.26) because |S| = |M (X)|. To show the equality case, suppose PX is uniform. For all n ∈ Y and y ∈ M (X) + n, (a)

PY|N (y|n) =

|Null(M )| (b) 1 = |X| |M (X)|

(6.28)

where the R.H.S. of (a) is the probability that X−x ∈ Null(M ) for some particular solution x to y = M (x) + n, and (b) is a well-known identity in linear algebra. Hence, for all y ∈ S, (a) ∑ PY|S (y|S) = PYN|S (y, n|S) n∈S (b)

=



PN|S (n|S)PY|N (y|n) =

n∈S

1 |M (X)|

where MiB (xB , xB c ) := Mi (xV , 0) and SB is a random variable which denotes the coset of MBBc (XB , 0) that contains MB c (0, N0 ) + NB c . 2

where the summation in (a) is over S since N ∈ S by (6.27), (b) is because N determines S, and (c) follows from (6.28). Thus, H(Y|S) = log|M (X)| as desired. 

The proof relies on the group structure that can be captured by the following simple finite homomorphic channel.

Finite homomorphic channels need not be symmetric in general. Nonetheless, the lemma says that the output entropy is maximized by uniform input distribution. The essence of the proof lies in the property that the effective channel PY|XS (·|·, S) conditioned on the maximum common function S of Y and N is strongly symmetric [23]. In particular, for all x ∈ X, y ∈ S,

Simple finite homomorphic channel: We say PY|X is a (single-input single-output) finite homomorphic channel if Y = M (X) + N (6.25) for some M and N such that · X and Y take values from the finite abelian group (X, +) and (Y, +) respectively, · N ∈ Y is independent of X, and · M is a homomorphism, i.e. M (x + x′ ) = M (x) + M (x′ )

∀x, x′ ∈ X.

We write · M (X) := {M (x) : x ∈ X} as a subgroup of Y, · M (X) + n := {M (x) + n : x ∈ X} for n ∈ Y as a coset of M (X) in Y , · P(M (X)) := {M (X) + n : n ∈ Y } as the partition of Y into cosets of M (X), and · Null(M ) := {x ∈ X : M (x) = 0} as the kernel of M .

PY|XS (y|x, S) = PY|XS (y + M (x′ − x)|x′ , S) = PY|XS (y ′ |x + x ¯(y ′ − y), S)

∀x′ ∈ X ∀y ′ ∈ S

where x ¯(y ′ − y) denotes a particular solution ∆ ∈ X to y ′ − y = M (∆). Thus, uniform input leads to uniform output, which maximizes H(Y|S = S), the only component of H(Y) that depends on PX . We can now prove Theorem 6.6 using this special structure of finite homomorphic channels. P ROOF (T HEOREM 6.6) By (F.1c), α(λ, PXV ) =



, 1

z }| { λB [H(YB c |XB c ) − H(YD |XD )]

B∋s

+



λB [H(YB c |XB c ) − H(YV |XV )] .

B̸∋s

Lemma 6.1 For the simple finite homomorphic channel, H(Y) ≤ log|M (X)| + H(S)

(6.26)

where S is the unique coset from P(M (X)) containing N, i.e. N ∈ S ∈ P(M (X)). Equality holds for (6.26) if PX is uniform.

(6.27) 2

P ROOF Y ∈ S if and only if N ∈ S since Y − N ∈ M (X). In other words, S is not only determined by N, but also by Y.9 Thus, (a) H(Y) = H(Y, S) = H(Y|S) + H(S) (b)

≤ log|S| + H(S)

9 Indeed,

S is the maximum common function defined in [36]. It is the common function of Y and N with maximum entropy.

Using the result from Lemma 6.1, it is straightforward to show that the secrecy capacity claimed in the theorem is obtained from minλ∈ΛA|D α(λ, PXV ) with uniform PXV . By Theorem 6.1, it suffices to show that uniform PXV maximizes α(λ, PXV ). We can ignore the last entropy term H(YV |XV ) since it is independent of PXV . More precisely, H(YV |XV ) = H(YV − MV (XV , 0)|XV ) = H(MV (0, N0 ) + NV ) by (6.19), (6.21) and (6.23). Given XB c = xB c ∈ XB c , we have from (6.19) that YB c = MBBc (XB , xB c ) + MB c (0, N0 ) + NB c = MBBc (XB , 0) + N

20

with N := MBBc (0, xB c )+MB c (0, N0 )+NB c . This is a simple homomorphic finite channel, and so by Lemma 6.1, H(YB c |XB c = xB c ) ≤ log|MBBc (XB , 0)| + H(SB ) with equality if PXB is uniform. Thus, uniform PXV achieves the maximum H(YB c |XB c ) =

log|MBBc (XB , 0)|

B

+ H(S )

The remaining term , 1 in α(λ, PXV ) can be bounded as , 1 = H(YB c \D |XB c YD ) − I(YD ∧ XB c \D |XD ) ≤ H(YC |XB c YD ) with C := B c \ D with equality if PXV is uniform by (6.24). It remains to show that uniform PXV maximizes H(YC |XB c YD ) where B ∈ HA|D : s ∈ B and C := B c \ D. Indeed, we need only consider the special case without N0 . More precisely, let { (Xi , N0 ) if i = s ˜ Xi = Xi otherwise. ˜ i ) + Ni for some homomorphism ˜ i (X Then, we have Yi = M ˜ Mi by (6.19) and (6.21). Furthermore, ˜ Dc \{s} |X ˜ D ) = I(YD ∧ XDc \{s} |XD ) I(YD ∧ X which equals 0 by (6.24) if PX˜ V is uniform (which happens if and only if PXV is uniform by (6.20)). Since s ∈ / Bc, ˜ B c , YD ) = H(YC |XB c , YD ). H(YC |X ˜ B c YD ), uniform PX also If uniform PX˜ V maximizes H(YC |X V c maximizes H(YC |XB , YD ). We can therefore focus on the case without N0 , i.e. Yi = Mi (XV ) + Ni

∀i ∈ V.

(6.29)

We now simplify the condition on YD as a condition on XB . Define · · · ·

n ¯ D (yD ) as the value of ND given YD = yD ∈ YD by (6.22), SXB (xB c , yD ) := {xB ∈ XB : PYD |XV (yD |xV ) > 0}, SXBc YD := {(xB c , yD ) ∈ XB c × YD : SXB (xB c , yD ) ̸= ∅}, x ¯B (xB c , yD ) as a particular solution xB ∈ XB to yD = MD (xV ) + n ¯ D (yD ) for (xB c , yD ) ∈ SXBc YD .

It follows from (6.29) that B SXB (xB c , yD ) = x ¯B (xB c , yD ) + Null(MD (·, 0)).

Furthermore, for (xB c , yD ) ∈ SXBc YD , H(YC |XB c = xB c , YD = yD ) (a)

= H(YC |XB c = xB c , Y = yD , ND = n ¯ D (yD )) (b)

= H(YC |XB c = xB c , XB ∈ SXB (xB c , yD ), ND = n ¯ D (yD ))

(c)

= H(YC |XB c = xB c , XB ∈ SXB (xB c , yD )). (a) YD = yD implies ND = n ¯ D (yD ). (b) Conditioned on (XB c , ND ) = (xB c , n ¯ D (yD )), we have YD = yD if and only if XB ∈ SXB (xB c , yD ).

(c) (YC , XV , NC ) is independent of ND by (6.23).10 It suffices now to show that uniform PXV maximizes H(YC |XB c = xB c , XB ∈ SXB (xB c , yD )) to a value independent of (xB c , yD ) ∈ SXBc YD . Conditioned on XB c = xB c and XB ∈ SXB (xB c , yD ), YC = MCB (XB , xB c ) + NC = MCB (X, 0) + N

where

X := XB − x ¯B (xB c , yD ) N := MCB (−¯ xB (xB c , yD ), xB c ) + NC . B XB ∈ SXB (xB c , yD ) implies X ∈ Null(MD (·, 0)). Viewing X and YC as an input and output to a simple finite homomorphic channel, uniform PXV attains the maximum by Lemma 6.1,

H(YC |XB c = xB c , XB ∈ SXB (xB c , yD )) B = log MCB (Null(MD (·, 0)), 0) + H(S) for some S where H(S) depends only on PNC but not (xB c , yD ) as desired.  VII. S OURCE EMULATION IS S UBOPTIMAL In this section, we study how source emulation approach can be strictly suboptimal. In particular, we will construct an example where one can achieve a key rate strictly larger than the secrecy lower bound in Theorem 3.3. Consensus channel: Consider two active users 1 and 2, i.e. A = [2] = Dc = V . The DMMC is a binary consensus channel defined as follows: user input output

1 X1 ∈ {0, 1} Y ∈ {0, 1}

2 X2 ∈ {0, 1} Y

The output bit Y is defined as { X1 if X1 = X2 Y := N otherwise

(7.1)

where N is a uniformly random bit independent of the channel input bits X1 and X2 . We can think of the channel as a way to reach consensus Y by a random coin flip if the users do not provide identical input bits. More practically, we can think of Y as a noisy observation that depends on the inputs if and only if they adds up coherently. By definition (C.1), we have HA|D = {{1}, {2}}. There is only one possible fractional partition λ in ΛA|D , namely the one with λ{1} = λ{2} = 1. All the minimization over fractional partitions in the secrecy bounds become trivial. Furthermore, the secrecy lower bounds by pure and mixed source emulations are the same. From the computations in ‡H in the appendices, the secrecy lower bound Rse in (H.5) by source emulation is strictly smaller than the secrecy upper bound Csu in (H.6). Rse ≈ 1.12 < Csu ≈ 1.17 10 More precisely, s ∈ / C and so (XV , NC ) is independent of ND . YC is a function of XV and NC .

21

Using this, we will prove that private input adaptation strictly outperforms the mixed source emulation (or any public input adaptation) for the DMMC below. Augmented consensus channel: Let PY1V |XV be the consensus channel defined in (7.1) and PY2V be a DMMS with [ ] [1 1] P (0, 0) PY21 Y22 (0, 1) PY2V := Y21 Y22 := 61 31 PY21 Y22 (1, 0) PY21 Y22 (1, 1) 3 6 where PY2V is the optimal input distribution that gives the secrecy upper bound (H.6) for the consensus channel. The augmented consensus channel PYV |XV is defined as PYV |XV = PY1V |XV PY2V which corresponds to the simultaneous and independent use of the consensus channel PY1V |XV and the DMMS PY2V . By Theorem 3.3, the secrecy lower bound by source emulation for the augmented consensus channel is ˜ se := min R

max

λ∈Λ[2]|∅ PXV =PX1 PX2

(a)

= =

max

γ(1, PXV )

max

γ(1, PXV )|PY

PXV =PX1 PX2

(b)

˜ PX ) β(λ, V

PXV =PX1 PX2

1V |XV

+ β(1, PY2V )

= Rse + β(1, PY2V ) ≈ 2.04 (a) The minimization is trivial because there is only one possible fractional partition, namely 1. Furthermore, β˜ = γ by Proposition F.4 since D = ∅ satisfies the single-leakage condition (F.11). (b) This is by the equality case of (F.14b) and the equalities ˜ 1) γ(1, 1)|PY = β(1, = β(1, PY2V ) PY2V

which follows from Proposition F.4 and the definition (F.3) of β. (c) This is by definition of Rse in ‡H-B. By Theorem 3.1, the secrecy upper bound is C˜su := min max α(1, PXV ) λ∈Λ[2]|∅ PXV

(a)

= max α1 (1, PXV )|PY PXV

1V |XV

+ β(1, PY2V )

(b)

= Csu + β(1, PY2V ) ≈ 2.09

(a) The minimization is trivial because there is only one possible fractional partition. (a) follows from the equivalence α = α1 by Proposition F.4 due to the single-leakage condition (∵ D = ∅), the equality case of (F.14a), and the equalities ˜ 1) α1 (1, 1)|PY = β(1, = β(1, PY2V ) 2V

Xit = Y2i(t−1)

∀i ∈ {1, 2}

Asymptotically, it is equivalent to a DMMS YV = Y[2]V with PY1V |Y2V = PY1V |XV . It is straightforward to show that C˜su is the key rate achievable by Theorem 3.4. Hence, the private input adaptation scheme is strictly better than the source emulation approach. The reason why source emulation is suboptimal here is because the users can cooperate privately in generating the channel inputs by observing previous channel outputs. The mixed source emulation approach, however, restricts the users to cooperate only through a public auxiliary source, ignoring the benefit of private input adaptation. For private networks where each user can either transmit or receive but not both, private input adaptation is not possible and so it remains open whether source emulation is optimal for that case. Problem: Is the secrecy upper bound (3.8) loose for the coupling channel defined in §IV?

(c)

2V

Since Rse < Csu , we have C˜se < C˜su as desired. It remains to show that C˜su is achievable by some other scheme. Consider setting the input Xit from user i ∈ V at time t ∈ [n] to the previous observation Y2i(t−1) from the DMMS, i.e.

PY2V

which follows from Proposition F.4 and the definition (F.3) of β. (b) This is by the definition of Csu in ‡H-B and α = α1 by Proposition F.4.

VIII. C ONCLUSION In this work, we studied how multiple users can agree on a common secret key by public discussion if they have access to a private discrete memoryless multiterminal channel. Unlike the previous broadcast channel model where only one user can transmit, we considered a more general multiterminal network model where every user may both transmit and receive. Continuous channel model is also studied under the usual average power constraint, with a slight modification on the achieving scheme and a different achievable key rate. For the achieving scheme, we considered the source emulation approach that turns the private channel into a distributed source by generating random channel inputs. Unlike the previous pure source emulation approach where the users generate independent inputs, we considered a mixed source emulation approach where the users generate correlated inputs based on a common public auxiliary random source. Similar to mixed strategies in zero-sum game, the auxiliary random source is introduced to mix between different channel input distributions so as to attain a larger payoff, i.e. a larger achievable key rate. We have shown more concretely by an example that mixed source emulation outperforms the pure source emulation due to the additional coordination through the auxiliary source. The achievable rate is derived as a single-letter lower bound to the secrecy capacity. Unlike the previous source and broadcast channel models, the capacity is unknown for the general multiterminal network. However, a single-letter upper bound was derived and shown to match the lower bound under various general conditions. This includes a large class of interference-free channels not covered by the previous broadcast channel model. We also consider a different class of finite homomorphic channels, which can be

22

viewed as a generalization of the usual network models in [12, 15, 35] for network coding. Due to the group structure of the channels, it is optimal to have each sender transmit uniformly independent input symbols and so cooperation turns out to be unnecessary in this case. Indeed, for all the networks where mixed source emulation is proved optimal, it is also not clear whether pure source emulation suffices. i.e. we do not have a concrete example for which mixed source emulation is optimal but pure source emulation is not. When the secrecy upper and lower bounds do not match, it is unclear whether it is the upper bound or the lower bound that is loose. However, we have given an example for which mixed source emulation can be strictly suboptimal, i.e. the lower bound can be loose. In that example, the channel statistics is designed carefully so that the secrecy upper bound cannot be attained by the mixed source emulation approach of adapting the channel inputs to a public auxiliary random source. Nonetheless, the bound can be attained by feeding parts of the previous channel outputs privately into the channel inputs. The reason is that the users can correlate their channel inputs optimally to enhance the correlation among the channel outputs by simply using part of the correlation in their previous channel outputs. For general channel statistics, however, a more general private input adaption scheme is desired to transform part of the correlation from the outputs optimally into the inputs to enhance the overall correlation. For the special private networks where each user can either transmit or receive but not both, it is not possible to adapt the channel inputs privately to the channel outputs. In particular, the example shows that mixed source emulation is suboptimal is not under this model. Although another example is available under this model with the secrecy bounds unmatched, we have not been able to rule out the other possibility that the secrecy upper bound is loose. For the infinitely-valued model under the sample average constraint, in particular, we believe that the secrecy upper bound may be overly weakened. A PPENDIX A I NFORMATION MEASURES In this section, we will give the basic definitions for entropy, mutual information and divergence, and point out certain caveats and technicalities needed when handling a mixture of discrete and continuous random variables. Reader may refer to [23] and [22] for more details. The randomness of a purely discrete random variable is measured by the entropy defined as [ ] ∑ 1 1 = PX (x) log H(X) := E log PX (X) PX (x) X∼PX x∈X

where PX is the probability mass function of X. For continuous random variable Y ∈ R with probability measure absolutely continuous with respect to the Lebesgue measure11 , differential entropy is used, i.e. [ ] ∫ 1 1 H(Y) := E log = PY (y) log dy PY (Y) P (y) Y R 11 Absolute continuity is the technical condition needed in the fundamental theorem of calculus for the Lebesgue integral [37].

where PY is the probability density function of Y. For a mixedpair Z := (X, Y) of discrete and continuous random variables X and respectively Y, the entropy is defined in [22] as [ ] 1 H(Z) := E log PX,Y (X, Y) ∑∫ 1 = PX,Y (x, y) log dy PX,Y (x, y) R x∈X

where PX,Y (x, y) is the probability density function of Pr(X = x, Y ≤ y) in y. For the multivariate case involving a mixture of multiple discrete and continuous random variables, the entropy measure can be defined similarly as follows. Entropy: Consider a random vector Z = (Zi : i ∈ [ℓ + 1]) where ℓ ∈ P is a positive integer, Zℓ+1 is a discrete random variable with possibly unbounded support Zℓ+1 and Zi ’s for i ∈ [ℓ] are real-valued continuous random variables with absolutely continuous probability measure. Let PZ (z[ℓ+1] ) = PZℓ+1 (zℓ+1 )PZ[ℓ] |Zℓ+1 (z[ℓ] |zℓ+1 ) for all z[ℓ+1] ∈ Rℓ × Zℓ+1 , where PZℓ+1 is the probability mass function for Zℓ+1 and PZ[ℓ] |Zℓ+1 (·|zℓ+1 ) is the conditional probability density function for Z[ℓ] given Zℓ+1 = zℓ+1 . The entropy is defined as, [ ] 1 H(Z) := E log (Entropy) PZ (Z) ∫ (A.1) ∑ = PZ (z[ℓ+1] ) log PZ (z1[ℓ+1] ) dz[ℓ] . zℓ+1 ∈Zℓ+1

Rℓ

The conditional entropy is defined in the usual way, [ ] H(Z|U) = E − log PZ|U (Z|U) (Conditional entropy) which is H(Z, U) − H(U). For the entropy to be well-defined, the following constraint is imposed [22], ∑ ∫ PZ (z[ℓ+1] ) log PZ (z[ℓ+1] ) dz[ℓ] < ∞. (A.2) zℓ+1 ∈Zℓ+1

Rℓ

We will require further that the above holds also for Riemann integral [37], which is a technicality needed for Lemma B.1. We consider only the general multiterminal network in §II for which the above entropy measure applies and is welldefined. In particular, we do not consider the more general situation where a random variable has a mixture probability distribution [38], e.g. Z is uniform over the continuous interval (−1, 1) with probability 12 and uniform over the discrete set {−1, 1} with probability 12 . If such a random variable arises in the system, either as a channel input or a channel output, it is possible to map it to a mixed-pair of discrete and continuous random variables such as the way described in [22]. e.g. we can map Z above to the mixed-pair (X, Y) as follows: X = −1 and Y = Z if Z ∈ (0, 1), and X = Z and Y = U if Z ∈ {0, 1}, where U is uniformly over [0, 1] and independent of Z. The above mapping can be performed implicitly by the user who control or observes Z and generate U as a private

23

randomization in §II.12 Thus, it is valid to replace Z by (X, Y) in the model. Doing so does not lose optimality either because Z is can be recovered from (X, Y) by construction. The usual properties of the classical entropy measures follow immediately from these definitions. For X, Y and Z that are mixtures of discrete and continuous random variables, we have the chain rule expansion, H(XY) = H(X) + H(Y|X) = H(Y) + H(X|Y)

(A.4)

Equality holds if and only if X and Y are conditionally independent given Z. For any discrete random variable X, we have the additional positivity property that, H(X|Y) ≥ 0

(A.5)

with equality if and only if Y, which may be continuousvalued, completely determines X. This entropy is also preserved under bijection. For general mixed-pair that has a continuous component, the entropy needs not be positive, just like the differential entropy, nor does it have to be preserved under bijection. In particular, it is not invariant under scaling of the random variable. The information divergence between two distributions PZ and PZ′ is defined as, ] [ PZ (Z) ′ D(PZ ∥PZ ) := E log ′ (Divergence) (A.6) PZ (Z) Z∼PZ where the expectation is taken over Z distributed as PZ . D(p∥q) satisfies the usual positivity and convexity in (p, q) by Jensen’s inequality [23]. ˜ ∥g) We will also consider the following generalization D(f of the divergence operation to non-negative but not necessarily stochastic functions f, g : Rℓ × Zℓ+1 7→ R+ on the mixture of discrete set Zℓ+1 and finite-dimensional Euclidean space Rℓ . ∑∫ f (z[ℓ+1] ) ˜ ∥g) := D(f (A.7) f (z[ℓ+1] ) log dz[ℓ] g(z[ℓ+1] ) Rℓ Zℓ+1

˜ ∥g) is also convex in (f, g) by the log-sum inequality [23] D(f but it may not be positive. A PPENDIX B Q UANTIZATION We use the following quantizer to convert an infinitelyvalued source model to a finitely-valued source model. Quantizer: For b > ∆ > 0 such that b \ ∆ ∈ P is a positive integer, define the quantization function f∆,b : R 7→ {0, . . . , 2b/∆} as { 0 if y ∈ / [−b, b) f∆,b (y) := (B.1a) j if y ∈ [−b + (j − 1)∆, −b + j∆). 12 The

.0

.1

.2

purpose of U is to ensure that Y is a continuous random variable.

.3

.4

.0

.z ∈ {0} ∪ [2b/∆]

.PZ (3) .y ∈ R

.

(A.3)

and the fact that successive conditioning reduces entropy, or equivalently, the positivity of mutual information, I(X ∧ Y|Z) := H(X|Z) − H(X|YZ) = H(Y|Z) − H(Y|XZ) ≥ 0.

.PY (y)

.−b

. 0

. ∆

. ∆

.b

Fig. 5. A continuous random variable Y is quantized to Z := f∆,b (Y). Boundaries are at {j∆ ∈ [−b, b] : j ∈ Z} ∪ {−b, b}.

The range of the quantizer is finite as desired, ∥f∆,b ∥ = 1 + 2b/∆ < ∞.

(B.1b)

Fig. 5 illustrates how the quantization turns a continuous random variable into a finitely-valued random variable. We can apply this quantizer to each output symbol of the infinitely-valued component of the source,13 leading to a finitely-valued component. For notational simplicity, we use f∆,b (YV ) to denote the output with all infinitely-valued components in YV quantized by f∆,b but leaving the finitelyvalued components intact. The following technical lemma is analogous to Theorem 9.3.1 in [23]. Lemma B.1 (Quantization) Let Y be a real-valued random variable with density function PY such that, ∫ ∞ (B.2) |PY (y) log PY (y)| dy < ∞ −∞

then we have for Z := f∆,b (Y) and Z0 := χ{Z ̸= 0} that lim lim [H(Z|Z0 = 1) + log ∆] = H(Y)

b→∞ ∆→0

(B.3)

where f∆,b is the quantizer defined in (B.1).

2

Corollary B.1 Given Y = (Yi : i ∈ [ℓ + 1]) is a mixture of a discrete random variable Yℓ+1 and a continuous random vector Y[ℓ] with ℓ continuous real-valued components, such that the joint densities PY[ℓ+1] (·, yℓ+1 ) for yℓ+1 ∈ Yℓ+1 are absolutely continuous and ∑ ∫ PY (y[ℓ] ) log PY (y[ℓ] ) dy[ℓ] < ∞. (B.4) [ℓ] [ℓ] yℓ+1 ∈Yℓ+1

Rℓ

We have for Z := f∆,b (Y) and Z0 := (χ{Zi ̸= 0} : i ∈ [ℓ]) lim lim [H(Z|Z0 = 1) + ℓ log ∆] = H(Y)

b→∞ ∆→∞

where f∆,b is a quantizer defined in (B.1).

(B.5) 2

P ROOF Consider proving Lemma B.1 first. We first relate the conditional distributions of Z and Y given Z0 = 1 using the mean-value theorem, and then prove the desired 13 For discrete component with unbounded support, we can assume the support set is the set Z of integers without loss of generality.

24

convergence (B.3) under the condition (B.2) for the entropy measure to be well-defined. From the definition (B.1), we have for all j ∈ [2b/∆] such that PY|Z0 (·|1) is continuous over the interval [−b + (j − 1)∆, −b + j∆], ∫ −b+j∆ PZ|Z0 (j|1) = PY|Z0 (y|1) dy = PY|Z0 (yj |1)∆ (B.6)

the distributions of Z and Y through the mean-value theorem, with an ℓ-fold integral and a factor of ∆ℓ instead of ∆ in (B.6). This gives the ℓ log ∆ terms in (B.5) 

for some tag yj ∈ [−b + (j − 1)∆, −b + j∆] by the meanvalue theorem. The desired convergence can be proved in two stages, ∑ ∆ H(Z|Z0 = 1) + log ∆ = PZ|Z0 (j|1) log PZ|Z0 (j|1) j∈[2b/∆] ∫ b 1 (a) ∆→0 −−−−−−→ PY|Z0 (y|1) log dy P Y|Z0 (y|1) −b

In this section, we introduce a set of inequalities from [39] and [14], collectively referred to as the Shearer-type lemma, which is built upon the following combinatorial structure.

−b+(j−1)∆

(b) b→∞

−−−−−−→H(Y). (a) This is by (B.2) that PY|Z0 (·|1) log PY|Z 1(·|1) is Riemann0 integrable over [−b, b]. More explicitly, ∫ b 1 PY|Z0 (y|1) log dy P Y|Z0 (y|1) −b ∑ 1 = lim ∆PY|Z0 (yj |1) log ∆→0 PY|Z0 (yj |1) j∈[2b/∆]



= lim

∆→0

j∈[2b/∆]

PZ|Z0 (j|0) log

∆ PZ|Z0 (j|1)

where the last equality is by (B.6). (b) Since (B.2) implies ∫ ∞ lim [PY (y) + PY (−y)] dy = 0 b→∞

b

we have limb→∞ Pr(Z0 = 0) = 0. Since limx↓0 x ln x = 0, we also have PZ0 (0) log PZ0 (0) → 0 and H(Z0 ) → 0 as b → ∞. By the chain rule, H(Y) − H(Y|Z0 ) = H(Z0 ) − H(Z0 |Y) b→∞ ≤ H(Z0 ) −− −−→ 0

Thus, R.H.S. of (a) can be expressed as 1 H(Y|Z0 = 1) ≈ [H(Y) − PZ0 (0)H(Y|Z0 = 0)] PZ0 (1) with equality in the limit as b → ∞. To show the desired convergence to H(Y), it suffices to show that the following term converges to 0. PZ0 (0)H(Y|Z0 = 0) ∫ PZ (0) PY (y) = PZ0 (0) log 0 dy PY (y) (−∞,b]∪[b,∞) PZ0 (0) ∫ PZ (0) = PY (y) log 0 dy PY (y) (−∞,b]∪[b,∞) ∫ 1 = PZ0 (0) log PZ0 (0) + PY (y) log dy PY (y) (−∞,b]∪[b,∞) which goes to zero as b → ∞ by (B.2) This complete the proof of Lemma B.1. Corollary B.1 is a straight-forward extension to the vector case. We again relate

A PPENDIX C S HEARER - TYPE LEMMA

Fractional partition: For finite sets A, D and V satisfying A ⊆ Dc ⊆ V : |A| ≥ 2 where Dc := V \ D denotes the complement, we define the following set system/hypergraph without multiple edges, HA|D := {B ( Dc : B ̸= ∅ and B ̸⊇ A}

(C.1)

Each element in HA|D is called a hyperedge, which is just a subset of the vertices in Dc . The corresponding set of all fractional (edge) partitions is defined as, { ΛA|D := λ = (λB : B ∈ HA|D ) : } ∑ c λB ≥ 0 and λB = 1 , ∀i ∈ D (C.2) B∋i

∑ ∑ where B∋i is a shorthand notation for B∈HA|D :i∈B . This is illustrated in Fig. 6. The name fractional partition comes from the constraint that every vertex in Dc is covered by its incident edges to a total weight of one. Any vertices in D, such as vertex 4 in Fig. 6, are not covered at all. We say that a fractional partition is basic if it is not a convex combination of other fractional partitions. For instance, the fractional partitions in Fig. 6(a), 6(b) and 6(c) are basic but the one in 6(d) is not. It is straightforward to show that ΛA|D is a convex set, and in particular, the convex hull of the basic fractional partitions. In the derivation of the secrecy upper bound, we use the following specialized versions of the Shearer-type lemma. Lemma C.1 (Shearer-type lemma for entropy functions) Consider any fractional partition λ ∈ ΛA|D defined in (C.2) over the hypergraph HA|D . For any random vector XV that is a mixture of discrete and continuous random variables, we have the weak form of the Shearer-type lower bound that ∑ H(XDc |XD ) ≥ λB H(XB |XB c ). (weak) (C.3a) B∈HA|D

With the conditional independence constraint that PXDc |XD = ∏ i∈D c PXi |XD , we have the equality that ∑ H(XDc |XD ) = λB H(XB |XB c ). (equality) (C.3b) B

Consider some discrete random variable F[r] satisfying the causal relation that, for all j ∈ [r], Fj is determined by

25

.4 ,

.1 , .2 , (1)

(2)

(a) λ{1} = λ{2,3} = 1

.

.4 ,

.1 , .2 ,

.3 ,

(1)

.

.4 ,

.

.1 , .2 ,

.3 , (2)

(3)

(3)

.1 , .2 ,

.3 , (3)

(c) λ{1} = λ{2} = λ{3} = 1

(b) λ{2} = λ{1,3} = 1

.

.4 ,

(d)

1 (λ(1) 2

.3 , + λ(2) )

Fig. 6. Fractional partitions of HA|D where A = [2], D = {4} and V = [4], i.e. HA|D = {{1}, {2}, {3}, {1, 3}, {2, 3}}. λ(k) for k ∈ [3] defined in (a), (b) and (c) respectively are all the basic fractional partitions. (d) is a non-basic fractional partition with weight 12 over the hyperedges in HA|D \ {3}.

(XD , Xij , F[j−1] ) for some ij ∈ V . Then, ∑ H(F[r] |XD ) ≥ λB H(F[r] |XB c ). (causal)

(C.3c) 2

B

P ROOF (C.3a) is obtained from [39]. More precisely, by the chain rule∑ (A.3) and the constraint (C.2) on fractional partitions that B∋i λB = 1 for all i ∈ Dc , ) ∑ (∑ H(XDc |XD ) = λ B∈HA|D B∋i B H(Xi |X[i−1] XD ) i∈D c



=

λB

B∈HA|D



(a)





H(Xi |X[i−1] XD )

i∈B

λB

B∈HA|D

=

∑ ∑

H(Xi |X[i−1]∩B XB c )

i∈B

λB H(XB |XB c )

B

where (a) is due to the fact that conditioning reduces entropy (A.4) since B c ) D. This proves the weak form (C.3a). The equality case (C.3b) follows from the fact that (a) is satisfied with equality if Xi ’s are conditionally independent given XD . Finally, (C.3c) can be derived as in [14]. ) ∑ (∑ H(F[r] |XD ) = B∈HA|D :B∋ij λB H(Fj |F[j−1] XD ) j∈[r]



=

B∈HA|D

(a) ∑



λB



λB

H(Fj |F[j−1] XD )

j∈[r]:ij ∈B



B

j∈[r]:ij ∈B

B

j∈[r]

H(Fj |F[j−1] XB c )

∑ (b) ∑ = λB H(Fj |F[j−1] XB c ) =



λB H(F[r] |XB c )

B

where (a) again follows from the fact that conditioning reduces entropy (A.4) since B c ) D, and (b) follows from (A.5) that H(Fj |F[j−1] XB c ) = 0 if ij ∈ B c since Fj is determined by (XD , Xij , F[j−1] ) and therefore (XB c , F[j−1] ). 

A PPENDIX D M INIMAX - TYPE LEMMA The secrecy upper and lower bounds in §III are both expressed in terms of some minimax optimization problems. The secrecy lower bounds §III-B, in particular, are derived using the following minimax-type lemma. Lemma D.1 (Minimax-type lemma) Consider any function ζ : Λ × S 7→ R such that, for all s ∈ S, ζ(λ, s) is convex and continuous in λ over a compact convex set Λ. Then, we have sup

min EQ [ζ(λ, σ(Q))] = min sup ζ(λ, s) (D.1)

k∈P,PQ ∈P([k]), λ∈Λ σ:[k]→S

λ∈Λ s∈S

where Q has distribution PQ over the finite set [k].

2

The secrecy lower bounds in (3.15b) and (3.17b) correspond to the special case where ζ is β̃ (F.2), Λ is Λ_{A|D} (C.2), and S is the set of valid input distributions P_{X_V}. In the language of a two-person zero-sum game, ζ(λ, s) is the payoff for Player 1 and the cost for Player 2 when they play s and λ respectively. The expectation on the L.H.S. of (D.1) is the guaranteed payoff when Player 1 plays a mixed strategy of choosing σ(q) ∈ S with probability P_Q(q), while Player 2 plays the best response λ ∈ Λ to Player 1's strategy. We will derive this lemma from the following generalization of von Neumann's minimax theorem in [40, Theorem 2].

Theorem D.1 (Minimax [40]) For any family F of functions Λ → ℝ,

  sup_{f∈F} min_{λ∈Λ} f(λ) = min_{λ∈Λ} sup_{f∈F} f(λ)   (D.2)

if the following conditions are satisfied.
Conditions for minimax theorem:
i) Λ is compact and every f ∈ F is lower semi-continuous.
ii) For all g₁, g₂ ∈ F, there exists g ∈ F with

  (1/2)[g₁(λ) + g₂(λ)] ≤ g(λ)  ∀λ ∈ Λ.   (D.3)


iii) For f ∈ F and u ∈ ℝ, define

  L(f, u) := {λ ∈ Λ : f(λ) ≤ u}.   (D.4)

Then, for all u ≤ min_{λ∈Λ} sup_{f∈F} f(λ), n ∈ P and f₁, …, f_n ∈ F, the set ∩_{i∈[n]} L(f_i, u) is connected.

PROOF (PROOF OF LEMMA D.1) For k ∈ P, P_Q ∈ P([k]) and σ : [k] → S, define the function f_{k,P_Q,σ} : Λ → ℝ as f_{k,P_Q,σ}(λ) = E_Q[ζ(λ, σ(Q))]. Set F as the collection of all functions f_{k,P_Q,σ} over different choices of (k, P_Q, σ). Then, the L.H.S. of (D.2) equals the L.H.S. of (D.1) by definition. The R.H.S. of (D.2) is

  min_{λ∈Λ} sup_{f∈F} f(λ) = min_{λ∈Λ} sup_{k,σ} max_{P_Q} E_Q[ζ(λ, σ(Q))]
   = min_{λ∈Λ} sup_{k,σ} max_{q∈[k]} ζ(λ, σ(q))

where, for the last equality, ≤ is because the expectation over Q cannot exceed the maximum over q ∈ [k], while ≥ is because the maximum can be attained by making Q deterministically equal to the maximizing q. The last expression simplifies to the R.H.S. of (D.1) because maximizing ζ(λ, σ(q)) over (σ(q) ∈ S, q ∈ [k]) is the same as maximizing ζ(λ, s) over s ∈ S. Thus, (D.1) is equivalent to (D.2), and so it suffices to show that the conditions for (D.2) in Theorem D.1 are all satisfied as follows.
i) E_Q[ζ(λ, σ(Q))] is convex and continuous in λ because it is a non-negatively weighted sum of the convex and continuous functions ζ(·, s) for a finite number of different s ∈ S. Thus, every f ∈ F is convex and continuous.
ii) Consider any choices (k_i, P_{Q_i}, σ_i) for i ∈ [2]. Set k = k₁ + k₂. For q ∈ [k₁], set P_Q(q) = (1/2)P_{Q₁}(q) and σ(q) = σ₁(q). For q ∈ {k₁+1, …, k}, set P_Q(q) = (1/2)P_{Q₂}(q−k₁) and σ(q) = σ₂(q−k₁). By construction, we have

  f_{k,P_Q,σ}(λ) = E_Q[ζ(λ, σ(Q))] = (1/2) ∑_{i∈[2]} E_{Q_i}[ζ(λ, σ_i(Q_i))]

which equals (1/2) ∑_{i∈[2]} f_{k_i,P_{Q_i},σ_i}(λ) as desired.
iii) As argued for Condition i, every f ∈ F is convex over the convex set Λ. This implies that L(f, u) is a convex set for all u ∈ ℝ. n.b. the intersection of any two convex sets is again convex. Thus, for any n ∈ P and f₁, …, f_n ∈ F, the set ∩_{i∈[n]} L(f_i, u) is convex and therefore connected. ∎

For completeness, the proof of Theorem D.1 is given below.

PROOF (PROOF OF THEOREM D.1) ≤ for (D.2) is trivial. To prove the reverse inequality, consider an arbitrary u with

  u < min_{λ∈Λ} sup_{f∈F} f(λ).   (D.5)

It suffices to prove that there exists g ∈ F with

  u < min_{λ∈Λ} g(λ)   (D.6)

since then min_{λ∈Λ} sup_{f∈F} f(λ) = sup u ≤ sup_{g∈F} min_{λ∈Λ} g(λ), where the supremum is over all u satisfying (D.5). For k ∈ P, let P_k be the postulate that ∃ g₁, …, g_k ∈ F with

  u < min_{λ∈Λ} max_{i∈[k]} g_i(λ).   (D.7)

n.b. the desired result (D.6) follows directly from P₁. Consider proving the weaker statement that P_k is true for some large enough but finite k. By (D.4), it suffices to show that

  ∅ =(a) ∩_{f∈F} L(f, u) =(b) ∩_{i∈[k]} L(g_i, u)

for some k ∈ P and g₁, …, g_k ∈ F. To explain (a), suppose to the contrary that there exists λ′ ∈ ∩_{f∈F} L(f, u). Then, f(λ′) ≤ u for every f ∈ F by the definition (D.4) of L, contradicting (D.5). To explain (b), n.b. L(f, u) is closed by the lower semi-continuity of f. Thus, {Λ \ L(f, u) : f ∈ F} is an open cover of the compact space Λ, and so there exists a finite subcover, say {Λ \ L(g_i, u) : i ∈ [k]}, satisfying ∪_{f∈F}[Λ \ L(f, u)] = ∪_{i∈[k]}[Λ \ L(g_i, u)], which implies (b) by De Morgan's laws.

For B = ∩_{i∈[n]} L(f_i, u) for some n ∈ P and f₁, …, f_n, let S_k(B) be the statement that if

  u < min_{λ∈B} max_{i∈[k]} g_i(λ),   (D.8)

then there exists g̃ ∈ F with

  u < min_{λ∈B} g̃(λ).   (D.9)

n.b. S_k(Λ) is the statement that P_k implies P₁, which gives the desired result since P_k holds for some finite k. Let S_k be the statement that S_k(B) holds for all possible values of B, i.e. all finite intersections of L(f_i, u)'s. It suffices to prove S_k by induction on k since that implies S_k(Λ) and therefore P₁ as desired. S₁ holds trivially by setting g̃ = g₁. Consider proving S_k(B) for an arbitrary B by assuming S_{k−1} as an inductive hypothesis. The premise (D.8) implies that

  u < min_{λ∈L(g_k,u)∩B} max_{i∈[k−1]} g_i(λ).

To explain this, suppose to the contrary that there exists λ′ ∈ L(g_k, u) ∩ B with g_i(λ′) ≤ u for all i ∈ [k−1]. However, λ′ ∈ L(g_k, u) ∩ B implies that g_k(λ′) ≤ u as well, and so max_{i∈[k]} g_i(λ′) ≤ u for some λ′ ∈ B, contradicting (D.8). n.b. L(g_k, u) ∩ B is again a finite intersection of L(f_i, u)'s, and so the above inequality satisfies the premise (D.8) of S_{k−1}(L(g_k, u) ∩ B). By the inductive hypothesis S_{k−1}, we have the consequence (D.9) of S_{k−1}(L(g_k, u) ∩ B) that u < min_{λ∈L(g_k,u)∩B} g̃(λ) for some g̃ ∈ F. Together with the fact that u < g_k(λ) for λ ∈ B \ L(g_k, u), we have

  u < min_{λ∈B} max{g̃(λ), g_k(λ)}.   (D.10)

It suffices to prove S₂(B) since (D.10) and S₂(B) with g₁ and g₂ replaced by g̃ and g_k implies (D.9) as desired. Assume the premise (D.8) of S₂(B), i.e.

  u < min_{λ∈B} max_{i∈[2]} g_i(λ).   (D.11)

We can construct the desired g̃ by the following iteration.

Iteration: Let g ∈ F be the function satisfying (D.3). If L(f, u) ∩ B = ∅ for some f ∈ {g₁, g₂, g}, then we have u < min_{λ∈B} f(λ), and so setting g̃ = f gives (D.9) as desired. Consider instead

  L(f, u) ∩ B ≠ ∅  ∀f ∈ {g₁, g₂, g}.

The intersections satisfy the following two properties:
I) L(g₁, u) ∩ B is disconnected from L(g₂, u) ∩ B.
II) L(g, u) ∩ B is a subset of [L(g₁, u) ∪ L(g₂, u)] ∩ B.
To explain I, n.b. the sets L(g_i, u) ∩ B are disjoint because, otherwise, λ′ ∈ L(g₁, u) ∩ L(g₂, u) ∩ B would satisfy max_{i∈[2]} g_i(λ′) ≤ u with λ′ ∈ B, contradicting (D.11). Since the L(g_i, u) ∩ B are finite intersections of closed sets and are therefore closed, disjointness implies disconnectedness. More precisely, suppose to the contrary that there exists λ₁ ∈ L(g₁, u) ∩ B such that every neighborhood of λ₁ contains some point λ₂ ∈ L(g₂, u) ∩ B. λ₁ ∈ L(g₁, u) ∩ B implies λ₁ ∉ L(g₂, u) ∩ B by disjointness, and so u < g₂(λ₁). By making the neighborhood of λ₁ arbitrarily small, the point λ₂ in the neighborhood has g₂(λ₂) arbitrarily close to g₂(λ₁) > u, contradicting λ₂ ∈ L(g₂, u) ∩ B, i.e. g₂(λ₂) ≤ u.
To explain II, suppose to the contrary that there exists λ′ ∈ L(g, u) ∩ B not in [L(g₁, u) ∪ L(g₂, u)] ∩ B, i.e. u < g_i(λ′) for all i ∈ [2] but g(λ′) ≤ u. It follows that (1/2)∑_{i∈[2]} g_i(λ′) > u ≥ g(λ′), contradicting (D.3).
By condition iii for (D.2), L(g, u) ∩ B is a connected set. It follows from the above properties I and II that we have either L(g, u) ∩ B ⊆ L(g₁, u) ∩ B or L(g, u) ∩ B ⊆ L(g₂, u) ∩ B but not both. Possibly by exchanging g₁ and g₂, we can assume

  L(g, u) ∩ B ⊆ L(g₁, u) ∩ B   (D.12)

without loss of generality. Then,

  u < min_{λ∈B} max{g(λ), g₂(λ)}   (D.13)

because u < g(λ) for λ ∈ B \ L(g, u), while u < g₂(λ) for λ ∈ L(g, u) ∩ B because L(g₂, u) ∩ B does not overlap L(g₁, u) ∩ B by property I and therefore does not overlap L(g, u) ∩ B by (D.12). We can now replace g₁ by g and repeat the entire iteration because (D.11) holds by (D.13).

It remains to argue that the iteration eventually terminates with the desired g̃. If L(g, u) ∩ B ≠ ∅, we have

  min_{λ∈B} g(λ) = min_{λ∈L(g,u)∩B} g(λ)
   ≥(a) min_{λ∈L(g,u)∩B} (1/2) ∑_{i∈[2]} g_i(λ)
   >(b) (1/2)[ min_{λ∈B} g₁(λ) + u ]

where (a) is by (D.3) and (b) is because g₂(λ) > u for λ ∈ L(g, u) ∩ B as explained before. The inequality is strict because min_{λ∈Λ} g₁(λ) > −∞ since g₁ is real-valued. Rearranging the last inequalities, we have

  min_{λ∈B} g(λ) − u > (1/2)( min_{λ∈B} g₁(λ) − u ).

Repeating the iteration with g₁ replaced by g eventually makes min_{λ∈B} g(λ) increase to some value larger than u, in which case setting g̃ = g gives (D.9) as desired. ∎
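For intuition, the classical bilinear special case of (D.1) is the finite zero-sum game, where both sides of the exchange can be computed by linear programming. The sketch below is ours, not from the paper; it assumes scipy is available and uses a hypothetical 2×2 payoff matrix Z, and it confirms that the two sides agree:

    # Minimal sketch (ours): both sides of a finite minimax exchange,
    # with zeta(lam, s) bilinear, given by a payoff matrix Z whose rows
    # are Player 2's pure strategies lam and columns are s.
    import numpy as np
    from scipy.optimize import linprog

    Z = np.array([[3.0, 0.0], [1.0, 2.0]])   # hypothetical example

    # min over mixed lam of max over s: minimize v s.t. Z^T x <= v,
    # sum x = 1, x >= 0, with variables (x1, x2, v).
    c = np.array([0.0, 0.0, 1.0])
    A_ub = np.hstack([Z.T, -np.ones((2, 1))])            # Z^T x - v <= 0
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2),
                  A_eq=[[1.0, 1.0, 0.0]], b_eq=[1.0],
                  bounds=[(0, None), (0, None), (None, None)])
    minimax = res.x[2]

    # sup over mixed s of min over lam: maximize w s.t. Z y >= w.
    c2 = np.array([0.0, 0.0, -1.0])
    A_ub2 = np.hstack([-Z, np.ones((2, 1))])             # w - Z y <= 0
    res2 = linprog(c2, A_ub=A_ub2, b_ub=np.zeros(2),
                   A_eq=[[1.0, 1.0, 0.0]], b_eq=[1.0],
                   bounds=[(0, None), (0, None), (None, None)])
    maximin = res2.x[2]

    print(minimax, maximin)   # both 1.5 for this Z: the sides agree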

APPENDIX E
SUPPORT LEMMA

Although the R.H.S. of (D.1) gives us a simpler expression to work with, it is still important to solve the L.H.S. of (D.1) directly for the optimal mixed strategy, i.e. the optimal choice of P_Q and σ. This can be greatly facilitated by the following cardinality bound on Q derived from the Eggleston–Carathéodory theorem [41].

Lemma E.1 (Support Lemma) Consider any function f : Λ_{A|D} × S → ℝ such that f(λ, s) is linear and continuous in λ. Let N_λ be the number of connected components of {f(λ, s) : s ∈ S}. Then, it is admissible to impose the following constraint on the L.H.S. of (D.1):

  |Q| = k ≤ { l,      if N_λ ≤ l for all λ,   (E.1a)
            { l + 1,  otherwise,              (E.1b)

where l := 2^{|V|−|D|} − 2^{|V|−|D|−|A|} − |V| + |D| is the dimension of Λ_{A|D} plus one. □

e.g. if f = β̃ and the channel input has the finite-alphabet constraint, then

  {β̃(λ, P₀ ∏_{i∈D^c} P_i) : P₀ ∈ P(X_D), P_i ∈ P(X_i), i ∈ D^c}

has only one connected component, i.e. N_λ = 1, because the map β̃(λ, ·) is continuous [32, Lemma 2.7] on a connected compact set for every λ ∈ Λ_{A|D}. In this case, (E.1a) is admissible. If the channel input is infinitely-valued, we may use the slightly larger bound (E.1b) instead.

PROOF (PROOF OF LEMMA E.1) Let λ be the column-vector form of λ, i.e. the i-th element of λ is λ(B_i) where (B₁, B₂, …) is an arbitrary enumeration of H_{A|D}. Since f(λ, s) is linear in λ, it can be written in the following matrix form in terms of some column vector b_s independent of λ,

  f(λ, s) = b_s^⊤ λ

where b_s^⊤ denotes the transpose of b_s. By definition, λ satisfies

  λ ≥ 0  and  Mλ = 1

where M is the incidence matrix of the hypergraph H_{A|D}: the entry of M at row i ∈ D^c and column B ∈ H_{A|D} is M_{iB} := χ_B(i), the indicator of whether i is in B. Note that M is a (|V|−|D|)-by-|H_{A|D}| matrix with full row rank since the columns corresponding to the singleton edges {i} for i ∈ D^c are linearly independent. Thus, solving the above linear equation gives the following solution space for Λ_{A|D}:

  λ = λ̂ + Nc = [N  λ̂] [c; 1]   (E.2)

where
· λ̂ is the particular solution λ̂_B := χ{|B| = 1} for B ∈ H_{A|D},
· N is the kernel of M, i.e. a |H_{A|D}|-by-(l−1) matrix, with l := |H_{A|D}| − |V| + |D| + 1 and |H_{A|D}| = 2^{|V|−|D|} − 2^{|V|−|D|−|A|} − 1,
· c is restricted to the following set to ensure the non-negativity of λ:

  C := {c ∈ ℝ^{l−1} : λ̂ + Nc ≥ 0}.

Thus, for every (λ, s), we can write

  f(λ, s) = b_s^⊤ [N  λ̂] [c; 1]

for some c ∈ C. Then, by the linearity of expectation,

  sup_{σ,P_Q} min_λ E[f(λ, σ(Q))] = sup_{σ,P_Q} min_{c∈C} E[b̃_{σ(Q)}]^⊤ [c; 1]

where b̃_s^⊤ := b_s^⊤ [N  λ̂]. Note that E[b̃_{σ(Q)}] is in the convex hull of

  X := {b̃_s : s ∈ S} ⊆ ℝ^l.

By the Eggleston–Carathéodory theorem [41, p. 35], every point in the convex hull is a convex combination of at most l+1 points in X, and of at most l points if, in addition, X has at most l connected components. It follows that restricting |Q| to l+1 and l points in the respective cases does not change the set of possible values of E[b̃_{σ(Q)}], and so the overall optimization gives the same value as desired.¹⁴ ∎

¹⁴This also relies on the observation that c can be chosen as a function of E[b̃_{σ(Q)}] instead of (σ, P_Q) without loss of optimality.
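The dimension count in Lemma E.1 is easy to verify on a concrete hypergraph. The sketch below (ours) builds the incidence matrix M for the hypergraph H_{[2]|∅} used later in Appendix G and checks the full-row-rank claim and the value of l:

    # Minimal sketch (ours): incidence matrix M and the count l in
    # Lemma E.1 for H_{[2]|emptyset} with V = [3], A = [2], D = {}.
    import numpy as np

    V, D, A = {1, 2, 3}, set(), {1, 2}
    H = [{1}, {2}, {3}, {1, 3}, {2, 3}]       # hyperedges B
    Dc = sorted(V - D)

    # M_{iB} = 1 iff i in B; rows i in D^c, columns B in H
    M = np.array([[1 if i in B else 0 for B in H] for i in Dc], float)

    rank = np.linalg.matrix_rank(M)           # full row rank |V|-|D| = 3
    l = len(H) - (len(V) - len(D)) + 1        # l = |H| - |V| + |D| + 1
    assert len(H) == 2**(len(V)-len(D)) - 2**(len(V)-len(D)-len(A)) - 1
    print(rank, l)   # -> 3 3; the kernel N of M is 5-by-(l-1) = 5-by-2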

APPENDIX F
SECRECY EXPRESSIONS

In this section, we will derive some important properties of the expressions that characterize the secrecy upper and lower bounds in §III.

A. Definitions
For λ ∈ Λ_{A|D} (C.2), distribution P_{X_V}, and DMMC P_{Y_V|X_V}, define the following secrecy expression that characterizes the secrecy upper bound in §III:

  α(λ, P_{X_V}) = α(λ, P_{X_V})|_{P_{Y_V|X_V}}
   := H(X_{D^c} Y_{D^c} | X_D Y_D) − ∑_{B∈H_{A|D}} λ_B H(X_B Y_B | X_{B^c} Y_{B^c}) − H(X_{D^c} | X_D) + ∑_B λ_B H(X_B | X_{B^c})   (F.1a)
   = ∑_B λ_B H(Y_{B^c} | X_{B^c}) − H(Y_D | X_D) − ( ∑_B λ_B − 1 ) H(Y_V | X_V)   (F.1b)
   = ∑_{B∋i} λ_B [H(Y_{B^c} | X_{B^c}) − H(Y_D | X_D)] + ∑_{B∌i} λ_B [H(Y_{B^c} | X_{B^c}) − H(Y_V | X_V)]  ∀i ∈ D^c   (F.1c)

where (X_V, Y_V) ∼ P_{X_V} P_{Y_V|X_V}. We provide the different forms (F.1a), (F.1b) and (F.1c) of the same expression because some properties are easier to see in one form than another. The equivalence will be proved in Proposition F.1.

To characterize the secrecy lower bound, define

  β̃(λ, P_{X_V}) = β̃(λ, P_{X_V})|_{P_{Y_V|X_V}} := H(X_{D^c} Y_{D^c} | X_D Y_D) − ∑_{B∈H_{A|D}} λ_B H(X_B Y_B | X_{B^c} Y_{B^c}).   (F.2)

When given a DMMS P_{X_V} instead of the DMMC P_{Y_V|X_V}, the secrecy lower bound can be characterized using the following expression instead:

  β(λ, P_{X_V}) := H(X_{D^c} | X_D) − ∑_{B∈H_{A|D}} λ_B H(X_B | X_{B^c}).   (F.3)

To account for the sample average constraint (2.3) on the input, define for any trusted terminal i ∈ D^c

  α_i(λ, P_{X_V}) = α_i(λ, P_{X_V})|_{P_{Y_V|X_V}}
   := H(X_{D^c} Y_{D^c} | X_D Y_D) − ∑_B λ_B H(X_B Y_B | X_{B^c} Y_{B^c}) − H(X_{D^c} | X_D Y_D) + ∑_B λ_B H(X_B | X_{B^c} [Y_D χ_B(i)])   (F.4a)
   = ∑_B λ_B H(Y_{B^c} | X_{B^c}) − ∑_{B∋i} λ_B H(Y_D | X_{B^c}) − ( ∑_B λ_B − 1 ) H(Y_V | X_V)   (F.4b)
   = ∑_{B∋i} λ_B H(Y_{B^c\D} | X_{B^c} Y_D) + ∑_{B∌i} λ_B [H(Y_{B^c} | X_{B^c}) − H(Y_V | X_V)]   (F.4c)

where [Y_D χ_B(i)] equals Y_D if i ∈ B and the constant 0 otherwise.

To derive the tightness conditions under which the secrecy lower bound meets the upper bound, define

  γ(λ, P_{X_V}) = γ(λ, P_{X_V})|_{P_{Y_V|X_V}}
   := H(X_{D^c} Y_{D^c} | X_D Y_D) − ∑_{B∈H_{A|D}} λ_B H(X_B Y_B | X_{B^c} Y_{B^c}) − H(X_{D^c} | X_D Y_D) + ∑_B λ_B H(X_B | X_{B^c} Y_D)   (F.5a)
   = ∑_B λ_B H(Y_{B^c\D} | X_{B^c} Y_D) − ( ∑_B λ_B − 1 ) H(Y_{D^c} | X_V Y_D)   (F.5b)
   = ∑_{B∋i} λ_B H(Y_{B^c\D} | X_{B^c} Y_D) + ∑_{B∌i} λ_B [H(Y_{B^c} | X_{B^c}) − H(Y_V | X_V) − I(Y_D ∧ X_B | X_{B^c})]   (F.5c)

where i ∈ D^c for (F.5c).

Proposition F.1 (Equivalence) The different forms (F.1a), (F.1b) and (F.1c) of α are equal, and similarly for α_i and γ in (F.4) and (F.5) respectively. □

PROOF (F.1b) can be obtained from (F.1a) by

  H(X_B Y_B | X_{B^c} Y_{B^c}) − H(X_B | X_{B^c})
   = H(X_V Y_V) − H(X_{B^c} Y_{B^c}) − H(X_V) + H(X_{B^c})
   = H(Y_V | X_V) − H(Y_{B^c} | X_{B^c})   (F.6)

and the same identity with B replaced by D^c,

  H(X_{D^c} Y_{D^c} | X_D Y_D) − H(X_{D^c} | X_D) = H(Y_V | X_V) − H(Y_D | X_D).

Similarly, (F.4b) and (F.5b) can be obtained from (F.4a) and (F.5a) respectively with the additional identity that

  H(X_B Y_B | X_{B^c} Y_{B^c}) − H(X_B | X_{B^c} Y_D)
   = H(X_V Y_V) − H(X_{B^c} Y_{B^c}) − H(X_V Y_D) + H(X_{B^c} Y_D)
   = H(Y_{D^c} | X_V Y_D) − H(Y_{B^c\D} | X_{B^c} Y_D)   (F.7)

and the same identity with B replaced by D^c,

  H(X_{D^c} Y_{D^c} | X_D Y_D) − H(X_{D^c} | X_D Y_D) = H(Y_{D^c} | X_V Y_D).

n.b. (F.7) is the same as (F.6) except for the conditioning on Y_D. (F.1b), (F.4b) and (F.5b) can be obtained from (F.1c), (F.4c) and (F.5c) respectively using the constraint (C.2) for fractional partitions that ∑_{B∋i} λ_B = 1. Using ∑_B λ_B − 1 = ∑_{B∌i} λ_B, we can derive the equivalence in the other direction. ∎

B. Properties
All the secrecy expressions α, β, β̃, α_i and γ are linear¹⁵ in their first argument λ ∈ Λ_{A|D}, and in the marginal P_{X_D} of their second argument P_{X_V}. In addition, α_i and γ are both concave in the input distribution as shown below, while β and β̃ are non-negative by the Shearer-type lemma (C.3a).

Proposition F.2 (Concavity) α_i(λ, P_{X_V}) and γ(λ, P_{X_V}) are concave in P_{X_V} for all λ ∈ Λ_{A|D}. □

This implies the following concavity, which will be useful in deriving the secrecy upper bound under the sample average constraint (2.3).

Corollary F.1 min_{i∈D^c} α_i(λ, P_{X_V}) is concave in P_{X_V} for all λ ∈ Λ_{A|D}. □

PROOF The corollary follows from the concavity of α_i and the fact that the minimum of concave functions is concave. To prove Proposition F.2, consider each entropy term in (F.4c) and (F.5b). H(Y_V | X_V) and H(Y_{D^c} | X_V Y_D) are both linear in P_{X_V}. The other terms can be expressed in terms of the generalized divergence in (A.7) as follows:¹⁶

  H(Y_{B^c} | X_{B^c}) = −D̃(P_{X_{B^c} Y_{B^c}} ∥ P_{X_{B^c}})
  H(Y_{B^c\D} | X_{B^c} Y_D) = −D̃(P_{X_{B^c} Y_{B^c}} ∥ P_{X_{B^c} Y_D}).

These entropy terms are concave in P_{X_V} because
– D̃(f∥g) is convex in the pair (f, g) by the log-sum inequality [23], and
– the arguments P_{X_{B^c} Y_{B^c}}, P_{X_{B^c}} and P_{X_{B^c} Y_D} are all linear in P_{X_V} with P_{Y_V|X_V} fixed.
Thus, α_i and γ have the desired concavity since they are positively weighted sums of concave functions. ∎

¹⁵Functions that are affine in λ ∈ Λ_{A|D} are also linear in λ because any constant term can be written as a linear function of λ by the constraint (C.2) that ∑_{B∋i} λ_B = 1.
¹⁶The definition of D̃(f∥g) in (A.7) requires f and g to share the same domain. To do so, we have implicitly used the trivial extension P_{Z₁}(z₁, z₂) = P_{Z₁}(z₁) for all z₂. Since this extension is not stochastic, we use the generalized divergence instead of (A.6).

The following identities relate the different expressions.

Proposition F.3 For any λ ∈ Λ_{A|D} and P_{X_V} ∈ P(X_V),

  α(λ, P_{X_V}) = β̃(λ, P_{X_V}) − β(λ, P_{X_V})   (F.8a)
   = α_i(λ, P_{X_V}) − ∑_{B∋i} λ_B I(Y_D ∧ X_{B^c\D} | X_D)   (F.8b)
  γ(λ, P_{X_V}) = β̃(λ, P_{X_V}) − E_{Y_D}[β(λ, P_{X_V|Y_D}(·|Y_D))]   (F.9a)
   = α_i(λ, P_{X_V}) − ∑_{B∌i} λ_B I(Y_D ∧ X_B | X_{B^c}).   (F.9b)

□

PROOF (F.8a) follows immediately from (F.1b), (F.2) and (F.3). Similarly, (F.9a) follows from (F.5b), (F.2) and (F.3). (F.8b) follows from (F.1c) and (F.4c) since

  H(Y_{B^c\D} | X_{B^c} Y_D) − [H(Y_{B^c} | X_{B^c}) − H(Y_D | X_D)] = I(Y_D ∧ X_{B^c\D} | X_D).

Similarly, (F.9b) follows from (F.5c) and (F.4c). ∎
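Proposition F.1 can be spot-checked numerically: the identity (F.6) behind it holds for every joint distribution, so (F.1a) and (F.1b) must agree for any randomly drawn input and channel. The following self-contained check is ours; the instance V = [3], A = [2], D = ∅ and the partition λ(1) with λ_{1} = λ_{2,3} = 1 are our choices:

    # Minimal numerical check (ours) that (F.1a) and (F.1b) agree.
    import itertools, math, random

    random.seed(1)
    X = list(itertools.product((0, 1), repeat=3))
    PX = [random.random() for _ in X]; PX = [p / sum(PX) for p in PX]
    PYgX = {x: [random.random() for _ in X] for x in X}
    PYgX = {x: [p / sum(v) for p in v] for x, v in PYgX.items()}
    pmf = {x + y: PX[i] * PYgX[x][j]
           for i, x in enumerate(X) for j, y in enumerate(X)}
    # key indices 0,1,2 -> X1,X2,X3 and 3,4,5 -> Y1,Y2,Y3

    def H(idx, given=()):
        def marg(js):
            out = {}
            for k, v in pmf.items():
                key = tuple(k[j] for j in js)
                out[key] = out.get(key, 0.0) + v
            return out
        jnt, cnd = marg(tuple(idx) + tuple(given)), marg(tuple(given))
        return -sum(v * math.log2(v / cnd[k[len(idx):]])
                    for k, v in jnt.items() if v > 0)

    lam = {frozenset({1}): 1.0, frozenset({2, 3}): 1.0}
    V = {1, 2, 3}
    def XY(B):   # index tuples for X_B and Y_B (users are 1-based)
        return tuple(i - 1 for i in B), tuple(i + 2 for i in B)

    F1a = H((0, 1, 2, 3, 4, 5)) - H((0, 1, 2))        # D is empty
    F1b = -(sum(lam.values()) - 1) * H((3, 4, 5), given=(0, 1, 2))
    for B, w in lam.items():
        xB, yB = XY(B); xBc, yBc = XY(V - B)
        F1a += w * (H(xB, given=xBc) - H(xB + yB, given=xBc + yBc))
        F1b += w * H(yBc, given=xBc)
    print(F1a, F1b, abs(F1a - F1b) < 1e-9)            # forms agree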

From these identities, we can derive sufficient conditions under which the secrecy expressions are equivalent, implying that the secrecy bounds are tight. In particular, we consider the following conditions on the channel input distribution P_{X_V} and the channel statistics P_{Y_V|X_V}.

Conditional independence condition:

  P_{X_V} = P_{X_D} ∏_{i∈D^c} P_{X_i|X_D}   (F.10)

Single-leakage condition: The channel P_{Y_V|X_V} satisfies

  ∃ s ∈ D^c:  P_{Y_D|X_V} = P_{Y_D|X_{D∪{s}}}   (F.11)

which is independent of the input distribution P_{X_V}. Roughly speaking, the channel output symbols Y_D of the untrusted terminals are affected by the input symbol X_s of at most one trusted terminal (hence the name single-leakage) and the input symbols X_D of the untrusted terminals. In particular, this is satisfied if D = ∅ or |Y_D| ≤ 1.

Proposition F.4 (Conditions for equivalence) Consider the following equalities for all λ ∈ Λ_{A|D} and s ∈ D^c:

  β̃(λ, P_{X_V}) =(a) α(λ, P_{X_V}) =(c) α_s(λ, P_{X_V}) =(b) γ(λ, P_{X_V}).   (F.12)

The conditional independence condition (F.10) on the input distribution implies (a), while the single-leakage condition (F.11) on the channel statistics implies (b). (c) holds if |Y_D| ≤ 1 or if both conditions (F.10) and (F.11) hold. □

PROOF To show (a), consider the identity (F.8a). By the conditional independence condition (F.10) on the channel input X_V, we have β(λ, P_{X_V}) = 0 by the equality case (C.3b) of the Shearer-type lemma.
To show (b), consider the identity (F.9b). By the single-leakage condition (F.11) on the channel, we have

  0 = I(Y_D ∧ X_{D^c\{s}} | X_{D∪{s}}) ≥ I(Y_D ∧ X_B | X_{B^c})


for all B ∈ H_{A|D} with B ∌ s, where the last inequality is because B^c ⊇ D ∪ {s}. This implies (b).
To show (c), consider the identity (F.8b). The case |Y_D| ≤ 1 is trivial. Consider the other case where both (F.10) and (F.11) hold. We have for all B ∈ H_{A|D} with B ∋ s that

  I(Y_D ∧ X_{B^c\D} | X_D) ≤ I(X_s Y_D ∧ X_{D^c\{s}} | X_D)
   = I(X_s ∧ X_{D^c\{s}} | X_D) + I(Y_D ∧ X_{D^c\{s}} | X_{D∪{s}})
   =(i) 0 + (ii) 0

where (i) and (ii) follow directly from (F.10) and (F.11) respectively. This gives the desired equality (c). ∎

Suppose the DMMC consists of a set of ℓ simultaneous independent channels defined below.

Simultaneous independent channels:¹⁷

  P_{Y_V|X_V} = ∏_{j∈L} P_{Y_{jV}|X_{jV}}   (F.13)

where L := [ℓ] for some positive integer ℓ.

¹⁷Simultaneity means no one can observe any channel output symbol until all input symbols are specified.

Then, α_i and γ satisfy the following maximality of independent input distributions, which is useful in studying the optimality of the secrecy bounds in §V and §VI.

Proposition F.5 (Maximality of independent input) Given that P_{Y_V|X_V} consists of a set {P_{Y_{jV}|X_{jV}} : j ∈ L} of simultaneous independent channels (F.13), we have

  α_i(λ, P_{X_{LV}})|_{P_{Y_V|X_V}} ≤ ∑_{j∈L} α_i(λ, P_{X_{jV}})|_{P_{Y_{jV}|X_{jV}}}   (F.14a)
  γ(λ, P_{X_{LV}})|_{P_{Y_V|X_V}} ≤ ∑_{j∈L} γ(λ, P_{X_{jV}})|_{P_{Y_{jV}|X_{jV}}}   (F.14b)

with equality if the X_{jV}'s are independent over j ∈ L. □

PROOF We will bound each entropy term in (F.4c) and (F.5b).

  H(Y_V | X_V) = H(Y_{LV} | X_{LV}) =(a) ∑_{j∈L} H(Y_{jV} | X_{LV} Y_{[j−1]V}) =(b) ∑_{j∈L} H(Y_{jV} | X_{jV})

where (a) is by the chain rule (A.3) and (b) is by the assumption (F.13) of simultaneity and independence of the component channels. Similarly,

  H(Y_{D^c} | X_V Y_D) = H(Y_{LD^c} | X_{LV} Y_{LD}) =(a) ∑_{j∈L} H(Y_{jD^c} | X_{LV} Y_{LD} Y_{[j−1]D^c}) =(b) ∑_{j∈L} H(Y_{jD^c} | X_{jV} Y_{jD})

where (a) and (b) follow from the same reasoning as before. The remaining entropy terms can be bounded using (A.4):

  H(Y_{B^c\D} | X_{B^c} Y_D) = H(Y_{LB^c\D} | X_{LB^c} Y_{LD})
   = ∑_{j∈L} H(Y_{jB^c\D} | X_{LB^c} Y_{LD} Y_{[j−1]B^c\D})
   ≤ ∑_{j∈L} H(Y_{jB^c\D} | X_{jB^c} Y_{jD})

  H(Y_{B^c} | X_{B^c}) = H(Y_{LB^c} | X_{LB^c})
   = ∑_{j∈L} H(Y_{jB^c} | X_{LB^c} Y_{[j−1]B^c})
   ≤ ∑_{j∈L} H(Y_{jB^c} | X_{jB^c}).

Substituting these inequalities into (F.4c) and (F.5b) gives (F.14a) and (F.14b), with equality when the X_{jV}'s are independent over j ∈ L. ∎

APPENDIX G
COUPLING CHANNEL

In this section, we will give the detailed computation for the key rates in Table I for the coupling channel in §IV.

A. Preliminaries
We first carry out some preliminary calculations. With V = [3], A = [2] and D = ∅, the hypergraph H_{A|D} in (A.1) is

  H_{[2]|∅} = {{1}, {2}, {3}, {1,3}, {2,3}}.

Λ_{[2]|∅} in (C.2) is the convex hull of the following basic fractional partitions.¹⁸

           λ_{1}  λ_{2}  λ_{3}  λ_{2,3}  λ_{1,3}
  λ(1) := (  1,     0,     0,      1,       0  )
  λ(2) := (  0,     1,     0,      0,       1  )   (G.1)
  λ(3) := (  1,     1,     1,      0,       0  )

¹⁸This is the same as the fractional partitions illustrated in Fig. 6 since the two hypergraphs have the same set of edges.

Consider an arbitrary joint distribution for the channel input,

  P_{X₁X₂} = [P_{X₁X₂}(0,0)  P_{X₁X₂}(0,1); P_{X₁X₂}(1,0)  P_{X₁X₂}(1,1)] = [a  b; c  d]   (G.2)

where a, b, c, d ∈ [0,1] satisfy a + b + c + d = 1. From the definition (4.1) of the coupling channel, we can compute the entropies H(Y_{B^c} | X_{B^c}) for B ∈ H_{A|D} ∪ {∅}. The results are summarized in Table II. We will show the computation for the case B = {1} as an example. With h defined as the binary entropy function in (3.2),

  H(Y₂Y₃ | X₂) = H(N₃ | X₂) + H(Y₂ | X₂, N₃)
   = H(N₃) + H(Y₂ | X₂=0, N₃)(a + c) + [H(Y₂ | X₂=1, N₃=0) + H(Y₂ | X₂=1, N₃=1)] (b + d)/2
   = 1 + (a + c) + [h(P_{X₁N₂|X₂N₃}(0,1|1,0)) + h(P_{X₁N₂|X₂N₃}(0,0|1,1))] (b + d)/2
   = 1 + a + c + (b + d) h(b/(2(b + d)))

since H(N₃ | X₂) = H(N₃) = 1, H(Y₂ | X₂=0, N₃) = H(N₂) = 1 and P_{X₁N₂|X₂N₃}(0,1|1,0) = P_{X₁N₂|X₂N₃}(0,0|1,1) = b/(2(b + d)), which is the desired expression for B = {1} in Table II.

TABLE II
ENTROPY TERMS FOR COUPLING CHANNEL WITH CORRELATED INPUT (G.2)

  B       H(Y_{B^c} | X_{B^c})
  ∅       H(Y_{123} | X_{12})†  = 2 + b + c
  {1}     H(Y_{23} | X_2)       = 1 + a + c + (b+d) h(b/(2(b+d)))
  {2}     H(Y_{13} | X_1)       = 1 + c + d + (a+b) h(b/(2(a+b)))
  {3}     H(Y_{12} | X_{12})    = 2
  {1,3}   H(Y_2 | X_2)          = 1
  {2,3}   H(Y_1 | X_1)          = 1

†Y_{123} is short for (Y₁, Y₂, Y₃), and similarly for the others.

Using the results in Table II, we evaluate α for each of the basic fractional partitions in (G.1) as follows.

  α(λ(1), P_{X_V}) = H(Y_{23}|X_2) + H(Y_1|X_1) − H(Y_{123}|X_{12})
   = a − b + (b+d) h(b/(2(b+d)))   (G.3a)
  α(λ(2), P_{X_V}) = H(Y_{13}|X_1) + H(Y_2|X_2) − H(Y_{123}|X_{12})
   = d − b + (a+b) h(b/(2(a+b)))   (G.3b)
  α(λ(3), P_{X_V}) = H(Y_{23}|X_2) + H(Y_{13}|X_1) + H(Y_{12}|X_{12}) − 2H(Y_{123}|X_{12})
   = a + d − 2b + (b+d) h(b/(2(b+d))) + (a+b) h(b/(2(a+b)))   (G.3c)

From these, we have the equality

  α(λ(3), P_{X_V}) = α(λ(1), P_{X_V}) + α(λ(2), P_{X_V}).   (G.4)

Consider the independence constraint on the input distribution that, for some p₁, p₂ ∈ [0,1],

  P_{X₁} = Bern_{p₁}   (G.5a)
  P_{X₂} = Bern_{p₂}   (G.5b)
  P_{X₁X₂} = P_{X₁} P_{X₂}.   (G.5c)

With this, the entropy terms in Table II become those in Table III, obtained using the substitution

  [a  b; c  d] = [(1−p₁)(1−p₂)  (1−p₁)p₂; p₁(1−p₂)  p₁p₂].   (G.6)

TABLE III
ENTROPY TERMS FOR COUPLING CHANNEL WITH INDEPENDENT INPUT (G.5)

  B       H(Y_{B^c} | X_{B^c})
  ∅       H(Y_{123} | X_{12})  = 2 + p₁(1−p₂) + p₂(1−p₁)
  {1}     H(Y_{23} | X_2)      = 2 − p₂ + p₂ h((1−p₁)/2)
  {2}     H(Y_{13} | X_1)      = 1 + p₁ + (1−p₁) h(p₂/2)
  {3}     H(Y_{12} | X_{12})   = 2
  {1,3}   H(Y_2 | X_2)         = 1
  {2,3}   H(Y_1 | X_1)         = 1

With independent input, α is equal to β̃ by equivalence relation (a) of Proposition F.4. Thus, we have from (G.3) that

  β̃(λ(1), P_{X_V}) = (1−p₁)(1−2p₂) + p₂ h((1−p₁)/2)   (G.7a)
  β̃(λ(2), P_{X_V}) = p₂(2p₁−1) + (1−p₁) h(p₂/2)   (G.7b)
  β̃(λ(3), P_{X_V}) = β̃(λ(1), P_{X_V}) + β̃(λ(2), P_{X_V}).   (G.7c)

TABLE IV
OPTIMAL PUBLIC MESSAGE RATES FOR COUPLING CHANNEL

  user (i)   optimal public message rate† (r_i)
  1, 2       p² + h(p) + (1−p)[2 − p − h((1−p)/2)] ≈ 1.578
  3          p + (1−p) h((1−p)/2) ≈ 0.919

†p ≈ 0.44 is the optimal solution to (e) in (G.8).
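The identity (G.4) is immediate from the closed forms in Table II, and can be confirmed numerically for an arbitrary input (G.2); the snippet below is a minimal check (ours):

    # Quick numerical check (ours) of the identity (G.4)
    # alpha(lam3) = alpha(lam1) + alpha(lam2), from the closed forms.
    import math, random

    def h(p):   # binary entropy in bits, with h(0) = h(1) = 0
        return 0.0 if p in (0.0, 1.0) else -p*math.log2(p) - (1-p)*math.log2(1-p)

    random.seed(2)
    w = [random.random() for _ in range(4)]
    a, b, c, d = (v / sum(w) for v in w)          # arbitrary input (G.2)

    alpha1 = a - b + (b + d) * h(b / (2*(b + d)))           # (G.3a)
    alpha2 = d - b + (a + b) * h(b / (2*(a + b)))           # (G.3b)
    alpha3 = (a + d - 2*b + (b + d) * h(b / (2*(b + d)))    # (G.3c)
              + (a + b) * h(b / (2*(a + b))))
    print(abs(alpha3 - (alpha1 + alpha2)) < 1e-12)          # (G.4) holds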

B. Optimal pure source emulation
In the pure source emulation approach, user 1 and user 2 transmit independent inputs i.i.d. over time, i.e.

  P_{X₁ⁿX₂ⁿ}(x₁ⁿ, x₂ⁿ) = ∏_{t∈[n]} P_{X₁}(x₁t) P_{X₂}(x₂t).

Then, each user i ∈ [3] broadcasts a public message at rate r_i such that user 1 and user 2 can recover the entire channel input and output sequences (X₁ⁿ, X₂ⁿ, Y₁ⁿ, Y₂ⁿ, Y₃ⁿ). By minimizing the sum rate ∑_{i∈[3]} r_i, we maximize the asymptotic rate of the extractable key independent of the public messages [9]. The maximum key rate is

  R_pse =(a) max_{P_{X₁X₂}=P_{X₁}P_{X₂}} min_{λ∈Λ_{[2]|∅}} β̃(λ, P_{X₁X₂})
   =(b) max_{P_{X₁X₂}=P_{X₁}P_{X₂}} min_{k∈[2]} β̃(λ(k), P_{X₁X₂})
   =(c) max_{p₁,p₂∈[0,1]} min{g(p₁, p₂), g(1−p₂, 1−p₁)}
   =(d) max_{p∈[0,1]} g(p, 1−p) = max_{p∈[0,1]} (1−p)(2p−1) + (1−p) h((1−p)/2)
   ≈(e) 0.41   (G.8)

where g(p₁, p₂) := (1−p₁)(1−2p₂) + p₂ h((1−p₁)/2).
(a) The equality follows from (3.15c) with Q deterministic.
(b) Since min_λ β̃(λ, P_{X₁X₂}) is a linear program, the optimal value is achieved at some basic fractional partition in (G.1). We need not consider λ(3) since it cannot achieve a smaller value than λ(1) (or λ(2)) by (G.7c) and the non-negativity of β̃ by (C.3a) of the Shearer-type lemma.
(c) This is by (G.7a) and (G.7b) under (G.5).
(d) The maximum is achieved at p₁ = 1 − p₂. (See Fig. 7(a).)
(e) The maximum is achieved at p ≈ 0.44. (See Fig. 7(b).)
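Step (e) is a one-dimensional optimization that is easy to reproduce; the sketch below (ours — the paper instead used the shuffled complex evolution implementation in [43]) recovers R_pse ≈ 0.41 at p ≈ 0.44 by brute force:

    # Reproduce (e) of (G.8) by brute force (ours): maximize g(p, 1-p).
    import math

    def h(p):
        return 0.0 if p in (0.0, 1.0) else -p*math.log2(p) - (1-p)*math.log2(1-p)

    def g(p1, p2):                       # beta-tilde(lam1) from (G.7a)
        return (1 - p1)*(1 - 2*p2) + p2*h((1 - p1)/2)

    best = max((g(p, 1 - p), p)
               for p in (i / 10**5 for i in range(10**5 + 1)))
    print(best)   # -> about (0.412, 0.44): R_pse ~ 0.41 at p ~ 0.44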


[Fig. 7 shows the two objective surfaces for the pure source emulation scheme: (a) min{g(p₁, p₂), g(1−p₂, 1−p₁)} over (p₁, p₂) ∈ [0,1]², and (b) g(p, 1−p) over p ∈ [0,1].]

Fig. 7. Optimal input distribution for the pure source emulation scheme: P_{X₁X₂}(x₁, x₂) = Bern_{p₁}(x₁) Bern_{p₂}(x₂) with p₁ = 1 − p₂ = p ≈ 0.44.

Although we do not know the optimal choice of the key and public discussion functions, we can compute the optimal choice of the public message rates r := (r_i : i ∈ [3]) shown in Table IV as follows. By the strong duality theorem [42],

  min_λ β̃(λ, P_{X_V}) = H(X_{[2]} Y_{[3]}) − min_r ∑_{i∈[3]} r_i   (G.9)

where, for all B ∈ H_{[2]|∅},

  ∑_{i∈B} r_i ≥ H(X_B Y_B | X_{B^c} Y_{B^c}) = H(Y_{[3]} | X_{[2]}) − H(Y_{B^c} | X_{B^c}) + H(X_B).   (G.10)

By the equality (d) in (G.8), λ(1) and λ(2) are both optimal solutions to the L.H.S. of (G.9). Applying the complementary slackness theorem [42], we have for any optimal solution r that λ(1)_B > 0 or λ(2)_B > 0 implies equality in (G.10) for the particular B ∈ H_{[2]|∅}. This gives a set of equations, from which we can solve for the optimal rates as given in Table IV.

C. Optimal mixed source emulation
The computation for the mixed source emulation approach proceeds in the same way as the pure source emulation approach described in the previous section, except that user 1 and user 2 transmit conditionally independent inputs given a chosen public auxiliary component source Q, i.e.

  P_{Qⁿ,X₁ⁿ,X₂ⁿ}(qⁿ, x₁ⁿ, x₂ⁿ) = ∏_{t∈[n]} P_Q(q_t) P_{X₁|Q}(x₁t | q_t) P_{X₂|Q}(x₂t | q_t).

The maximum key rate is

  R_mse =(a) max_{P_{Q,X₁,X₂}=P_Q P_{X₁|Q} P_{X₂|Q}} min_{λ∈Λ_{[2]|∅}} E[β̃(λ, P_{X₁,X₂|Q}(·|Q))]
   =(b) max_{P_{Q,X₁,X₂}} min_{k∈[2]} E[β̃(λ(k), P_{X₁,X₂|Q}(·|Q))]
   =(c) max_{p₀,p_{iq}∈[0,1]: i∈[2], q∈{0,1}} min{g′(p₀, p₁₀, p₂₀, p₁₁, p₂₁), g′(p₀, 1−p₂₀, 1−p₁₀, 1−p₂₁, 1−p₁₁)}
   =(d) (1/2)(log 17 − 3) ≈ 0.54   (G.11)

where

  g′(p₀, p₁₀, p₂₀, p₁₁, p₂₁) := (1−p₀) g(p₁₀, p₂₀) + p₀ g(p₁₁, p₂₁).

(a) The equality follows from (3.15a).
(b) Same reason as (b) of (G.8).
(c) By (G.7a) and (G.7b), averaged over a binary auxiliary component source with the conditional input distribution

  P_Q := Bern_{p₀},  P_{X_i|Q}(·|q) := Bern_{p_{iq}},  i ∈ [2], q ∈ {0,1}.

By the Support Lemma E.1, it does not lose optimality to choose Q binary, as there are only two choices for the basic fractional partitions in (b).
(d) The maximum is achieved at p₀ = 1/2, (p₁₀, p₂₀) = (0, 2/17) and (p₁₁, p₂₁) = (15/17, 1). It can be obtained using a global maximization algorithm. We used the shuffled complex evolution in [43].

We can arrive at the same answer using the alternative form of R_mse from (3.15b):

  R_mse = min_{λ∈Λ_{[2]|∅}} max_{P_{X₁,X₂}=P_{X₁}P_{X₂}} β̃(λ, P_{X₁,X₂})
   =(a) min_{P_L} max_{P_{X₁,X₂}} β̃(E[λ(L)], P_{X₁,X₂}),  L ∈ [3]
   =(b) min_{P_L} max_{P_{X₁,X₂}} E[β̃(λ(L), P_{X₁,X₂})]
   =(c) min_{P_L} max_{P_{X₁,X₂}} [(P_L(1)+P_L(3)) β̃(λ(1), P_{X₁,X₂}) + (P_L(2)+P_L(3)) β̃(λ(2), P_{X₁,X₂})]   (G.12)
   =(d) min_θ max_{P_{X₁,X₂}} [θ β̃(λ(1), P_{X₁,X₂}) + (1−θ) β̃(λ(2), P_{X₁,X₂})]
   = min_{θ∈[0,1]} max_{p₁,p₂∈[0,1]} g″(θ, p₁, p₂)
   =(e) (1/2)(log 17 − 3) ≈ 0.54

where g″(θ, p₁, p₂) := θ g(p₁, p₂) + (1−θ) g(1−p₂, 1−p₁).
(a) Because every λ ∈ Λ_{A|D} is a convex combination E[λ(L)] of the basic fractional partitions in (G.1).
(b) By the linearity of expectation and of β̃(λ, P_{X₁X₂}) in λ.
(c) By (G.7c).
(d) Since β̃ is non-negative, it is optimal to choose P_L(3) = 0; θ := P_L(1).
(e) Using [43], it can be shown that the minimax value is achieved by choosing θ = 1/2. The corresponding optimal choice of (p₁, p₂) for θ = 1/2 is (0, 2/17) or (15/17, 1).

Unlike the previous optimization, the optimal input distribution for the mixed source emulation is not immediately available from the optimal solutions of the current optimization. This is because the operational meanings of the optimal solutions are changed when we apply the minimax-type lemma to obtain the current optimization from the previous one. While the current optimization involves two fewer parameters than the previous one, it is a minimax problem rather than a pure maximization problem.
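As a check on step (d) of (G.11) (ours, using only the closed forms above), evaluating g′ at the reported optimizer shows that both arguments of the min equal (1/2)(log 17 − 3):

    # Evaluate (G.11) at the reported optimizer (ours): p0 = 1/2,
    # (p10, p20) = (0, 2/17), (p11, p21) = (15/17, 1).
    import math

    def h(p):
        return 0.0 if p in (0.0, 1.0) else -p*math.log2(p) - (1-p)*math.log2(1-p)

    def g(p1, p2):
        return (1 - p1)*(1 - 2*p2) + p2*h((1 - p1)/2)

    def gp(p0, p10, p20, p11, p21):       # g' of (G.11)
        return (1 - p0)*g(p10, p20) + p0*g(p11, p21)

    p0, (p10, p20), (p11, p21) = 0.5, (0.0, 2/17), (15/17, 1.0)
    v1 = gp(p0, p10, p20, p11, p21)
    v2 = gp(p0, 1 - p20, 1 - p10, 1 - p21, 1 - p11)
    target = (math.log2(17) - 3) / 2
    print(v1, v2, target)   # all three agree: ~0.5437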


D. Secrecy upper bound
By (3.8), the secrecy upper bound is

  C_su = min_{λ∈Λ_{[2]|∅}} max_{P_{X₁X₂}} α(λ, P_{X₁X₂})
   =(a) min_{P_L} max_{P_{X₁X₂}} α(E[λ(L)], P_{X₁X₂})
   =(b) min_{P_L} max_{P_{X₁X₂}} E[α(λ(L), P_{X₁X₂})]
   =(c) min_{P_L} max_{a,b,c,d∈[0,1]: a+b+c+d=1} [(P_L(1)+P_L(3)) f(a,b,c,d) + (P_L(2)+P_L(3)) f(d,b,c,a)]
   =(d) 2 − (1/2) log 7 ≈ 0.60

where f(a,b,c,d) := a − b + (b+d) h(b/(2(b+d))).
(a) Same reason as (a) in (G.12).
(b) By the linearity of expectation and of α(λ, P_{X₁X₂}) in λ.
(c) By (G.3) and (G.4), setting P_{X₁X₂} = [a  b; c  d].
(d) Using [43], f(a,b,c,d) is maximized at

  (a, b, c, d) = (3/7, 1/7, 0, 3/7)   (G.13)

under the constraint that (a,b,c,d) is stochastic. Since a = d in this case, we have f(d,b,c,a) = f(a,b,c,d), which is also maximized. Thus, (G.13) is the optimal solution for every choice of P_L. The optimal choice of P_L must have P_L(3) = 0. However, P_L(1) and P_L(2) can be arbitrary since f(d,b,c,a) = f(a,b,c,d) optimally.

Alternatively as before, we can turn the minimax problem into a maximization problem by the minimax-type lemma,

  C_su = max_{P_{Q,X₁,X₂}} min_{λ∈Λ_{A|D}} E[α(λ, P_{X₁X₂|Q}(·|Q))]
   = max min{(1−p) f(a₀,b₀,c₀,d₀) + p f(a₁,b₁,c₁,d₁), (1−p) f(d₀,b₀,c₀,a₀) + p f(d₁,b₁,c₁,a₁)}

where the maximization is over p, a₀, b₀, c₀, d₀, a₁, b₁, c₁, d₁ with a_q + b_q + c_q + d_q = 1 for all q ∈ {0,1}, and where we have chosen

  P_Q = Bern_p  and  P_{X₁X₂|Q}(·|q) = [a_q  b_q; c_q  d_q]  ∀q ∈ {0,1}.

Using [43], we obtain the same upper bound.

APPENDIX H
CONSENSUS CHANNEL

In this section, we will give the detailed computation for the secret key rates of the consensus channel considered in §VII.

A. Preliminaries
Let the input distribution be

  P_{X₁X₂} := [P_{X₁X₂}(0,0)  P_{X₁X₂}(0,1); P_{X₁X₂}(1,0)  P_{X₁X₂}(1,1)] := [a  b; c  d]   (H.1)

with the constraint that a, b, c, d ∈ [0,1] and a + b + c + d = 1. For the consensus channel P_{Y|X₁X₂} defined in (7.1), we have

  H(Y | X₁X₂) = H(Y | X₁X₂ χ{X₁=X₂})
   = Pr{X₁=X₂} H(Y | X₁X₂, X₁=X₂) + Pr{X₁≠X₂} H(Y | X₁X₂, X₁≠X₂)
   =(i) b + c   (H.2a)

since H(Y | X₁X₂, X₁=X₂) = 0, H(Y | X₁X₂, X₁≠X₂) = H(N) = 1 and Pr{X₁≠X₂} = b + c,

  H(Y | X₁) = ∑_{x₁∈{0,1}} P_{X₁}(x₁) H(Y | X₁=x₁)
   =(ii) (a+b) h(b/(2(a+b))) + (c+d) h(c/(2(c+d))) =: f(a,b,c,d)   (H.2b)

and

  H(Y | X₂) =(iii) f(a,c,b,d).   (H.2c)

(i) This follows from the definition (7.1) of Y.
(ii) Given X₁ = 0, we have Y = 1 iff X₂ = N = 1 by (7.1), which occurs with probability P_{X₂|X₁}(1|0) P_N(1) = b/(2(a+b)) by the independence of N and (X₁, X₂). Thus, H(Y | X₁=0) = h(b/(2(a+b))) and, similarly, H(Y | X₁=1) = h(c/(2(c+d))).
(iii) This is by the symmetry of the consensus channel that P_{Y|X₁X₂}(y|x₁,x₂) = P_{Y|X₁X₂}(y|x₂,x₁).

Suppose we have X₁ independent of X₂ instead, with

  P_{X₁} = Bern_{δ₁}  and  P_{X₂} = Bern_{δ₂}.   (H.3)

Then, it follows that

  H(Y | X₁X₂) = δ₁(1−δ₂) + δ₂(1−δ₁)   (H.4a)
  H(Y | X₁) = (1−δ₁) h(δ₂/2) + δ₁ h((1−δ₂)/2) =: g(δ₁, δ₂)   (H.4b)
  H(Y | X₂) = g(δ₂, δ₁).   (H.4c)

B. Computation of secrecy bounds
By Theorem 3.3, we have the secrecy lower bound

  R_se := min_{λ∈Λ_{[2]|∅}} max_{P_{X₁X₂}=P_{X₁}P_{X₂}} β̃(λ, P_{X₁X₂})
   =(a) max_{P_{X₁X₂}} [H(Y|X₁) + H(Y|X₂) − H(Y|X₁X₂)]
   =(b) max_{δ₁,δ₂∈[0,1]} [g(δ₁,δ₂) + g(δ₂,δ₁) − δ₁(1−δ₂) − δ₂(1−δ₁)]
   =(c) 7/2 − (3/2) log 3 ≈ 1.12   (H.5)

where g(δ₁, δ₂) := (1−δ₁) h(δ₂/2) + δ₁ h((1−δ₂)/2).
(a) Because there is only one possible fractional partition.
(b) Let X₁ and X₂ be independent random variables distributed as in (H.3). Then, (b) follows from (H.4).
(c) The maximum is uniquely achieved at δ₁ = δ₂ = 1/2.
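Both closed-form values above can be confirmed directly; the sketch below (ours) evaluates f at the maximizer (G.13) for the coupling channel and the bracketed objective of (H.5) at δ₁ = δ₂ = 1/2 for the consensus channel:

    # Numerical confirmation (ours) of (d) in Appendix G.D and (c) in (H.5).
    import math

    def h(p):
        return 0.0 if p in (0.0, 1.0) else -p*math.log2(p) - (1-p)*math.log2(1-p)

    # Coupling channel: f(a,b,c,d) = a - b + (b+d) h(b / (2(b+d)))
    f = lambda a, b, c, d: a - b + (b + d)*h(b / (2*(b + d)))
    print(f(3/7, 1/7, 0, 3/7), 2 - math.log2(7)/2)     # both ~0.5963

    # Consensus channel: g(d1,d2) = (1-d1) h(d2/2) + d1 h((1-d2)/2)
    g = lambda d1, d2: (1 - d1)*h(d2/2) + d1*h((1 - d2)/2)
    d1 = d2 = 0.5
    Rse = g(d1, d2) + g(d2, d1) - d1*(1 - d2) - d2*(1 - d1)
    print(Rse, 3.5 - 1.5*math.log2(3))                 # both ~1.1226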


n.b. pure and mixed source emulations achieve the same maximum key rate, primarily because the minimization over the fractional partition is trivial with only one possible choice.

By Theorem 3.1, the secrecy upper bound is

  C_su := min_{λ∈Λ_{[2]|∅}} max_{P_{X₁X₂}} α(λ, P_{X₁X₂})
   =(a) max_{P_{X₁X₂}} [H(Y|X₁) + H(Y|X₂) − H(Y|X₁X₂)]
   =(b) max_{a,b,c,d∈[0,1]: a+b+c+d=1} [f(a,b,c,d) + f(a,c,b,d) − (b+c)]
   =(c) 2 log 3 − 2 ≈ 1.17   (H.6)

where f(a,b,c,d) := (a+b) h(b/(2(a+b))) + (c+d) h(c/(2(c+d))).
(a) Because there is only one possible fractional partition.
(b) By (H.2) with X₁ and X₂ distributed as in (H.1).
(c) The maximum is achieved at a = d = 1/6 and b = c = 1/3.

ACKNOWLEDGMENT

The authors are very grateful for the detailed reviews and many helpful suggestions.

REFERENCES

[1] C. Chan, publications. http://chungc.net63.net/pub, http://goo.gl/4YZLT.
[2] ——, “Generating secret in a network,” Ph.D. dissertation, Massachusetts Institute of Technology, 2010, see [1].
[3] C. E. Shannon, “Communication theory of secrecy systems,” Bell System Technical Journal, vol. 28, no. 4, pp. 656–715, 1949.
[4] A. D. Wyner, “The wire-tap channel,” Bell System Technical Journal, vol. 54, no. 8, pp. 1355–1387, 1975.
[5] I. Csiszár and J. Körner, “Broadcast channels with confidential messages,” IEEE Transactions on Information Theory, vol. 24, no. 3, pp. 339–348, May 1978.
[6] C. H. Bennett, F. Bessette, G. Brassard, L. Salvail, and J. Smolin, “Experimental quantum cryptography,” J. Cryptol., vol. 5, no. 1, pp. 3–28, Jan. 1992.
[7] U. M. Maurer, “Secret key agreement by public discussion from common information,” IEEE Transactions on Information Theory, vol. 39, no. 3, pp. 733–742, 1993.
[8] R. Ahlswede and I. Csiszár, “Common randomness in information theory and cryptography—Part I: Secret sharing,” IEEE Transactions on Information Theory, vol. 39, no. 4, pp. 1121–1132, Jul. 1993.
[9] I. Csiszár and P. Narayan, “Secrecy capacities for multiple terminals,” IEEE Transactions on Information Theory, vol. 50, no. 12, Dec. 2004.
[10] C. Chan and L. Zheng, “Mutual dependence for secret key agreement,” in Proceedings of 44th Annual Conference on Information Sciences and Systems, 2010, see [1].
[11] C. Chan, “The hidden flow of information,” in 2011 IEEE International Symposium on Information Theory Proceedings (ISIT2011), St. Petersburg, Russia, Jul. 2011, see [1].
[12] ——, “Matroidal undirected network,” submitted to the 2012 Information Theory and Applications Workshop and the 2012 IEEE International Symposium on Information Theory, see [1].
[13] U. M. Maurer, “Perfect cryptographic security from partially independent channels,” in Proc. 23rd ACM Symposium on Theory of Computing, 1991, pp. 561–571.
[14] I. Csiszár and P. Narayan, “Secrecy capacities for multiterminal channel models,” IEEE Transactions on Information Theory, vol. 54, no. 6, pp. 2437–2452, Jun. 2008.
[15] R. W. Yeung, Information Theory and Network Coding. Springer, 2008.
[16] J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, 3rd ed. Princeton University Press, 1953.
[17] I. Csiszár and P. Narayan, “Secrecy generation for multiple input multiple output channel models,” in Proceedings of 2009 IEEE International Symposium on Information Theory, Jun. 2009, pp. 2447–2451.
[18] C. Chan, “Linear perfect secret key agreement,” in 2011 IEEE Information Theory Workshop Proceedings (ITW2011), Paraty, Brazil, Oct. 2011, see [1].

[19] ——, “Delay of linear perfect secret key agreement,” in 49th Annual Allerton Conference on Communication, Control, and Computing, Sep. 2011, pp. 1128–1135, see [1].
[20] ——, “Universal secure network coding by non-linear secret key agreement,” submitted to NETCOD 2012, see [1].
[21] ——, “Universal secure network coding by non-linear precoding,” submitted to ITW 2012, see [1].
[22] C. Nair, B. Prabhakar, and D. Shah, “On entropy for mixtures of discrete and continuous variables,” CoRR, vol. abs/cs/0607075, 2006.
[23] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley-Interscience, 1991.
[24] C. Chan, “Multiterminal secure source coding for a common secret source,” in Forty-Ninth Annual Allerton Conference on Communication, Control, and Computing, Allerton Retreat Center, Monticello, Illinois, Sep. 2011.
[25] ——, “Agreement of a restricted secret key,” see [1].
[26] C. H. Bennett, G. Brassard, C. Crépeau, and U. M. Maurer, “Generalized privacy amplification,” IEEE Transactions on Information Theory, vol. 41, no. 6, pp. 1915–1923, Nov. 1995.
[27] M. Hayashi, “Exponential decreasing rate of leaked information in universal random privacy amplification,” IEEE Transactions on Information Theory, vol. 57, no. 6, pp. 3989–4001, Jun. 2011.
[28] S. Nitinawarat and P. Narayan, “Perfect omniscience, perfect secrecy, and Steiner tree packing,” IEEE Transactions on Information Theory, vol. 56, no. 12, pp. 6490–6500, Dec. 2010.
[29] C. H. Bennett, G. Brassard, and J.-M. Robert, “Privacy amplification by public discussion,” SIAM Journal on Computing, vol. 17, no. 2, pp. 210–229, Apr. 1988.
[30] A. Gohari and V. Anantharam, “Information-theoretic key agreement of multiple terminals—Part I,” IEEE Transactions on Information Theory, vol. 56, no. 8, pp. 3973–3996, Aug. 2010.
[31] ——, “Information-theoretic key agreement of multiple terminals—Part II: Channel model,” IEEE Transactions on Information Theory, vol. 56, no. 8, pp. 3997–4010, Aug. 2010.
[32] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. Akadémiai Kiadó, Budapest, 1981.
[33] C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, 1948.
[34] M. Sion, “On general minimax theorems,” Pacific Journal of Mathematics, vol. 8, no. 1, pp. 171–176, 1958.
[35] A. S. Avestimehr, S. N. Diggavi, and D. N. C. Tse, “Wireless network information flow: A deterministic approach,” CoRR, vol. abs/0906.5394, 2009.
[36] P. Gács and J. Körner, “Common information is far less than mutual information,” Problems of Control and Information Theory, vol. 2, no. 2, pp. 149–162, Feb. 1972.
[37] F. Jones, Lebesgue Integration on Euclidean Space, revised ed. Jones and Bartlett Publishers, 2000.
[38] D. N. Politis, “On the entropy of a mixture distribution,” Department of Statistics, Purdue University, Tech. Rep. 91-67, Nov. 1991.
[39] M. Madiman and P. Tetali, “Information inequalities for joint distributions, with interpretations and applications,” IEEE Transactions on Information Theory, 2008, to appear.
[40] F. Terkelsen, “Some minimax theorems,” Mathematica Scandinavica, vol. 31, pp. 405–413, 1972.
[41] H. G. Eggleston, Convexity. Cambridge University Press, 1966.
[42] G. B. Dantzig and M. N. Thapa, Linear Programming 1: Introduction. Springer-Verlag New York, 1997–2003.
[43] B. Donckels, “Matlab implementation of shuffled complex evolution,” http://biomath.ugent.be/~brecht/download/SCE.zip, 2006.
