
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2017.2707540, IEEE Transactions on Mobile Computing

On Uplink Virtual MIMO with Device Relaying Cooperation Enforcement in 5G Networks

Mehdi Naderi Soorki, Mohammad Hossein Manshaei, Behrouz Maham, and Hossein Saidi

Abstract—In this paper, a novel protocol is proposed in which mobile terminals (MTs) form a virtual multiple-input multiple-output (MIMO) uplink by means of device relaying on the device-to-device (D2D) tier of a 5G cellular network. A competitive scenario is considered in which each selfish MT tries to transmit its own data and avoid relaying others' data in the formed virtual MIMO. The main focus is to design an incentive for MTs to form the virtual MIMO and cooperate in relaying others' data. A direct revelation on-line mechanism for the base station (BS) is designed to assist in forming a stable virtual MIMO. A self-punishment mechanism is also proposed in which MTs autonomously punish malicious MTs that do not cooperate in relaying. We prove that the designed direct revelation on-line mechanism and the proposed self-punishment mechanism enforce the all-cooperation (all-C) profile as a Nash equilibrium (NE) under uncertainty about the presence of MTs in the formed virtual MIMO. Our simulation results confirm that the proposed protocol, even in the competitive scenario, increases the bit rate and decreases power consumption at the same time. The proposed protocol can improve the energy efficiency by up to 35% compared to the non-cooperative case, i.e., the single-input multiple-output (SIMO) uplink. Moreover, if multi-user MIMO transmission is used for the uplink medium access layer, the proposed protocol can improve the energy efficiency by up to 42% compared to the SIMO uplink with multi-user MIMO transmission. Under the proposed Online Collaborative Virtual MIMO (OCVM) protocol with Shapley value fairness, the price of anarchy reaches 0.78 in the competitive scenario. In addition, the energy efficiency improvement of the proposed protocol is almost robust to the preferences of MTs. Simulation results show that if the BS employs our on-line mechanism and MTs autonomously punish malicious MTs, the malicious MTs cannot gain by defecting from relaying other MTs' data.

Index Terms—5G cellular network, virtual MIMO, D2D relaying, coalitional game, mechanism design.

1 INTRODUCTION

Multi-antenna systems have been proposed as an effective way to improve both capacity and energy efficiency in 5G cellular networks. It has been shown in [1] that MIMO systems require less transmit power than single-antenna systems to achieve the same data rate. Although the BS can be equipped with multiple antennas, MTs cannot always support more than a single antenna due to physical constraints [2], [3]. Fixed terminal relaying brings improvements in cellular systems, but this feature will be extended by the full potential of cooperation through the implementation of device relaying in the D2D tier of the multi-tier 5G network [4], [5]. Device relaying makes it possible for devices in a network to function as relays for each other [4]. In the uplink, there is one transmitting antenna for each MT and two or more receiving antennas for the BS. Therefore, if we allow single-antenna MTs to cooperate with each other on information transmission over the same sub-channel, a virtual MIMO link can be constructed by device relaying [1], [2]. Consequently, not only do MTs increase their data rate, but the BS can also save resources in terms of sub-channels.

Designing an efficient virtual MIMO uplink by device relaying faces numerous technical challenges in the D2D tier of the 5G cellular network, such as resource allocation and interference management, privacy of user data, and persuading devices to participate in this type of communication [4], [6]. There exist distributed algorithms that allow single-antenna users to autonomously decide when and with whom to cooperate to form virtual MIMO, such as those in [7]–[9]. In [7], Saad et al. consider power consumption as a cost of cooperation, model the problem of virtual MIMO formation as a coalition formation game, and use a merge-and-split algorithm to find the formation of MIMO groups. Fairness criteria are also considered to determine the time period of acting as a source in each formed coalition. In [8], Ramirez et al. propose a stable marriage framework for distributed virtual MIMO coalition formation in which relays propose and MTs select their desired relay nodes. They assume that, in addition to the mobile users and the BS, there exist some relay nodes in the network model that relay MTs' data without any payment. They show that under certain conditions forming a virtual MIMO channel does not provide an energy-efficient solution, especially when the power consumption of the D2D cooperation link between relay node and MT is high. In [9], a non-transferable utility (NTU) game is developed to analyze the behavior of vehicles and roadside units that form virtual MIMO in vehicular networks. In each coalition formed by vehicles and roadside units, the vehicles share the channel with each other in TDMA mode, and they need to pay for the relaying of the roadside units. In [10], a new group-based pairing method called utility group pairing (UGP) is proposed to form virtual MIMO for data transmission by MTs. It considers SNR and channel state to form each group. Then, it uses round-robin scheduling within each formed pair to increase the throughput for higher-SNR users. Moreover, a scheduling algorithm for the LTE uplink that exploits virtual MIMO is proposed in [11]. The objective of the algorithm is to improve the sector throughput by assigning appropriate resource blocks (RBs).

The existing works that consider full cooperation of MTs do not account for the selfish behavior of MTs when relaying other MTs' data (e.g., [7], [8], [10]). Some previous works, such as [10], consider only the total throughput of the network and do not consider the power consumption of MTs in forming virtual MIMO. Other works, such as [8], that assume relaying nodes under the control of the BS across the network do not use full device relaying of MTs in designing the virtual MIMO formation. Moreover, any pricing mechanism such as the one suggested in [9] to provide an incentive for relaying other MTs' signals increases the signaling overhead, especially when the mobility patterns of MTs are not known. In contrast to these existing works, our aim in this paper is to consider unknown mobility patterns and selfish behavior of MTs and then apply full device relaying in the virtual MIMO formation. In addition, we consider both throughput and energy consumption as the preferences of MTs, and we use self-punishment mechanisms to provide an incentive for relaying.

The main contribution of this paper is a novel Online Collaborative Virtual MIMO (OCVM) protocol for persuading selfish MTs to form an uplink virtual MIMO in the D2D tier and for enforcing cooperation among them to relay others' data. We model the problem using a game-theoretical framework, where we consider both the bit rate and the battery consumption of MTs as their preferences. We design a direct revelation on-line mechanism for the OCVM protocol, in which MTs should reveal their duration of presence in the virtual MIMO coalition. The goal of the on-line mechanism is to increase the total data uploaded by MTs, persuade them to participate in virtual MIMO, and guarantee fairness among them. After the virtual MIMO is formed, we design a self-punishment mechanism that forces MTs to truthfully reveal their duration of presence in the coalition and to cooperate in the formed virtual MIMO. We will show that the proposed OCVM protocol provides the all-cooperation (all-C) profile as a Nash equilibrium (NE) under uncertainty about the MTs' duration of presence in the coalition. The main contributions of this paper are summarized as follows:

• We use both coalitional and noncooperative games to design a novel virtual MIMO formation by autonomous device relaying. We apply the coalitional game to find the formed cooperative virtual MIMO groups, and we use the noncooperative game to design the self-punishment mechanism in each coalition.

• Without a priori knowledge of the mobility pattern of MTs, we use the uncertainty about the MTs' duration of presence in the coalition to motivate the selfish MTs to form stable coalitions and relay others' data in each coalition. To the best of our knowledge, this is the first time that the unknown mobility pattern of MTs is used to enforce cooperation between MTs in wireless applications.

• We design an online and truthful mechanism that forces the MTs to truthfully reveal their duration of presence in the coalitions to the BS. To the best of our knowledge, this is the first time that the BS is involved as a mediator to form virtual MIMO by selfish MTs under uncertainty about the mobility pattern and to provide them with the capability of self-punishment.

The OCVM protocol can be used in the D2D tier of a 5G cellular network, especially in dense areas where a large number of MTs are located within a small area such as a classroom, a cafe, an airport, or a mall. In this case, the MTs are within short range of each other and can cooperate to upload their data.

The rest of this paper is organized as follows. In Section 2, we present the system model for cooperative virtual MIMO formation under uncertainty by selfish mobile terminals and then formulate the MTs' preferences. In Section 3, we discuss cooperation under uncertainty and analyze the game in the formed OCVM coalition. In Section 4, we describe the proposed OCVM protocol in detail. The complexity and signaling of the proposed OCVM protocol are discussed in Section 5. Simulation results are presented in Section 6, followed by conclusions and future directions in Section 7.

• M. Naderi Soorki, M.H. Manshaei and H. Saidi are with the Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156-83111, Iran. E-mails: [email protected]; {manshaei, hsaidi}@cc.iut.ac.ir
• B. Maham is with the School of Engineering, Nazarbayev University, Astana, Kazakhstan. Email: [email protected].

Manuscript received 14 Sep. 2015; revised 17 Sep. 2016; second revised 28 Feb. 2017, accepted 8 May 2017.

1536-1233 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

2 SYSTEM MODEL

Consider a cellular network composed of one multiple-antenna base station (BS) and a set $\mathcal{K}$ of $K$ MTs that seek to form OCVM coalitions to increase their uplink data rates. Each MT$_k$ is selfish and tries to increase its bit rate and decrease its power consumption regardless of the other members of the coalition. All MTs can communicate on two communication links: a short-range D2D link and a long-range cellular link. The MTs use the D2D link to relay data among themselves in the D2D tier. Each MT, which is equipped with a single antenna, transmits its data on the cellular uplink to the multiple-antenna BS.

In this study, we consider an infinite set of discrete time slots $T_{set}$, where each time slot $t$ has a fixed duration $T$. The users' locations and their duration of presence at those locations are generally unpredictable for the BS. However, each user can estimate how long it wants to stay at a specific location near other MTs, especially in scenarios where mobile users move at low speed, such as coffee shops, airports, offices, and campuses. Thus, we define the type of MT$_k$, which is known to each MT, as an ordered pair $\tau_k = (a_k, d_k) \in T_{set} \times T_{set}$, where $a_k$ and $d_k$ are, respectively, the arrival and departure times of MT$_k$ in the OCVM coalition.

Let $G(t) = \{G_1, G_2, \ldots, G_{|G(t)|}\}$ denote the set of coalitions autonomously formed by single-antenna MTs at time slot $t$. Coalition $G_i$ has a virtual $A^t_{Tx} \times A_{Rx}$ MIMO uplink, in which $A_{Rx}$ is the number of antennas at the BS and $A^t_{Tx} = |G_i|$ is the number of MTs in coalition $G_i$ at time slot $t$. The system bandwidth, $B$ (Hz), is divided into $X$ RBs, and each RB has $I_{sub}$ subcarriers. In the uplink, we use SC-FDMA as in LTE-Advanced. We assume that one RB is assigned to each OCVM coalition to avoid co-channel interference (CCI). We consider that the MTs share the bandwidth of the MIMO uplink among themselves using time division multiple access (TDMA) after forming an OCVM coalition. Thus, during each time slot, each MT$_k$ in each coalition is selected as the source for a specific period $\theta_k T$, where $\sum_{k \in G_i} \theta_k = 1$. The coefficients $\theta_k$ and the order of source selection are determined by the OCVM protocol. Then, the source transmits its data on the cellular link to the BS and simultaneously multicasts it on the D2D links to the other MTs in the coalition in one hop. On the cellular link, all other MTs transmit the source data to the BS to form a virtual MIMO channel. For simultaneous transmission of the source data, all MTs in the coalition could use distributed space-time codes, which are not our main focus in this paper. Note that if there is no coalition, the MTs individually transmit their data to the BS on their own cellular links, and a $1 \times A_{Rx}$ SIMO uplink is formed for each of them. The symbols used in our model are listed in Table 1.

Fig. 1 shows an illustrative example of the OCVM formation. In Fig. 1, MT$_1$, MT$_2$ and MT$_3$ can communicate over the D2D links. If they have enough incentive to relay each other's data at time slot $t$, one multi-member coalition $\{MT_1, MT_2, MT_3\}$ will be formed. Fig. 1 shows the moment at which MT$_2$ is selected as the source and MT$_1$ and MT$_3$ are relaying its data. Because of the mobility of MTs, a new MT may arrive at the OCVM coalition, such as MT$_8$ in Fig. 1. In addition, an MT may depart from the coalition, like MT$_7$ in Fig. 1.

Fig. 2 shows the revealed types of the MTs in the scenario of Fig. 1. As one can see, during a given time slot $i$, all three MTs are present in the coalition. MT$_2$ will depart from the coalition in time slot $i + 1$. Since MT$_1$ and MT$_3$ do not know the type of MT$_2$, if they believe that MT$_2$ is present in the next time slot, they may relay MT$_2$'s data in the current time slot $i$.

As depicted in Fig. 3, at time slot $t = i$, all of the MTs in the OCVM coalition are selected as the source for a specific duration. Therefore, the proposed OCVM protocol should allocate $\theta_1 T$, $\theta_2 T$ and $\theta_3 T$ to MT$_1$, MT$_2$, and MT$_3$, respectively. Note that $\theta_1 + \theta_2 + \theta_3 = 1$ and that each of MT$_1$, MT$_2$, and MT$_3$ is selected as the source during $T$. Whenever an MT is selected as the source, the other members can either cooperate in or defect from relaying its data.

Let us assume that each MT wants to participate in an OCVM coalition to increase its data rate and decrease its battery consumption simultaneously. We formulate the data rate and battery consumption of MTs and their payoff function in our model as follows.

Fig. 1. An illustration of OCVM coalition protocol.

Fig. 2. An illustration of Finitely and stochastically repeated game (FSRG) during the time slots.

Fig. 3. An illustration of finitely dynamic relaying game (FDRG) during one time slot.

TABLE 1
List of Symbols.

Symbol               Definition
OCVM                 Online Collaborative Virtual MIMO
FDRG                 Finite Dynamic Relaying Game
FSRG                 Finite Stochastic Repeated Game
$\mathcal{K}$        The set of all MTs
$\tau_k$             The type of MT$_k$
$a_k$                Arrival time of MT$_k$
$d_k$                Departure time of MT$_k$
$G_i$                A coalition
$R^k_{SIMO}$         Data rate on the SIMO uplink
$R_{D,G_i}$          The multicast bit rate on the D2D link of $G_i$
$R^{G_i}_{MIMO}$     Data rate on the virtual MIMO uplink
$E^{k,t}_{SIMO}$     Energy consumed by MT$_k$ on the SIMO uplink at time slot $t$
$E^{k,t}_{MIMO}$     Energy consumed by MT$_k$ on the MIMO uplink at time slot $t$
$u^k_{SIMO}$         Payoff of MT$_k$ outside the OCVM
$u^k_{MIMO}$         Payoff of MT$_k$ in the OCVM
$q^t_k$              Net achieved benefit of MT$_k$ at time slot $t$
$y^t_k$              Payment of MT$_k$ at time slot $t$
$\chi^t_{BS}$        Base station mechanism
$v_k(t)$             Fairness coefficient
$P_{GT}$             The grim-trigger punishment mechanism
$p^t_{kj}$           Suspicion of MT$_k$ about the presence of MT$_j$ at time slot $t$
$p^{t,*}_{kj}$       Minimum probability of presence
$\phi^i_m$           Shapley value of MT$_m$ at time slot $t$
$\alpha_m$           The preference of MT$_m$ for the total consumed energy
$\beta_m$            The preference of MT$_m$ for the average bit rate
$n^*$                The last scheduled source in each stage of FDRG

2.1 Data Rate Calculation

In our OCVM model, there are three different links: the SIMO uplink, the virtual MIMO uplink, and the multicast D2D links.

The SIMO uplink is the link between a single-member coalition, i.e., one MT, and the BS. The channel gain over the cellular link between MT$_k$ and the BS is modeled as a SIMO system represented by a vector $H^{k,z}_{SIMO} = [h^k_{i,z}]_{A_{Rx} \times 1}$ for each subcarrier $z$, where $h^k_{i,z}$ is the channel gain over subcarrier $z$ between the antenna of MT$_k$ and the $i$th antenna of the BS. It is given by [12], [13]:

$$ h^k_{i,z} = \frac{\kappa}{d^{\nu}_{kb}} \, 10^{\xi^z_k / 10} \, a^z_k, \tag{1} $$

where the first factor captures propagation loss, $\kappa$ is a path-loss constant that depends on the antenna characteristics and the wireless environment, $d_{kb}$ is the distance between MT$_k$ and the BS, and $\nu$ is the path-loss exponent. It is assumed that the antennas of the BS are close enough to each other that they have almost the same distance from the antenna of MT$_k$. Thus, the path loss remains the same for different subcarriers. The second factor captures the log-normal shadowing, $\xi^z_k$, of MT$_k$ over subcarrier $z$, and the last factor corresponds to the Rayleigh fading of MT$_k$ over subcarrier $z$ (generally considered with a Rayleigh parameter $a^z_k$ such that $E[(a^z_k)^2] = 1$). Assuming SC-FDMA in the uplink, the bit rate on the SIMO uplink between MT$_k$ and the BS is given by [12]:

$$ R^k_{SIMO} = \sum_{z \in X(k)} B_z \log_2 \det\left[ 1 + \frac{P^{k,z}_{Tx}}{\sigma^2 + I_{int}} \left(H^{k,z}_{SIMO}\right)^T H^{k,z}_{SIMO} \right], \tag{2} $$

where $B_z$ is the bandwidth of each subcarrier, which equals $\frac{B}{X(k) I_{sub}}$, and $X(k)$ is the number of RBs that the BS assigns to MT$_k$ on the cellular link. $P^{k,z}_{Tx}$, which is allocated by the power allocation mechanism at the BS, is the transmission power of MT$_k$ on subcarrier $z$. We consider that the BS uses the same sub-channels for both the D2D and cellular communications, which is called underlaid D2D communication [5]. Thus, $I_{int}$ is equal to $\sum_{j \neq k, z \in X(k)} P^{j,z}_{Tx} \left(H^{j,z}_{SIMO}\right)^T H^{j,z}_{SIMO}$, which is the interference from the D2D links on subcarrier $z$ of the cellular link.

Multi-user MIMO (MU-MIMO) can leverage multiple MTs as spatially distributed transmission resources [14]. In the MU-MIMO MAC, multiple MTs can transmit data to the BS on the same sub-channel and in the same time slot, but with different antenna indices [15]. The SIMO uplink with MU-MIMO transmission is the link between one MT and the BS when MU-MIMO transmission is used in the MAC layer. In the SIMO uplink with MU-MIMO transmission, each MT has more than one transmit antenna but only one transmit RF chain. Each MT selects one of the transmit antennas and transmits a symbol from a modulation alphabet on the selected antenna [15]. It is possible to increase the number of sub-channels assigned to each MT under MU-MIMO transmission because the resources are spatially divided among MTs. Thus, the bit rate of the SIMO uplink with MU-MIMO transmission can be higher than the bit rate of the SIMO uplink.

The virtual MIMO uplink is the link between a multi-member coalition and the BS. At time slot $t$, the channel over the cellular link between MT$_k$, selected as the source of coalition $G_i$, and the BS is modeled as a virtual MIMO system. It is represented by a matrix $H^{G_i,z}_{MIMO} = [h^k_{i,z}]_{A_{Rx} \times A_{Tx}}$ for subcarrier $z$, where $h^k_{i,z}$ is the channel gain over subcarrier $z$ between the antenna of MT$_k$ and the $i$th antenna of the BS. $H^{m,z}_{SIMO}$, given in (1), is the $m$th column of $H^{G_i,z}_{MIMO}$; it is the channel gain vector over subcarrier $z$ between the antenna of MT$_m$ in $G_i$ and the $A_{Rx}$ antennas of the BS. In this case, although the channel is unknown to each MT as a transmitter, the BS knows the channel information by using pilots. Thus, the BS can optimize the bit rate on the virtual MIMO uplink of each coalition by applying an optimum power allocation such as the water-filling method. The optimum bit rate at time slot $t$ is given by [12], [13]:

$$ R^{G_i}_{MIMO} = \sum_{z \in X(G_i)} B_z \times \max_{W^z} \log_2 \det\left( I_{A_{Rx}} + \frac{P^z_{Tx}}{\sigma^2 + I_{int}} H^{G_i,z}_{MIMO} W^z \left(H^{G_i,z}_{MIMO}\right)^T \right), \tag{3} $$

where $X(G_i)$ is the number of resource blocks that the BS assigns to coalition $G_i$. The weight matrix $W^z = [w^z_{ij}]_{A_{Tx} \times A_{Tx}}$ is a diagonal matrix in which each diagonal element, with $\forall m : w^z_{mm} \leq 1$, indicates the optimum weight of the cellular link transmission power of the $m$th member of $G_i$ on subcarrier $z$. $P^z_{Tx}$ is the maximum transmission power of each MT on subcarrier $z$. If the BS does not control the power of the MTs, each diagonal element of the weight matrix is one, i.e., uniform power allocation [12]. We assume an operator-controlled link establishment for device-tier communications [16], where the BS handles resource allocation (channel assignment and power allocation) for the D2D links in the D2D tier. $I_{int}$ is equal to $\sum_{j \notin G_i, z \in X(G_i)} P^{j,z}_{Tx} \left(H^{j,z}_{SIMO}\right)^T H^{j,z}_{SIMO}$. Following operator-controlled link establishment, we assume that the BS allocates the power of the D2D transmitters so as to restrict the level of interference. This assumption maps to a conservative strategy in which the MTs in each coalition assume the worst interference from other MTs under the BS power allocation. Thus, within the restricted level of interference, the data rate on the virtual MIMO uplink of each coalition does not depend on the transmission power of other coalitions. This assumption is in line with existing works such as the jamming games in [17], in which it is assumed that all other players jam a coalition in a cognitive radio network.

The virtual MIMO uplink with MU-MIMO transmission is the link between a multi-member coalition and the BS when MU-MIMO transmission is used in the MAC layer. In the virtual MIMO uplink with MU-MIMO, since all MTs in the same coalition transmit the data of the source, all MTs can select the same symbol from a modulation alphabet for spatial modulation [15]. Since the resources are spatially divided among coalitions, it is possible to increase the number of sub-channels assigned to each coalition under MU-MIMO transmission.
Consequently, the bit rate of the virtual MIMO uplink with MU-MIMO transmission can be higher than the bit rate of the virtual MIMO uplink.

The multicast D2D links are the D2D links between the members of a multi-member coalition. The source of each coalition multicasts its data on the multicast D2D links to the other members. The bit rate between two MTs $k$ and $j$ on the D2D link is given by [12], [13]:

$$ R_{D,kj} = \sum_{z \in X(Q)} B_z \log_2\left( 1 + \frac{P^{k,z}_{Tx} h^z_{kj}}{I^Q_{int} + \sigma^2} \right), \tag{4} $$

where $X(Q)$ is the number of RBs that the BS assigns to the D2D tier. $I^Q_{int}$ is the interference that MT$_j$ experiences on each subcarrier $z$ of the D2D link, and it is equal to $\sum_{i \neq k} P^{i,z}_{Tx} h^z_{ij}$. To calculate $h_{kj}$, we follow the same model as (1) with $d_{kj}$, the distance between MT$_k$ and MT$_j$.

In the OCVM protocol, the one-hop relay process in the D2D network has two phases [18]: broadcast transmission and cooperative transmission. In the broadcast transmission phase, the source in $G_i$ multicasts its data to the other MTs in the coalition. Then, in the cooperative transmission phase, multiple MTs simultaneously transmit the received data of the source to the BS over the cellular links. This cooperative relaying is called the broadcast-then-cooperate protocol [18]. In the broadcast transmission phase, the bit rate on the D2D link is limited by the worst channel state between the MTs in $G_i$, because the selected source has to multicast its data to the other MTs on the D2D links. Consequently, the multicast bit rate on the D2D link of $G_i$ is given by:

$$ R_{D,G_i} = \min_{\forall (i,j) \in G_i} R_{D,ij}. \tag{5} $$

It is essential that $\forall k \in G_i : R_{D,G_i} \geq R^k_{SIMO}$. This one-hop rate condition is essential for $G_i$ in OCVM because it guarantees that all usual members can relay the source data without buffering its data or decreasing its data rate.
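The rate expressions in (1), (2), (4), and (5) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes real-valued channel gains, evaluates a single subcarrier per call (the sums over $z$ would simply add these values), and all function and parameter names are hypothetical. For a single-antenna MT, $H^T H$ in (2) is a scalar, namely the sum of the squared gains toward the BS antennas, so the determinant reduces to an ordinary logarithm.

```python
import math

def channel_gain(kappa, dist, nu, shadow_db, fading):
    """Eq. (1): path loss times log-normal shadowing times Rayleigh factor."""
    return (kappa / dist ** nu) * 10 ** (shadow_db / 10) * fading

def simo_rate(bw, p_tx, gains, noise, interference=0.0):
    """Eq. (2) on one subcarrier; `gains` holds one gain per BS antenna."""
    # For a column vector H, H^T H is the sum of squared entries.
    snr = p_tx * sum(g * g for g in gains) / (noise + interference)
    return bw * math.log2(1 + snr)

def d2d_rate(bw, p_tx, gain, noise, interference=0.0):
    """Eq. (4) on one subcarrier for a single D2D pair."""
    return bw * math.log2(1 + p_tx * gain / (interference + noise))

def multicast_rate(pair_rates):
    """Eq. (5): the multicast rate is limited by the worst D2D pair."""
    return min(pair_rates.values())

def one_hop_ok(r_multicast, simo_rates):
    """One-hop rate condition: R_D,Gi >= R_SIMO for every member."""
    return all(r_multicast >= r for r in simo_rates)

# Illustrative values only (not taken from the paper).
pairs = {(1, 2): 3.0, (1, 3): 1.5, (2, 3): 2.0}
print(multicast_rate(pairs), one_hop_ok(multicast_rate(pairs), [1.0, 1.2]))
```

Because of the minimum in (5), a single weak D2D pair is enough to make a coalition violate the one-hop rate condition.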

2.2 Energy Consumption Calculation

According to the models proposed in [19] and [20], the power consumption of mobile terminals connected to an LTE cellular network is affected by the transmit (Tx) and receive (Rx) power levels as well as by the modulation and coding scheme and the data rate. The cellular power consumption of an MT is mainly divided between two functional blocks [19]: Tx and Rx baseband (BB) and radio frequency (RF). The former defines the power consumption as a function of the bit rates, while the latter is affected by the Tx or Rx power levels of the RF subsystem. Generally, the power consumption of an MT is given by [19]:

$$ P = P_{on} + m_{Rx} \times \left( P_{Rx} + P_{RxBB}(R_{Rx}) + P_{RxRF}(P_{Rx}) \right) + m_{Tx} \left( P_{Tx} + P_{TxBB}(R_{Tx}) + P_{TxRF}(P_{Tx}) \right), \tag{6} $$

where $m_{Rx}$ and $m_{Tx}$ are binary variables describing whether the MT is receiving or transmitting. The constants $P_{on}$, $P_{Rx}$ and $P_{Tx}$ are the power consumed by the cellular subsystem, the receiver, and the transmitter, respectively. $P_{RxBB}$, $P_{TxBB}$, $P_{RxRF}$, and $P_{TxRF}$ are linear functions of rate and power given in Table 4 of [19]. In the following, we calculate the energy consumption of MTs in the different modes.

The total energy consumed by MT$_k$ in SIMO transmission mode in each time slot, $E^{k,t}_{SIMO}$, depends on the sum of the transmitted power over the subcarriers. This is related to the bit rate on the SIMO cellular link and the duration of transmission. Thus, at each time slot $t$, we have:

$$ E^{k,t}_{SIMO} = \sum_{z \in X(k)} P^z_{Tx,C} \, T, \tag{7} $$

where $P^z_{Tx,C}$ is calculated by (6) for $m_{Rx} = 0$ and $m_{Tx} = 1$, $R_{Tx} = R^k_{SIMO}$ is the bit rate on the cellular link, and $P_{Tx} = P^{k,z}_{Tx}$ is the transmission power level of MT$_k$ on subcarrier $z$.

The total energy consumed by MT$_k$ to form a virtual MIMO uplink in each time slot, $E^{k,t}_{MIMO}$, is a function of the energy consumed on the multicast D2D links and on the cellular link. Both are related to the durations in which MT$_k$ acts as a source and as a usual member. Assume that MT$_k$ is selected as the source for $\theta^t_k T$ and as a usual member for $\gamma^t_k T$ at time slot $t$ (for all $k$ we have $\gamma^t_k + \theta^t_k \leq 1$). Thus, the total energy consumed by MT$_k$ at each time slot $t$ can be calculated as follows:

$$ E^{k,t}_{MIMO} = \left( \sum_{z \in X(G_i)} P^z_{Tx,C} + \sum_{z \in X(Q)} P^z_{Tx,D} \right) \theta^t_k T + \left( \sum_{z \in X(G_i)} P^z_{Tx,C} + \sum_{z \in X(Q)} P^z_{Rx,D} \right) \gamma^t_k T. \tag{8} $$

The first term indicates the amount of energy consumed when MT$_k$ acts as the source during $\theta^t_k T$; it captures the transmission energy on both the cellular and D2D links. The second term indicates the amount of energy consumed by MT$_k$ when it relays source data during $\gamma^t_k T$. $P^z_{Tx,C}$ is given by (6) for $m_{Rx} = 0$, $m_{Tx} = 1$, and $P_{Tx} = P^{k,z}_{Tx}$, where $R_{Tx}$ is the bit rate on the virtual MIMO cellular link, $R^k_{MIMO}$. $P^z_{Tx,D}$ is given by (6) when $m_{Rx} = 0$, $m_{Tx} = 1$ and $P_{Tx} = P^{k,z}_{Tx}$, where $R_{Tx}$ is the transmission rate on the multicast D2D link, $R_{D,G_i}$. $P^z_{Rx,D}$ is given by (6) when $m_{Rx} = 1$, $m_{Tx} = 0$, $R_{Rx} = R_{D,G_i}$, and $P_{Rx}$ is the power consumed by MT$_k$ on subcarrier $z$ to receive data, $P^{k,z}_{Rx}$.

2.3 Payoff of MTs

We assume that each MT has two main preferences: 1) the amount of its data that is transmitted to the BS and 2) the total energy that it consumes for transmitting the data. We use the following linear function, which is also applied in [9], to represent the data and power consumption metrics in a single value. Let $\beta_k$ and $\alpha_k$ represent the preferences of MT$_k$ for the size of the transmitted data and the amount of consumed energy, respectively. These parameters are also used to adjust the scales and units of the data size (in bits) and the battery consumption (in joules) to form a single-valued payoff function. As a result, if MT$_k$ does not participate in the OCVM coalition, its total payoff is determined as follows:

$$ u^k_{SIMO} = \sum_{t = a_k}^{d_k} \left( \beta_k R^k_{SIMO} T - \alpha_k E^{k,t}_{SIMO} \right). \tag{9} $$
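The energy model in (6) to (8) and the SIMO payoff in (9) can be sketched numerically as follows. This is a hedged illustration rather than the authors' code: the per-subcarrier powers $P^z_{Tx,C}$, $P^z_{Tx,D}$ and $P^z_{Rx,D}$ are assumed to be already evaluated through (6) and passed in as plain lists, the baseband and RF terms are assumed to be pre-computed numbers (the linear coefficients of [19] are not reproduced here), and every function name is hypothetical.

```python
def device_power(p_on, rx=None, tx=None):
    """Eq. (6). `rx` and `tx` are None when the branch is inactive
    (m_Rx = 0 or m_Tx = 0), otherwise a tuple
    (constant power, baseband term, RF term), already evaluated."""
    total = p_on
    for branch in (rx, tx):
        if branch is not None:
            total += sum(branch)
    return total

def energy_simo(p_tx_c, slot):
    """Eq. (7): per-subcarrier cellular Tx powers summed over one slot T."""
    return sum(p_tx_c) * slot

def energy_mimo(p_tx_c, p_tx_d, p_rx_d, theta, gamma, slot):
    """Eq. (8): energy as source (share theta) plus as relay (share gamma)."""
    if theta + gamma > 1:
        raise ValueError("theta + gamma must not exceed 1")
    as_source = (sum(p_tx_c) + sum(p_tx_d)) * theta * slot
    as_relay = (sum(p_tx_c) + sum(p_rx_d)) * gamma * slot
    return as_source + as_relay

def payoff_simo(beta, alpha, rate, energies, slot):
    """Eq. (9): `energies` holds E_SIMO for each slot from a_k to d_k."""
    return sum(beta * rate * slot - alpha * e for e in energies)

# Illustrative values only (not taken from the paper).
print(energy_mimo([1.0], [0.5], [0.25], theta=0.5, gamma=0.5, slot=2.0))
```

Passing the powers as lists keeps the sums over $X(k)$, $X(G_i)$ and $X(Q)$ explicit without fixing a particular resource-block layout.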

Let $\theta_k T$ and $\gamma_k T$ be, respectively, the times during which MT$_k$ acts as source and as relay during $T$. In order to calculate the total payoff of MT$_k$ in the OCVM coalition, we define two concepts: net achieved benefit and payment.

Definition 1. At each time slot, the net achieved benefit that the other members of OCVM deliver to MT$_k$ is given by:

$$ q^t_k = \left( \beta_k R^{G_i}_{MIMO} - \alpha_k \left( \sum_{z \in X(G_i)} P^z_{Tx,C} + \sum_{z \in X(Q)} P^z_{Tx,D} \right) \right) \theta^t_k T. \tag{10} $$

It represents the total data that MT$_k$ transmits with the help of the other members minus the total battery consumption during the time in which it is selected as the source.

Definition 2. At each time slot, the cost that MT$_k$ should pay as a payment to the other members of OCVM is given by:

$$ y^t_k = \alpha_k \left( \sum_{z \in X(G_i)} P^z_{Tx,C} + \sum_{z \in X(Q)} P^z_{Rx,D} \right) \gamma^t_k T. \tag{11} $$

It indicates the total energy consumed by MT$_k$ to relay other sources.

At each time slot $t$, the payoff is calculated by $u^{k,t}_{MIMO} = \beta_k R^{G_i}_{MIMO} \theta_k T - \alpha_k E^{k,t}_{MIMO}$. However, according to (10) and (11), it can be rewritten as $u^{k,t}_{MIMO} = q^t_k - y^t_k$. Therefore, the total payoff of MT$_k$ is given by:

$$ u^k_{MIMO} = \sum_{t = a_k}^{d_k} \left( q^t_k - y^t_k \right). \tag{12} $$
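As a quick numerical check of Definitions 1 and 2, the sketch below evaluates (10), (11), and (12) and confirms that $q^t_k - y^t_k$ matches $\beta_k R^{G_i}_{MIMO} \theta_k T - \alpha_k E^{k,t}_{MIMO}$ with $E^{k,t}_{MIMO}$ taken from (8). The function names and the numbers are illustrative assumptions, and the per-subcarrier powers are assumed to be given as lists.

```python
def net_benefit(beta, alpha, r_mimo, p_tx_c, p_tx_d, theta, slot):
    """Eq. (10): data value minus energy cost while acting as the source."""
    return (beta * r_mimo - alpha * (sum(p_tx_c) + sum(p_tx_d))) * theta * slot

def payment(alpha, p_tx_c, p_rx_d, gamma, slot):
    """Eq. (11): energy cost of relaying other sources."""
    return alpha * (sum(p_tx_c) + sum(p_rx_d)) * gamma * slot

def payoff_mimo(per_slot):
    """Eq. (12): `per_slot` is a list of (q, y) pairs for t = a_k .. d_k."""
    return sum(q - y for q, y in per_slot)

# Illustrative values only (not taken from the paper).
beta, alpha, r_mimo, slot = 1.0, 2.0, 10.0, 2.0
p_tx_c, p_tx_d, p_rx_d = [1.0], [0.5], [0.25]
theta, gamma = 0.5, 0.5

q = net_benefit(beta, alpha, r_mimo, p_tx_c, p_tx_d, theta, slot)
y = payment(alpha, p_tx_c, p_rx_d, gamma, slot)

# Energy from Eq. (8), written out inline for the consistency check.
e_mimo = ((sum(p_tx_c) + sum(p_tx_d)) * theta
          + (sum(p_tx_c) + sum(p_rx_d)) * gamma) * slot
direct = beta * r_mimo * theta * slot - alpha * e_mimo

print(q, y, q - y, direct)
```

The split into $q^t_k$ and $y^t_k$ separates what an MT gains while it is the source from what it spends while it relays for others.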

At each time slot $t$, MT$_k$ needs to consume some energy for relaying other members' data, while the other members of the OCVM coalition increase the uplink rate of MT$_k$ by relaying its data. In other words, the energy consumed by MT$_k$ for relaying other members' data is a payment that has to be made to the other members, and the relaying of MT$_k$'s data by the other members is the net achieved benefit for MT$_k$.

2.4 Optimization of OCVM Formation by Altruistic MTs

In this subsection, we present an optimization formulation for the online collaborative virtual MIMO. When the MTs are altruistic, the online collaborative virtual MIMO protocol can be run in a centralized manner, where the global network information is available at a central entity such as the BS. The BS divides the MTs into groups and sets the duration of relaying and of being source for each MT in each group. Let $M = [m_k^{G_i}]_{K,|G(t)|}$, in which the binary variable $m_k^{G_i}$ is 1 if MT$_k$ is in the group $G_i$, and $m_k^{G_i} = 0$ otherwise. The optimization formulation for the centralized approach is given as:

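$$\max_{M,\,(\theta_k^t,\gamma_k^t):\forall k} \;\; \sum_{G_i} \sum_{k} m_k^{G_i} \sum_{t=a_k}^{d_k} R_{\mathrm{SIMO}}^{G_i}\,\theta_k^t T, \tag{13}$$

subject to

$$R_{\mathrm{MIMO}}^{k} \le R_{D,G_i}, \quad \forall k \in G_i,\ \forall G_i, \tag{14}$$

$$\sum_{k \in G_i} \theta_k^t = 1, \quad \forall t,\ \forall G_i, \tag{15}$$

$$0 \le \theta_k^t \le 1, \quad \forall t,\ \forall k, \tag{16}$$

$$E_{\mathrm{MIMO}}^{k,t} \le E_{\max}^{k}, \quad \forall k \in K, \tag{17}$$

$$m_k^{G_i} \in \{0, 1\}, \quad \forall G_i,\ \forall k, \tag{18}$$

$$\sum_{G_i} m_k^{G_i} = 1, \quad \forall k. \tag{19}$$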

The objective in (13) is to maximize the sum bit rate of the MTs. Constraint (14) guarantees the one-hop rate condition. Constraint (15) indicates that the source times of the MTs in a group must sum to $T$ during one time slot. Constraint (16) shows that $\theta_k$ can be zero, which means that an altruistic MT may relay others' data without being a source. Constraint (17) is the energy constraint of each MT. Constraints (18) and (19) guarantee that each MT$_k$ belongs to only one group $G_i$. In optimization problem (13), the variables are the binary variable $m_k^{G_i}$ and the durations of being source and relay for each MT in each group at each time slot, $(\theta_k^t, \gamma_k^t)$. Clearly, the optimization problem in (13) is a mixed-integer program, which can be solved by exhaustive search or branch-and-bound algorithms [21]. We compare the optimum solution of (13) with our proposed distributed OCVM protocol in the simulation section.
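As an illustration of the structural constraints only, the following sketch checks a candidate solution against (15), (16), (18) and (19) for a single time slot. The data layout (dictionaries keyed by MT index) and the function name are assumptions made for this example; the rate and energy constraints (14) and (17) are not modeled here.

```python
def is_structurally_feasible(membership, theta, tol=1e-9):
    """membership[k] is the list of flags m_k^{G_i} over all groups;
    theta[k] is the source share of MT k in the current slot."""
    num_groups = len(next(iter(membership.values())))
    # (18) and (19): binary flags, and each MT belongs to exactly one group.
    for flags in membership.values():
        if any(f not in (0, 1) for f in flags) or sum(flags) != 1:
            return False
    # (16): every source share lies in [0, 1].
    if any(not (0.0 <= share <= 1.0) for share in theta.values()):
        return False
    # (15): source shares inside each non-empty group sum to one.
    for g in range(num_groups):
        members = [k for k, flags in membership.items() if flags[g] == 1]
        if members and abs(sum(theta[k] for k in members) - 1.0) > tol:
            return False
    return True
```

An exhaustive search, as mentioned above, would enumerate membership matrices and keep only those that pass such a check before evaluating the objective.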

3 MECHANISM DESIGN AND GAME ANALYSIS

In each formed virtual MIMO, there is a competition between selfish MTs to be the source. Moreover, if they know that a member is about to leave the group due to mobility, they will no longer relay its data. In the following, we first design an online mechanism for the BS that, without any prior knowledge of the mobility patterns of the MTs, enforces the MTs to reveal their true types, i.e., arrival and departure time slots, in the coalitions. Then, we propose the self-punishment mechanism to enforce the all-C strategy profile as an NE under uncertainty about the duration of presence of other MTs. Finally, we apply a distributed coalition formation algorithm to find out how MTs form OCVM coalitions under the proposed self-punishment mechanism and the designed online mechanism.

3.1 The Direct-Revelation Online Mechanism

In our proposed OCVM protocol, at the first step the BS, as a trusted mediator, indicates the time slot duration $T$ and asks each arriving MT to reveal its private type through the secure channel. Then, according to the BS mechanism, in each time slot $t$, all of the MTs are selected one by one as a source, and the others cooperate with the source to increase its data rate. Although for MT$_k$ being a source is more beneficial than relaying the data of other sources at each time slot, the uncertainty about the duration of presence of other members in the next time slots can enforce MT$_k$ to relay their data. We focus on designing the family of direct-revelation online mechanisms, which restrict the message that each MT can send to the BS to a single, direct claim about its type.

3.1.1 Direct-Revelation Online Mechanism for OCVM

Our main framework for designing the online mechanisms is "model-free" and adopts a worst-case analysis [22]. "Model-free" means that the BS does not have any prior information about future agent types or about feasible decisions in future periods. For each coalition $G_i$, a mechanism should enforce a sequence of decisions $\chi_i = \{\chi_i^1, \chi_i^2, \ldots\}$, with decision $\chi_i^t$ made in period $t$. The decision $\chi_i^t$ should depend only on the MTs in the coalition, $\forall k \in G_i$, and their types, $\tau_k$, at time slot $t$.

Definition 3. A direct revelation online mechanism [22] for each coalition $G_i$ at time slot $t$, $\chi_i^t : [\tau_k]_{|G_i| \times 2} \to (\theta^t = [\theta_k^t]_{|G_i| \times 1}, \gamma^t = [\gamma_k^t]_{|G_i| \times 1})$, restricts each MT$_k \in G_i$ to make a single claim about its type $\tau_k = (a_k, d_k)$, and defines the net achieved benefit policy $\theta^t$ and the payment policy $\gamma^t$.

Fig. 4 depicts how the direct revelation online mechanism works according to the above definition. At the beginning of each time slot, the mechanism asks for the type of each newly arriving MT and updates the set of MTs in the coalition $G_i$ at time slot $t$. Then, according to the policies $(\theta^t, \gamma^t)$, $\chi_i^t$ indicates the net achieved benefit and the payment for the MTs in $G_i$.

3.1.2 MTs' misreports about their true type

Since all MTs are selfish and do not trust each other, an MT receives no feedback before reporting its type and cannot condition its strategy on the report of another MT. Moreover, it is impossible for an MT to report an earlier arrival than its true arrival time slot. Thus, the misreports available to an MT$_k$ with true type $\tau_k = (a_k, d_k)$ are listed below:

Late-arrival misreport ($\hat{\tau}_k^{a+}$): instead of its true type $\tau_k$, MT$_k$ misreports $\hat{\tau}_k$ in which $a_k < \hat{a}_k$.
Early-departure misreport ($\hat{\tau}_k^{d-}$): instead of its true type $\tau_k$, MT$_k$ misreports $\hat{\tau}_k$ in which $\hat{d}_k < d_k$.
Late-departure misreport ($\hat{\tau}_k^{d+}$): instead of its true type $\tau_k$, MT$_k$ misreports $\hat{\tau}_k$ in which $d_k < \hat{d}_k$.

The decisions during the presence of MT$_k$ in coalition $G_i$, $\chi_i^{[a_k,d_k]} = \{\chi_i^{a_k}, \ldots, \chi_i^{d_k}\}$, should deter it from these kinds of misreports about its true type.

Fig. 4. The direct revelation online mechanism for the OCVM protocol.

3.1.3 Constraints of the BS mechanism

The BS should guarantee that MT$_k$ gains more from cooperation than from defecting and forming a SIMO link during each time slot.

Definition 4. The mechanism $\chi_{BS}$ is said to be individually rational if it satisfies the participation or individual-rationality constraints [22], [23]:

$$u_{\mathrm{MIMO}}^{k,t}(\chi_{BS}(\tau_k, \tau_{-k})) \ge u_{\mathrm{SIMO}}^{k,t}(\tau_k), \quad \forall k \in K. \tag{20}$$

Individual-rationality constraints indicate the minimum amount of net achieved benefit $\theta_k^t$ and the maximum amount of payment $y_k^t$ for each MT$_k$ during each time slot $t$.

3.1.4 Net Achieved Benefit and Payment Policies

In our OCVM protocol, the goal of the net achieved benefit policy of $\chi_{BS}$ is to maximize the sum of all MTs' net achieved benefits (social welfare) under the fairness criteria. After getting the revealed types of $G_i$ at each time slot $t$, the net achieved benefit policy of the mechanism $\chi_{BS}$ indicates the duration of being source ($\theta_k^t$) for each MT$_k$ in $G_i$ as an optimal solution to an optimization problem of the following form:

$$\max_{\theta_i^t = [\theta_k^t]_{|G_i| \times 1}} \;\; \sum_{k \in G_i} v_k^t\, q_k^t(\theta_k^t) \tag{21}$$

subject to

$$0 < \theta_k^t, \quad \forall k \in G_i, \tag{22}$$

$$\sum_{k \in G_i} \theta_k^t = 1, \quad \forall t, \tag{23}$$

$$u_{\mathrm{MIMO}}^{k,t}(\chi_{BS}(\tau_k, \tau_{-k})) \ge u_{\mathrm{SIMO}}^{k,t}(\tau_k), \quad \forall k \in K, \tag{24}$$

where constraints (22) and (23) indicate that all MTs are selected as the source during each time slot. Constraint (24) ensures that the individual-rationality constraint holds for each MT. The linear weights in the objective of optimization (21) belong to a fairness vector $V_i = [v_k^t]_{|G_i| \times 1}$ indicated by the BS, which is discussed in Section 3.1.6. The mechanism $\chi_{BS}$ applies the payment policy to indicate, for each MT$_k$, the duration of relaying for other members. After indicating $\theta_k^t$ for each MT$_k$, our proposed mechanism $\chi_{BS}$ uses the payment policy as follows:

$$\gamma_k^t(\theta_k^t) = \begin{cases} 1 - \theta_k^t, & k \ne n^*, \\ \gamma_{n^*}^t, & k = n^*, \text{ according to TPT}. \end{cases} \tag{25}$$

Therefore, each MT should act as a relay except during the time in which it is selected as the source. However, in order to deter each MT from misreporting a late departure, we generally decrease the payment of the source in the last stage of FDRG.

Theorem 1. Truthful Policy Theorem (TPT): If at each time slot we reduce the payment of the last scheduled source, $n^*$, to a value below $\bar{\gamma}_{n^*}^t$, then no MT$_k$ misreports a late departure.

Proof. This theorem is proved in Appendix B.

If we decrease the payment of MT$_k$ only at $d_k$, other MTs find out that MT$_k$ is not present in the next time slots and do not relay its data. Therefore, regardless of the departure time slot of each MT, we apply the condition in Theorem 1 to every source selected at the last stage of FDRG in every time slot.

3.1.5 Dominant-Strategy Implementation

Let $\acute{\tau}_{-k}$ be the set of all other MTs' types and let $\hat{\tau}_k$ be any misreport of MT$_k$.

Definition 5. For each coalition $G_i$, an online mechanism $\chi$ is a dominant-strategy implementation [22], [23] given the true type $\tau_k$ of MT$_k$ if

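$$u_{\mathrm{MIMO}}^{k,t}(\chi_{BS}(\tau_k, \acute{\tau}_{-k})) \ge u_{\mathrm{MIMO}}^{k,t}(\chi_{BS}(\hat{\tau}_k, \acute{\tau}_{-k})), \quad \forall k \in K,\ \forall \hat{\tau}_k \ne \tau_k,\ \forall \acute{\tau}_{-k}. \tag{26}$$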

The concept of dominant-strategy implementation means that an MT maximizes its utility by reporting its true type, whatever the reports of the other MTs are.

Theorem 2. For each coalition $G_i$, our proposed online mechanism, $\chi_i = \{\chi^1, \chi^2, \ldots, \chi^t\}$, is a dominant-strategy implementation.

Proof. Without knowing the other MTs' types, MT$_k$ may lie about its type to increase its payoff. Three types of misreport can happen. If MT$_k$ misreports a late arrival or an early departure ($\hat{\tau}_k^{a+}$ or $\hat{\tau}_k^{d-}$), it punishes itself: the individual-rationality constraint stated in problem (21) ensures that if MT$_k$ participates in OCVM, its utility is higher than when forming a SIMO link during each time slot. Therefore, if it misreports $\hat{\tau}_k^{a+}$ or $\hat{\tau}_k^{d-}$, it loses the gain of participating in the virtual MIMO formation during $\hat{a}_k - a_k$ and $d_k - \hat{d}_k$ time slots. MT$_k$ does have an incentive to misreport $\hat{\tau}_k^{d+}$ and may gain more payoff, because the scheduler may then not select it in the last stage of FDRG in its actual final time slot of presence. However, by applying the TPT in Theorem 1, MTs do not misreport a late departure.
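The three misreports and the self-punishment argument in the proof can be made concrete with a small sketch. The function names are hypothetical; a type is represented as an (arrival, departure) pair of slot indices.

```python
def classify_report(true_type, reported_type):
    """Label a report against the true type tau_k = (a_k, d_k)."""
    a, d = true_type
    a_hat, d_hat = reported_type
    if a_hat < a:
        # The text rules this out: an MT cannot claim to arrive early.
        raise ValueError("arrival earlier than the true arrival is impossible")
    labels = []
    if a_hat > a:
        labels.append("late-arrival")
    if d_hat < d:
        labels.append("early-departure")
    if d_hat > d:
        labels.append("late-departure")
    return labels or ["truthful"]


def forfeited_slots(true_type, reported_type):
    """Slots of cooperation given up: (a_hat - a_k) + (d_k - d_hat)."""
    a, d = true_type
    a_hat, d_hat = reported_type
    return max(0, a_hat - a) + max(0, d - d_hat)
```

Since individual rationality makes every slot of participation at least as good as SIMO, each forfeited slot can only reduce the payoff, which is why only the late-departure misreport needs the separate TPT argument.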


TABLE 2
The fairness coefficient of MT$_m$ and its relation with MT$_n$ for all fairness criteria at time slot $t$ in coalition $G_i(t)$.

Fairness type  | $v_m(t)$                                                      | $v_m(t)/v_n(t)$, $\forall m, n \in G_i(t)$
Egalitarian    | $\frac{1}{|G_i(t)|}$                                          | $1$
Proportional   | $\frac{R_{L,m}}{\sum_{j \in G_i(t)} R_{L,j}}$                 | $\frac{R_{L,m}}{R_{L,n}}$
Shapley value  | $\frac{\phi_m^{G_i(t)}}{\sum_{j \in G_i(t)} \phi_j^{G_i(t)}}$ | $\frac{\phi_m^{G_i}}{\phi_n^{G_i}}$

3.1.6 Fairness Coefficient

The vector $V_i = [v_k^t]_{|G_i| \times 1}$ is a fairness vector, in which $\sum_{k \in G_i} v_k^t = 1$. Following optimization problem (21), each element $v_k^t$ affects the duration for which MT$_k$ is the source in the coalition $G_i$ at each time slot. We consider the different fairness criteria listed in Table 2, as follows.

Egalitarian Fairness (EF): the simplest method is to assign $v_k$ equally among users.

Proportional Fairness (PF): in practice, an MT experiencing a good channel might not be willing to cooperate with an MT under bad channel gains, because of its preference about bit rate. To account for the channel differences, we use a criterion named proportional fairness (PF), in which the fairness vector is weighted according to the users' bit rates on the cellular link.

Shapley Value Fairness (SV): the Shapley value determines the power of each player in the formed coalition, which is proportional to its marginal contribution [24], [25]. Since the formed coalition $G_i(t)$ has an NTU function, to compute the Shapley value of the NTU coalition game, $\phi(U^{G_i}, \lambda) = [\phi_m^{G_i}]_{|G_i| \times 1}$ where $\lambda = [\lambda_m]_{|G_i| \times 1}$, in the first step we compute the Shapley value of the $\lambda$-transfer equivalent transferable utility (TU) game, $\phi(u_{G_i}^{\lambda}) = \sum_m \lambda_m \phi_m^{G_i}$. It is defined as the highest sum of $\lambda$-weighted payoffs that coalition $G_i$ could achieve by any allocation that is feasible for it [24]. The Shapley value of each $m$ is defined in the equivalent TU coalition as follows:

$$\phi_m(u_{G_i}^{\lambda}) = \sum_{S \subset G_i \setminus \{m\}} \frac{|S|!\,(|G_i| - |S| - 1)!}{|G_i|!}\,\big[u_{S \cup \{m\}}^{\lambda} - u_S^{\lambda}\big]. \tag{27}$$

Under the assumption of randomly ordered joining, the Shapley function of each user is its expected marginal contribution when it joins the coalition [23]. In the second step, we must divide each component of $\phi(u_{G_i}^{\lambda})$ by the corresponding component $\lambda_m$ to compute the allocation of payoffs in the original NTU scales that corresponds to this allocation of $\lambda$-weighted utility [24]. Thus, every component of the NTU Shapley value vector, $\phi(U^{G_i}, \lambda)$, is [24]:

$$\phi_m^{G_i} = \frac{\phi_m(u_{G_i}^{\lambda})}{\lambda_m}. \tag{28}$$

In view of fairness, the more powerful an MT is in improving the marginal gain of the formed coalition, i.e., the greater its Shapley value, the longer it should be selected as the source. Thus, the fairness vector that weights the source durations in optimization problem (21) is divided proportionally to the users' Shapley values.
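A minimal sketch of the TU step (27), the rescaling (28), and the Shapley row of Table 2 is given below. It assumes the $\lambda$-weighted characteristic function is supplied as a callable on frozensets; that interface and all function names are illustrative choices, not part of the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Eq. (27): weighted marginal contributions over all S in G_i minus {m}."""
    n = len(players)
    phi = {}
    for m in players:
        others = [p for p in players if p != m]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {m}) - value(s))
        phi[m] = total
    return phi


def ntu_rescale(phi, lam):
    """Eq. (28): divide each TU Shapley component by its lambda_m."""
    return {m: phi[m] / lam[m] for m in phi}


def shapley_fairness(phi):
    """Table 2, Shapley row: v_m = phi_m / sum_j phi_j."""
    total = sum(phi.values())
    return {m: v / total for m, v in phi.items()}
```

The enumeration visits every subset of the other members, so its cost grows exponentially with the coalition size; this is tolerable only because OCVM coalitions are small.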

Fig. 5. Each pair of MTs plays a finitely dynamic relaying game under uncertainty at each stage of finitely stochastically repeated game.

3.2 Distributed Self-punishment Mechanism

Each MT$_k$ knows its own duration of presence in the OCVM coalition $G_i$. Therefore, during its presence, it participates in a finite game in which the other players' durations of presence change stochastically. For each MT, we call this game the finitely stochastically repeated game (FSRG). For example, MT$_3$ in Fig. 2 participates in a 5-stage FSRG in which the other players (i.e., MT$_1$ and MT$_2$) stochastically arrive or leave. During each stage of FSRG, each MT in a coalition is selected as the source once, and the others must relay for the selected source. Thus, when MT$_k$ is selected as the source, there are $|G_i| - 1$ decision points. In each of these decision points, one of the MTs should decide whether to cooperate with MT$_k$ or not. Considering the interaction between a pair of MTs, each of them should sequentially decide to cooperate or to defect from relaying the source data. We call this noncooperative game between each pair of MTs at each stage of FSRG the finitely dynamic relaying game (FDRG). Generally, there are $\frac{|G_i|(|G_i| - 1)}{2}$ FDRG games during each time slot. A noncooperative FDRG is uniquely defined by a tuple $(\{k, j\}, \{C, D\} \times \{C, D\}, u)$, where the set $\{k, j\}$ is a pair of MTs, $\{C, D\}$ is the set of strategies available to each MT, and $u = (u_{\mathrm{MIMO}}^{k,t}, u_{\mathrm{MIMO}}^{j,t})$, in which $u_{\mathrm{MIMO}}^{k,t}$ is the utility function of MT$_k$ at time slot $t$. Fig. 5 shows one of the FDRG games between MT$_k$ and MT$_j$, where MT$_k$ is scheduled first as the source. We assume that an MT does not know the actions of other MTs; that is, a source cannot observe whether the data sent to the other MTs will be relayed to the BS or not. Thus, there is an imperfect-information finite dynamic game between two MTs in each stage of FSRG. At each time slot, MT$_k$ does not relay the data of the selected source (MT$_j$) in a stage of FDRG if it has enough suspicion that MT$_j$ will not be present in the next stage of FSRG. Definition 6.
MT$_j$ uses a grim-trigger punishment mechanism $P^{GT}$ to defect from relaying the data of MT$_k$ after any stage of FSRG if either MT$_j$ or MT$_k$ does not relay the other's data in the current FDRG.

Let $p_{kj}^t$ be the probability that MT$_k$ assigns to the presence of MT$_j$ at time slot $t$, and let $G_i \setminus \{j\}$ denote the set of present MTs except MT$_j$ at time slot $t$. Assume that during time slot $t_D$, i.e., $t_D \in \{a_k, a_k + 1, \ldots, d_k\}$, MT$_j$ cooperates with MT$_k$, and then MT$_k$ defects from relaying the data of MT$_j$. According to the punishment mechanism ($P^{GT}$), MT$_j$ will defect when MT$_k$ is selected as the source in the following stages of FSRG. Consequently, the payoff of MT$_k$ for the next FSRG stages is calculated as follows:

$$u_{\mathrm{MIMO}}^{k,[t_D,d_k]}(D, P^{GT}) = \big(q_k^t - y_k^t(\gamma_k - \theta_j)\big) + \sum_{t=t_D+1}^{d_k} \Big\{ \big(q_k^t(G_i \setminus \{j\}) - y_k^t(\gamma_k - \theta_j)\big)\, p_{kj}^t + u_{\mathrm{MIMO}}^{k,t}\,(1 - p_{kj}^t) \Big\}. \tag{29}$$

Lemma 1. Cooperation is the best response (BR) of MT$_k$ in stage $\bar{t}$ of the FSRG in the game with MT$_j$ if and only if MT$_k$'s beliefs in the presence of MT$_j$ at the next time slots are higher than the minimum probability of presence (MPP). This MPP, $p_{kj}^{\bar{t},*}$, is calculated as follows:

$$p_{kj}^{\bar{t},*} = \frac{y_k^{\bar{t}}(\theta_j)}{\sum_{t=\bar{t}+1}^{d_k} \Big\{ \beta_k \big(R_{\mathrm{MIMO}}^{G_i} - R_{\mathrm{MIMO}}^{G_i \setminus \{j\}}\big)\theta_k^t + y_k^t(\theta_j) \Big\}} \tag{30}$$

for $t = \bar{t} + 1, \ldots, d_k$.

Proof. This lemma is proved in Appendix A.

According to this lemma, we can present the following theorem.

Theorem 3. For every stage of FSRG and every pair of MTs at that time slot of the OCVM coalition, if their beliefs about the duration of presence of each other are higher than their minimum probability of presence, then the all-C profile is the NE profile under uncertainty for all pairs of MTs in all stages of FSRG.

Proof. In every stage of FSRG, there is an FDRG between each pair of MTs (MT$_k$ and MT$_j$). If the mutual beliefs of MT$_k$ and MT$_j$ about the presence of the other one are higher than their MPP at every time slot $t \in [t_{a_k}, t_{d_k}]$, then, according to Lemma 1, their BRs form the cooperation profile. Since this condition holds for every pair, all of them relay each other's data in the FDRGs among themselves. Consequently, if this holds in every time slot $t$, the all-C profile is an NE under uncertainty at every stage of FSRG.
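The threshold in (30) and the best-response test of Lemma 1 can be sketched as follows. The tuple layout of the future slots and the function names are assumptions made for illustration; returning infinity for a non-positive denominator is also an assumption of this sketch, meaning that no belief would justify cooperation in that case.

```python
def min_presence_probability(y_now, future_slots, beta_k):
    """Eq. (30). future_slots holds one tuple per slot t = t_bar+1, ..., d_k:
    (rate with MT j, rate without MT j, theta_k^t, y_k^t(theta_j))."""
    denom = sum(
        beta_k * (rate_with - rate_without) * theta + y_future
        for rate_with, rate_without, theta, y_future in future_slots
    )
    if denom <= 0:
        return float("inf")  # assumption: cooperation can never pay off
    return y_now / denom


def cooperation_is_best_response(belief, mpp):
    """Lemma 1: cooperate iff the belief exceeds the MPP."""
    return belief > mpp
```

With a relaying cost of 1 now and two future slots that each contribute 2 to the denominator, the threshold is 0.25, so a belief of 0.3 supports cooperation while a belief of 0.2 does not.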

4 THE PROPOSED OCVM PROTOCOL

As depicted in Fig. 6, our proposed OCVM protocol has three phases: distributed neighbour discovery and coalition formation; the direct-revelation online mechanism and scheduling algorithm; and autonomous virtual MIMO formation with the self-punishment mechanism.

In the first phase of OCVM, the MTs perform a distributed neighbour discovery protocol such as the one in [26]. After finding their neighbours, each MT can estimate the channel states to its neighbours using methods such as those proposed in [27], [28]. Then, the MTs autonomously form coalitions. A grand coalition, in which all MTs form a single OCVM coalition, may not always form because of the one-hop rate condition ($R_{D,G_i} \ge R_{\mathrm{SIMO}}^{k}$) or the power consumption cost for the source to multicast data on D2D links to farther members. Instead, the disjoint coalitions $G(t) = \{G_1, G_2, \ldots, G_{|G(t)|}\}$ can be formed by the MTs at each time slot. Thus, the problem is categorized as a coalition formation game [29], [30]. Since there is a noncooperative game between selfish MTs in each formed coalition, we apply the non-transferable utility (NTU) function form to model the OCVM group formation [25], [30]. The interference from other coalitions' members on a specific coalition depends on the scheduling decisions and transmission powers of those members. Thus, the cooperation model in each time slot can generally be mapped to a coalitional game in partition form, which can be significantly challenging to solve. Since the interference from other collaborative virtual MIMO groups is restricted to a maximum value under the BS power allocation, we can compute the value of each coalition independently of the other coalitions' decisions. Considering the maximum value of interference from other coalitions, which is the worst-case scenario, we model the OCVM protocol as a coalition formation game in characteristic function form. In each time slot, a coalitional game for the OCVM protocol in NTU characteristic function form is uniquely defined by the pair $(K, U(G_i))$, where the set $K$ of players is the set of MTs, and the value of an NTU coalition $G_i$, $U(G_i)$, is a set of payoff vectors, $U(G_i) \subseteq \mathbb{R}^{|G_i|}$, where each element $u_k(G_i)$ of any vector $u \in U(G_i)$ represents the payoff of MT$_k$ in coalition $G_i$. In our proposed OCVM protocol, if the MTs form coalition $G_i$, then $u_k(G_i) = u_{\mathrm{MIMO}}^{k}$ following (12), and if MT$_k$ does not participate in any coalition, then $u_k(\{k\}) = u_{\mathrm{SIMO}}^{k}$ following (9). As in [25], we apply the Merge-and-Split algorithm to find out how the MTs autonomously form coalitions.

In the second phase, the BS uses the direct-revelation online mechanism. The BS applies the payment policy (25) and the net achieved benefit policy (21) to the members of each $G_i$ at time slot $t$. Moreover, the BS changes the order in which the MTs act as source at the beginning of each time slot.
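The source ordering used by the BS, as stated in Algorithm 1 (a uniformly random permutation in which an MT that leaves at the end of the current slot is placed last), might be sketched as below. The function name and the handling of several simultaneously departing MTs (all placed at the end, in random order) are assumptions of this example.

```python
import random


def schedule_sources(members, leaving, rng=None):
    """Return the source order for one time slot of one coalition."""
    rng = rng or random.Random()
    staying = [m for m in members if m not in leaving]
    departing = [m for m in members if m in leaving]
    rng.shuffle(staying)    # uniform permutation hides the MTs' types
    rng.shuffle(departing)  # assumption: ties among leavers broken randomly
    return staying + departing
```

Because the order is redrawn in every slot, being scheduled last is only weak evidence that an MT is about to leave.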
The BS scheduling algorithm should try to increase the uncertainty about the types among MTs and decrease the possibility of MTs deviating from cooperation. Thus, the BS should consider the following principles in its scheduling algorithm.

Principle 1: schedule MT$_k$ in the last stage of FDRG at the last time slot in which it is present in the OCVM protocol. If it is selected as a source in any stage of FDRG other than the last one, then, since it knows that it will not be present in the next time slot, it will not relay other members' data any more.

Principle 2: permute the MTs uniformly to act as a source at each new time slot (FSRG stage). Principle 1 reveals some information about the types of MTs: if one of them is selected as the source in the last stage of FDRG, the others will suspect that it may not be present at the next time slot. Consequently, the BS should change the order of sources at each new time slot to provide some anonymity about the MTs' types.

Consequently, for a stable cooperation NE under uncertainty among selfish MTs, we construct the scheduling algorithm shown in Algorithm 1.

In the third phase, all members of each coalition cooperatively form the virtual MIMO and transmit the selected source's data to the BS. Since the BS knows the bit rate of the virtual MIMO uplink, it can find out the action of each MT in the given time. Therefore, in our proposed model, the BS acts as a monitoring entity and reveals the actions of the MTs at the end of each time slot. Thus, the members of each coalition can autonomously punish malicious MTs.

It is important to highlight that the coalition formation and the OCVM protocol are coupled. This is because MTs cannot form coalitions without knowing the net achieved benefit and payment policies, and at the same time the policies of the direct revelation mechanism in each coalition depend on the coalition's members. Therefore, in order to capture the effect of coalition formation and of the policies in the OCVM protocol on each other, we repeat them sequentially over time. In conformity with the discussion in this section, a formal block diagram of the proposed OCVM protocol and the sequence of its procedures are depicted in Fig. 6.

Algorithm 1 Scheduling members under uncertainty in the formed virtual MIMO (Sc-VMIMO)
Initial state:
1. Autonomously formed coalitions: $G(t) = \{G_1, G_2, \ldots, G_{|G(t)|}\}$.
Scheduling phase (scheduling algorithm guaranteeing cooperation in each formed coalition):
1. Uniform permutation: for each formed coalition $G_i(t)$, schedule the members according to a random permutation. If one of the MTs leaves the coalition at the end of the current time slot, schedule it for the last stage of FDRG.
2. Set the time of being source and relay according to the $\theta$ and $\gamma$ vectors calculated in (21) and (25), respectively, for each coalition.

[Figure 6: flowchart of the OCVM protocol. Phase I: autonomous neighbor discovery and distributed coalition formation by MTs; because of mobility, punishment or the online mechanism, the MTs autonomously form disjoint coalitions $G(t) = \{G_1, \ldots, G_{|G(t)|}\}$ following the Merge-and-Split algorithm [30]. Phase II: the direct-revelation online mechanism by the BS, $\chi_i^t = \{\theta_i^t, \gamma_i^t\}$, in which the BS asks for the types (arrival and departure) of newly incoming MTs and indicates the payment (duration of relaying) and net achieved benefit (duration of being source) for each MT in each coalition, followed by the scheduling algorithm (Sc-VMIMO), in which the BS indicates the source order in each formed coalition. Phase III: V-MIMO formation by MTs, in which the MTs of each coalition transmit the data of the source on the formed V-MIMO uplink; if not all MTs relay in a coalition, the self-punishment $P^{GT}$ applies and the malicious MT is punished by the members of its coalition during the next time slots. The procedure then repeats with $t = t + 1$.]

Fig. 6. The proposed OCVM protocol.

5 SIGNALING AND COMPLEXITY OF THE PROPOSED OCVM

In our proposed OCVM protocol, we consider a fairly generic model for achieving synchronization of distributed virtual MIMO and D2D communication, such as in [31], where the source of each coalition broadcasts the D2D synchronization signal (D2DSS) as a beacon to the other members of the coalition; the other MTs then synchronize their oscillators in frequency and phase to communicate with each other on the D2D link. Since each member of a coalition is selected as a source, the number of D2DSS signals per coalition is equal to the number of members in the coalition, $|G_i(t)|$.

The complexity of the proposed OCVM protocol pertains mainly to the complexity of the optimization problem of the direct-revelation online mechanism. Following algorithms based on interior point methods for solving LP problems [32], the complexity of the BS mechanism is $O(|G_i|)$, where $|G_i|$ is the maximum number of optimization variables, i.e., $\theta_k$. The neighbor discovery and the merge-and-split algorithm do not affect the complexity of the proposed mechanism, because they are performed autonomously by the MTs and there is no need for the BS to calculate the group formation of the MTs. At each time slot $t$, the complexity of the merge operations is $O(|G_i|^2)$, and for any group $G_i$, the number of split operations is given by the Bell number, which grows exponentially with the size of $G_i$ [25]. In practice, the complexity of the split operations diminishes. First, the split operation is restricted to finding the possible partitions of each formed coalition rather than of a grand coalition, and the size of each coalition in the proposed OCVM protocol is relatively small due to the one-hop rate condition of the D2D links. Moreover, an OCVM coalition does not need to go through all the possible partitions: as soon as a coalition finds a split form that satisfies the Pareto order, the MTs in this coalition split.
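To give a feel for how quickly the number of candidate partitions in a split operation grows with coalition size, the Bell numbers can be computed with the standard Bell triangle. This is a generic combinatorial sketch, not code from the paper, and the function name is illustrative.

```python
def bell_number(n):
    """Number of ways to partition a set of n members (Bell triangle)."""
    row = [1]
    for _ in range(n):
        next_row = [row[-1]]  # each row starts with the previous row's last entry
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return row[0]
```

A coalition of 5 MTs has 52 possible partitions, whereas a coalition of 10 already has 115975, which is why small coalitions keep the split step tractable.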

6 SIMULATION RESULTS AND ANALYSIS

For the simulation, consider a 400 m × 400 m square area with the BS located at its center. In all the simulations, we consider a cluster process for the possible OCVM coalitions, where the locations of the cluster centers are a realization of a Poisson point process and the MTs are scattered around them according to a normal distribution [33]. To capture the arrival and departure of MTs in the OCVM coalitions, we use two models, A and B. In model A, the number of MTs arriving at the OCVM coalition at the beginning of each time slot follows a Poisson distribution with rate $\lambda$, and the number of time slots for which each MT is present in the OCVM coalition follows a geometric distribution with parameter $\mu = 0.3$. In model B, the MTs select their arrival time slots in the OCVM coalition with a uniform distribution, and the number of time slots in which an MT is present in the OCVM coalition also follows a uniform distribution. The BS has four receiving antennas and each MT is equipped with one antenna. We consider 5 MHz for the cellular bandwidth, divided into 25 resource blocks (RBs), each including 12 subcarriers [34]. The BS assigns one RB to each OCVM coalition. We assume that the BS allocates resources in a manner that restricts the maximum interference to 0.01% of the received power. The channel parameters are $\kappa = -128.1$ dB, $\nu = 3.71$, and $\sigma_\xi = 8$ dB, which are obtained from [34], [35]. Following the power allocation on each subcarrier and the bit rate on the cellular or D2D links, $P_{TxBB}$, $P_{RxBB}$, $P_{TxRF}$ and $P_{RxRF}$ are given by Table 4 in [19]. The active power of the cellular subsystem, $P_{on}$, is 25.1 mW [19]. The thermal noise is set to $10^{-12}$ mW. Also, in all analyses, we repeat the simulation to obtain a confidence level of approximately 95% with a confidence interval of half of one unit.
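Model A can be sketched with the standard library alone. The sketch assumes that the geometric parameter is the per-slot probability of departure and that presence lasts at least one slot; both readings, as well as the function names, are assumptions of this example rather than details given in the text. Poisson sampling uses Knuth's multiplication method.

```python
import math
import random


def sample_poisson(rate, rng):
    """Knuth's method: count uniform draws until their product drops below e^-rate."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_geometric(p, rng):
    """Number of slots present (at least 1), leaving with probability p per slot."""
    slots = 1
    while rng.random() >= p:
        slots += 1
    return slots


def generate_types_model_a(num_slots, rate, p, rng):
    """Return a list of types (a_k, d_k) for all MTs arriving in num_slots slots."""
    types = []
    for t in range(num_slots):
        for _ in range(sample_poisson(rate, rng)):
            types.append((t, t + sample_geometric(p, rng) - 1))
    return types
```

Passing a seeded `random.Random` instance keeps repeated simulation runs reproducible.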

1536-1233 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2017.2707540, IEEE Transactions on Mobile Computing 11 10

Energy efficiency [Mb/Joule]

Energy efficiency [Mb/Joule]

9

8

7

6

Moral Malicious OCVM Opt SIMO

5

4 4

6

8

10

12

14

16

18

9 8 7 6 5 4 1

20

Percentage of malicious MTs

Fig. 7. Energy efficiency per MTs vs. the percentage of malicious MTs.

6.2

Effects of Number of MTs

To capture the effect of the arrival and departure MTs, the battery and rate preferences are considered α = 0.5 and β = 0.5, respectively. The average duration of presence is a geometric random variable with mean 2 for Model A, and in model B the MTs select the arrival time slots from 0 to 100 with a uniform distribution. Fig. 8 and 9 show the impact of the arrival rate of MTs on the energy efficiency of MTs. Fig. 8 shows the result for SIMO and OCVM protocols without MU-MIMO transmission, while Fig. 9 shows the result for SIMO and OCVM protocol with MU-MIMO transmission. From Fig. 8 and 9, we can see that the energy efficiency per user increases for all fairness criteria in the OCVM protocol when the arrival rate of MTs increases in model A or the duration of presence of MTs increases in model B. The SV fairness has the most increase in the energy efficiency among all other fairness criteria. The trends of the plots are similar for

2

2.5

3

3.5

4

Fig. 8. Energy efficiency per MTs vs. average incoming rate in model A or maximum duration of presence in model B.

Effects of Malicious MTs

Here, we capture the robustness of our proposed OCVM protocol to the malicious MTs. The malicious MTs do not relay other sources during stages of FSRG. In addition, the malicious MT leaves coalition in its last time slot of presence if it is not the source in the last stage of FDRG. During the simulation, the malicious MTs are randomly selected. Other MTs are moral, i.e., they always cooperate. All parameters are averaged on the different fairness criteria for OCVM protocol. We consider the average incoming rate is 3 MTs per time slot and the average duration of presence is 4 time slots according to Model A. The battery and rate preferences are considered α = 0.5 and β = 0.5, respectively. As we can see in Fig. 7, the energy efficiency of malicious MTs becomes much higher than the proposed OCVM protocol. The reason is that the moral MTs relay data of malicious ones and malicious MTs do not consume any energy for relaying moral sources. Consequently the energy efficiency of moral MTs becomes lower than the proposed OCVM protocol because malicious MTs do not relay their data. By increasing the percentage of malicious MTs, the energy efficiency of both malicious and moral MTs decreases. Since we use our proposed Grim-trigger punishment and truthful policy under our proposed OCVM protocol, the malicious MTs will cooperate in relaying and will truthfully declare their duration of presence in the OCVM group. Thus, they can not increase their energy efficiency under proposed OCVM protocol.

Fig. 9. Energy efficiency per MT vs. average incoming rate in model A or maximum duration of presence in model B with MU-MIMO MAC.

both models A and B. In Fig. 8, when the incoming rate is 4 MTs in model A or the maximum duration of presence is 4 time slots in model B, the EF, PF, and SV fairness criteria in the OCVM protocol boost the energy efficiency by up to 26%, 50%, and 72%, respectively, compared to SIMO. On average, the proposed OCVM protocol can improve the energy efficiency by up to 35% compared to a SIMO uplink. In Fig. 9, when the incoming rate is 4 MTs in model A or the maximum duration of presence is 4 time slots in model B, the EF, PF, and SV fairness criteria in the OCVM protocol with MU-MIMO boost the energy efficiency by up to 30%, 58%, and 80%, respectively, compared to SIMO with MU-MIMO. On average, the proposed OCVM protocol with MU-MIMO transmission can improve the energy efficiency by up to 42% compared to a SIMO uplink with MU-MIMO transmission. The reason is that the number of transmit antennas in the virtual MIMO per coalition increases with the number of MTs. Consequently, the rate on the virtual MIMO link per coalition and the energy efficiency go up. After forming cooperative groups and scheduling sources under the proposed OCVM protocol, if multi-user MIMO transmission is used in the MAC layer, the resource blocks can be spatially shared among cooperative groups. Consequently, the OCVM protocol with MU-MIMO transmission can increase the energy efficiency over SIMO, the OCVM protocol, and SIMO with MU-MIMO transmission. Fig. 10 shows the impact of the arrival rate or duration of presence of MTs on the number of D2DSS per coalition. From Fig. 10, we can see that the number of D2DSS per coalition increases for all fairness criteria in the OCVM protocol and in the optimum solution. The trend of the number of D2DSS per coalition is almost the same for all fairness criteria under both models A and B. As the number of MTs increases, the coalition size increases. Thus, the number of D2DSS per coalition to


Fig. 10. Number of D2DSS per coalition vs. average incoming rate in model A or maximum duration of presence in model B.

Fig. 11. Price of anarchy vs. average incoming rate in model A or maximum duration of presence in model B.

Fig. 12. Minimum probability of presence vs. average incoming rate in model A or maximum duration of presence in model B.

synchronize the source with other members also increases. In Fig. 10, on average, the proposed OCVM protocol requires around 3 extra signals per coalition in comparison to the SIMO uplink. Given the energy efficiency improvement, which is on average around 35% under the proposed OCVM protocol and around 42% under the proposed OCVM protocol with MU-MIMO MAC (see Figs. 8 and 9), the signaling overhead of the OCVM protocol for communicating the required information (see Fig. 10) is acceptable. Fig. 11 shows the impact of the arrival rate or duration of presence of MTs on the price of anarchy in the energy efficiency. The price of anarchy is the ratio of the energy efficiency under the OCVM protocol to that of the optimum solution for collaborative virtual MIMO formation. From Fig. 11, we can see that the price of anarchy decreases for all fairness criteria in the proposed OCVM protocol when the arrival rate or duration of presence of MTs increases. When the number of MTs increases, the selfish MTs form coalitions that are smaller than the ones that altruistic MTs form in the optimum solution. When the incoming rate is 4 MTs in model A or the maximum duration of presence is 4 time slots in model B, the highest price of anarchy among the fairness criteria is 0.78 for model A and 0.74 for model B. Thus, the proposed SV fairness under the OCVM protocol can motivate the selfish MTs to be more altruistic in the competitive scenario. Fig. 12 shows the impact of the arrival rate or duration of presence of MTs on the minimum probability of presence (MPP). From Fig. 12, we can see that the MPP increases with the arrival rate or duration of presence of MTs. The reason is that more MTs lead to a larger number of MTs per coalition. Consequently, the difference between the bit rate of each coalition with and without one of its members, i.e., the denominator of the expression in (30), decreases with the

Fig. 13. Maximum payment vs. average incoming rate in model A or maximum duration of presence in model B.

number of MTs per coalition. Thus, according to Lemma 1, the minimum probability of presence increases. In addition, Fig. 12 shows that the MPP has almost the same value for all fairness criteria under both models A and B. Fig. 13 shows the impact of the arrival rate or duration of presence of MTs on the maximum payment. From Fig. 13, we can see that the maximum payment increases with the arrival rate or duration of presence of MTs. The reason is that more MTs lead to higher energy efficiency. Consequently, the probability that an MT reports a departure time slot later than its actual value decreases. The chance that it is selected in the first stages of FDRG in its last time slot of presence then also increases. Thus, according to the truthful policy in Theorem 2, we should increase the maximum payment. As we can see, when the truthful policy theorem (TPT) is not applied, the maximum payment of the last scheduled source is larger. In this case, however, the truthfulness of MTs is not guaranteed. Moreover, Fig. 13 shows that the maximum payment has almost the same value for all fairness criteria.

6.3 Effects of Battery and Rate Preference of MTs

To capture the effects of MTs' preferences on the OCVM protocol, in Figs. 14 and 15 we consider two scenarios, i.e., Scenario I and Scenario III, and compare their results with Scenario II, in which α = 0.5 and β = 0.5. In Scenario I, MTs decrease the battery consumption preference and increase the throughput preference, i.e., α = 0.8 and β = 0.2. In Scenario III, MTs increase the preference of battery consumption and decrease the preference of throughput, i.e., α = 0.2 and β = 0.8. Following Fig. 14, the number of D2DSS per coalition decreases in both Scenarios I and III with respect to Scenario II. When MTs care more about throughput in Scenario I or power consumption in Scenario III, they prefer to be the source for more time and to relay other members' data for less time. Therefore, they are partitioned into smaller coalitions in both Scenarios I and III. This decrease in coalition size means fewer D2DSS signals per coalition. Following Fig. 15, our proposed OCVM protocol always improves the energy efficiency of selfish MTs across the different scenarios. Moreover, it keeps the energy efficiency almost robust to changes of preferences. In more detail, considering Fig. 14, when MTs care more about throughput in Scenario I or power consumption in Scenario III, they form smaller coalitions but keep the throughput and battery consumption similar to Scenario II.

Fig. 14. Number of D2DSS per coalition vs. the battery and rate preferences when the arrival rate is 3 in model A.

Fig. 15. Energy efficiency vs. the battery and rate preferences of MTs, when the arrival rate is 3 in model A.

7 CONCLUSION

In this paper, we have proposed the OCVM protocol as a novel autonomous virtual MIMO uplink formation scheme in the D2D tier of 5G. Throughout this study, we have focused on the competitive scenario in which the MTs are selfish. Without a priori knowledge of the mobility pattern of MTs, we have used the uncertainty about the presence of MTs in the coalition to motivate the selfish MTs to form stable coalitions and cooperate in relaying others' data in each coalition. We have used a coalitional game to determine the cooperative virtual MIMO groups that are formed. Then, we have applied a noncooperative game to design a self-punishment mechanism in each coalition. We have also designed an online and truthful mechanism that forces the MTs to reveal their duration of presence in the coalitions truthfully to the BS. Our proposed OCVM protocol has three phases: distributed neighbour discovery and coalition formation, direct-revelation online mechanism and scheduling algorithm, and autonomous virtual MIMO formation and self-punishment mechanism.

Our simulation results confirm that the proposed OCVM protocol is robust to the rate and battery preferences of MTs. Moreover, the simulation results show that if the BS applies the OCVM protocol in which the MTs autonomously punish malicious MTs, the malicious MTs cannot gain from the cooperation of other moral MTs. In addition, the simulation results show that on average the proposed OCVM protocol improves the energy efficiency by up to 35% compared to a SIMO uplink. Moreover, if multi-user MIMO transmission is used for the uplink medium access layer, the proposed OCVM protocol improves the energy efficiency by up to 42% compared to a SIMO uplink with MU-MIMO MAC. The proposed OCVM protocol with Shapley value fairness can motivate the selfish MTs to be more altruistic until the price of anarchy reaches 0.78. Among all fairness criteria, SV can best motivate selfish MTs to be more altruistic in the competitive scenario.


Mehdi Naderi Soorki received the B.Sc. degree in Electrical Engineering from Iran University of Science and Technology in 2007. He received his M.Sc. degree in Telecommunication Networks from the Isfahan University of Technology in 2010. He is a PhD candidate within the Game Theory & Mechanism Design Research laboratory at the Department of Electrical and Computer Engineering at Isfahan University of Technology, under the supervision of Dr. Mohammad Hossein Manshaei and Dr. Hossein Saidi. His research interests include wireless networking, game theory, and stochastic optimization.

Mohammad Hossein Manshaei received the BSc degree in electrical engineering and the MSc degree in communication engineering from the Isfahan University of Technology in 1997 and 2000, respectively. He received another MSc degree in computer science and the PhD degree in computer science and distributed systems from the University of Nice Sophia-Antipolis, France, in 2002 and 2005, respectively. He did his thesis work at INRIA, Sophia-Antipolis, France. He is currently an assistant professor at the Isfahan University of Technology, Iran. From 2006 to 2011, he was a senior researcher and lecturer at EPFL, Switzerland. He held visiting positions at the UNCC, the NYU, and the VTech. His research interests include wireless networking, wireless security and privacy, computational biology, and game theory.

Behrouz Maham (S'07, M'10, SM'15) received the B.Sc. and M.Sc. degrees in electrical engineering from the University of Tehran, Iran, in 2005 and 2007, respectively, and the Ph.D. degree from the University of Oslo in 2010. From September 2008 to August 2009, he was with the Department of Electrical Engineering, Stanford University, USA. He is currently an Assistant Professor with the School of Engineering, Nazarbayev University (NU). He was a faculty member with the School of Electrical and Computer Engineering, University of Tehran, from Sep. 2011 to Aug. 2015. Dr. Maham is a senior member of IEEE and has around 120 publications in major technical journals and conferences. His fields of interest include wireless communication and networking and molecular nano-communications.

Hossein Saidi received the B.S. and M.S. degrees in Electrical Eng. in 1986 and 1989, respectively, both from Isfahan University of Technology (IUT), Isfahan, Iran. He also received the D.Sc. degree in Electrical Eng. from Washington University in St. Louis, USA, in 1994. Since 1995 he has been with the Dept. of Electrical and Computer Engineering at IUT, where he is currently a Full Professor and serves as the Chair of the Electrical and Computer Engineering Department. His research interests include high speed switches and routers, communication networks, QoS in networks, security, queueing systems, and information theory. He was the co-founder and R&D director at MinMax Technology Inc. (1996-1998) and Erlang Technology Inc. (1999-2006), both in the USA, and SarvNet Telecommunication Inc. (2003). He holds 4 USA patents and one international patent and has published more than 100 scientific papers. He is the recipient of several awards, including the 2006 ASPA award and the certificate award at the 2011 National Festival of Information and Communication Technology, both as the CEO of SarvNet Telecommunication Inc., and he also received the 2011 IUT Distinguished Researcher award.


APPENDIX A

In this appendix, we prove Lemma 1. Following $P^{GT}$, if the all-C profile is the BR, then according to (12) and (29), we have:

$$u^{k,[t_D,d_k]}_{\mathrm{MIMO}}(D, P^{GT}) \le u^{k,[t_D,d_k]}_{\mathrm{MIMO}}(\text{all-C}).$$

Then,

$$\big(q_k^{t_D} - y_k^{t_D}(\gamma_k - \theta_j)\big) + \sum_{t=t_D+1}^{d_k} \Big[ \big(q_k^{t}(G_i \setminus \{j\}) - y_k^{t}(\gamma_k - \theta_j)\big)\, p^{t}_{kj} + u^{k,t}_{\mathrm{MIMO}}\,(1 - p^{t}_{kj}) \Big] \le \big(q_k^{t_D} - y_k^{t_D}(\gamma_k)\big) + \sum_{t=t_D+1}^{d_k} \Big[ u^{k,t}_{\mathrm{MIMO}}(\text{all-C})\, p^{t}_{kj} + u^{k,t}_{\mathrm{MIMO}}\,(1 - p^{t}_{kj}) \Big].$$

By simplification:

$$y_k^{t_D}(\gamma_k) - y_k^{t_D}(\gamma_k - \theta_j) \le \sum_{t=t_D+1}^{d_k} \Big[ u^{k,t}_{\mathrm{MIMO}}(\text{all-C}) - \big(q_k^{t}(G_i \setminus \{j\}) - y_k^{t}(\gamma_k - \theta_j)\big) \Big]\, p^{t}_{kj}.$$

Following the merge-and-split algorithm in the coalition formation game, it should hold that $u^{k,t}_{\mathrm{MIMO}}(\text{all-C}) \ge q_k^{t}(G_i \setminus \{j\}) - y_k^{t}(\gamma_k - \theta_j)$. The reason is that the members of the OCVM coalition accept a new MT in their coalition only if it does not decrease the payoff of any other member. Thus, both sides of the inequality are positive; and according to (10) and (11), we have:

$$y_k^{t_D}(\theta_j) \le \sum_{t=t_D+1}^{d_k} \Big[ \beta_k \big(R^{G_i}_{\mathrm{MIMO}} - R^{G_i \setminus \{j\}}_{\mathrm{MIMO}}\big)\, \theta_k^{t} + y_k^{t}(\theta_j) \Big]\, p^{t}_{kj}.$$

Since all of the above discussion is reversible, we can say that if the beliefs of MTs satisfy the above inequality, then the all-C profile is the BR. By the assumption that the coalitions' members do not change during each time slot, at the end of each time slot MT$_k$ updates its belief about the presence of others in the next time slots. Moreover, if we assume that the beliefs of MT$_k$ are equal, then we have $p^{t_D+1}_{kj} = \dots = p^{t^d_k}_{kj}$, $\forall (MT_k, MT_j) \in G_i$, $\forall t_D \in [t^a_k, t^d_k]$, which is updated at the beginning of each time slot $t^d_k$; consequently:

$$\frac{y_k^{t_D}(\theta_j)}{\sum_{t=t_D+1}^{d_k} \Big[ \beta_k \big(R^{G_i}_{\mathrm{MIMO}} - R^{G_i \setminus \{j\}}_{\mathrm{MIMO}}\big)\, \theta_k^{t} + y_k^{t}(\theta_j) \Big]} \le p^{t_D+1}_{kj}.$$

APPENDIX B

In this appendix, we prove the truthful policy in Theorem 1. At $t = d_k$, MT$_k$ misreports its type in order not to be selected as the source in the last stage of FDRG, and after that does not relay other sources during the remaining FDRG stages. The different potential cases of MT$_k$ in the scheduling sequences, if it misreports a late departure, are depicted in Fig. 16 (a). In this figure, $\theta_k^{d_k}$ is the duration in which MT$_k$ gets the net benefit. Before this time, MT$_k$ should give payment to other sources (i.e., $y_k$). Since $\chi_{BS}$ uniformly permutes the sequence of the sources at each time slot, MT$_k$ will calculate its expected payoff over all potential cases in the scheduling sequences.

Fig. 16. Scheduling of the MT$_k$ at the last time slot of its presence under $\chi_{BS}$.

Let $I_n$ denote the set of all possible subsets of MTs that are selected as sources before the $n$th stage of FDRG, when MT$_k$ is selected as the source in the $n$th stage of FDRG. Thus, we have $I_1 = \{\}$ and $I_n = \{\forall J \subseteq G_i \setminus \{k\}, |J| = n - 1\}$. The number of cases in which MT$_k$ is selected as the source in the $n$th stage of FDRG and a given subset $J \in I_n$ is scheduled before it is equal to $(n-1)!\,(|I(d_k)| - n)!$. The total number of scheduling sequences is $|I(d_k)|!$. Therefore, following (12), we have:

$$u^{k,d_k}_{\mathrm{MIMO}}\big(\chi_{BS}(\hat{\tau}_k^{+d}, \acute{\tau}_{-k})\big) = \sum_{n=1}^{I(d_k)-1} \sum_{J \in I_n} \Big( q_k^{t}(\theta_k^{d_k}) - y_k^{t}\big(\textstyle\sum_{i \in J} \theta_i^{d_k}\big) \Big) \frac{(n-1)!\,(|I(d_k)| - n)!}{|I(d_k)|!}.$$

Following the TPT, to prevent MT$_k$ from misreporting $\tau_k^{+d}$, we decrease its payment during $d_k$. This is illustrated in Fig. 16 (b). According to this figure, the payment reduction makes MT$_k$ relay other sources for a duration less than $(1 - \theta_k^{d_k})T$. Consequently, the following should hold at time slot $t = d_k$, given the number of MTs $I(d_k)$ and $\forall \acute{\tau}_{-k}$:

$$u^{k,d_k}_{\mathrm{MIMO}}\big(\chi_{BS}(\tau_k, \acute{\tau}_{-k})\big) \ge u^{k,d_k}_{\mathrm{MIMO}}\big(\chi_{BS}(\hat{\tau}_k^{+d}, \acute{\tau}_{-k})\big).$$

Therefore, we have

$$q_k^{t}(\theta_k^{d_k}) - y_k^{t}(\gamma_k^{d_k}) \ge \sum_{n=1}^{I(d_k)-1} \sum_{J \in I_n} \Big( q_k^{t}(\theta_k^{d_k}) - y_k^{t}\big(\textstyle\sum_{i \in J} \theta_i^{d_k}\big) \Big) \frac{(n-1)!\,(|I(d_k)| - n)!}{|I(d_k)|!}.$$

Then,

$$y_k^{t}(\gamma_k^{d_k}) \le q_k^{t}(\theta_k^{d_k}) - \sum_{n=1}^{I(d_k)-1} \sum_{J \in I_n} \Big( q_k^{t}(\theta_k^{d_k}) - y_k^{t}\big(\textstyle\sum_{i \in J} \theta_i^{d_k}\big) \Big) \frac{(n-1)!\,(|I(d_k)| - n)!}{|I(d_k)|!}.$$

Thus, according to (11), we have:

$$\max(\gamma_k^{d_k}) = \frac{q_k^{t}(\theta_k^{d_k})}{\alpha_k \big( \sum_{z \in X(G_i)} P^{z}_{Tx,C} + \sum_{z \in X(Q)} P^{z}_{Rx,D} \big) T} - \frac{\sum_{n=1}^{I(d_k)-1} \sum_{J \in I_n} \Big( q_k^{t}(\theta_k^{d_k}) - y_k^{t}\big(\sum_{i \in J} \theta_i^{d_k}\big) \Big) \frac{(n-1)!\,(|I(d_k)| - n)!}{|I(d_k)|!}}{\alpha_k \big( \sum_{z \in X(G_i)} P^{z}_{Tx,C} + \sum_{z \in X(Q)} P^{z}_{Rx,D} \big) T}.$$
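The counting argument used in Appendix B can be checked numerically. The sketch below is only an illustration: it assumes a small hypothetical coalition of four MTs (the labels `MT1` to `MT4` and the helper names `count_sequences` and `closed_form` are not from the paper), enumerates every scheduling sequence by brute force, and compares the number of sequences in which a given subset $J$ precedes MT$_k$ with the closed form $(n-1)!\,(|I(d_k)| - n)!$.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def count_sequences(members, k, predecessors):
    """Count orderings of `members` in which exactly the MTs in
    `predecessors` (in any order) are scheduled before MT `k`."""
    target = frozenset(predecessors)
    count = 0
    for order in permutations(members):
        position = order.index(k)
        if frozenset(order[:position]) == target:
            count += 1
    return count


def closed_form(m, n):
    """(n - 1)! (m - n)!: orderings with k in stage n and a fixed
    predecessor set, for m scheduled MTs in total."""
    return factorial(n - 1) * factorial(m - n)


members = ["MT1", "MT2", "MT3", "MT4"]  # hypothetical coalition
k = "MT1"
m = len(members)
others = [mt for mt in members if mt != k]

total_weight = Fraction(0)
for n in range(1, m + 1):                      # stage in which k is the source
    for subset in combinations(others, n - 1):  # candidate set J with |J| = n - 1
        brute_force = count_sequences(members, k, subset)
        assert brute_force == closed_form(m, n)
        total_weight += Fraction(brute_force, factorial(m))

# Summed over all stages n = 1..m and all subsets J, the weights
# (n - 1)! (m - n)! / m! add up to one.
print(total_weight)
```

Because the BS permutes the sources uniformly, each weight $(n-1)!\,(|I(d_k)| - n)!/|I(d_k)|!$ is the probability of one particular pair (stage, predecessor set). Note that this check sums over all stages $n = 1, \dots, m$, whereas the sums in Appendix B stop at $I(d_k) - 1$.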
