Efficient group multicast node scheduling schemes in multi-hop wireless networks

Communicating computing: Distributed mobile cloud computing over 5G heterogenous networks Robust beamforming for secure communication in systems with wireless power transfer
Dr.TuvaMansson Profile Pic
Published Date:21-12-2017
Your Website URL(Optional)
1 Joint Offloading and Computing Optimization in Wireless Powered Mobile-Edge Computing Systems Feng Wang, Member, IEEE, Jie Xu, Member, IEEE, Xin Wang, Senior Member, IEEE, and Shuguang Cui, Fellow, IEEE Abstract—Mobile-edge computing (MEC) and wireless power provide real-time machine-to-machine and machine-to-human transfer (WPT) have been recognized as promising techniques interactions 2. These emerging latency-sensitive applications in the Internet of Things (IoT) era to provide massive low- critically rely on the real-time communication and computa- power wireless devices with enhanced computation capability tion of massive wireless devices. For example, smart wireless and sustainable energy supply. In this paper, we propose a sensors in IoT networks may need to perceive the physical unified MEC-WPT design by considering a wireless powered multiuser MEC system, where a multi-antenna access point (AP) environment and then use the built-in computation resources (integrated with an MEC server) broadcasts wireless power to to preprocess the sensed data in real time before sending it to charge multiple users and each user node relies on the harvested theaccesspoint(AP)2.Asextensiveexistingworksfocuson energy to execute computation tasks. With MEC, these users can improving their communication performance, how to provide execute their respective tasks locally by themselves or offload all thesedeviceswithenhancedcomputationcapabilityisacrucial orpartofthemtotheAPbasedonatimedivisionmultipleaccess (TDMA) protocol. Building on the proposed model, we develop yet challenging task to be tackled, especially when they are of an innovative framework to improve the MEC performance, by small size and low power. To resolve this issue, mobile-edge jointly optimizing the energy transmit beamforming at the AP, computing (MEC) has emerged as a promising technique by thecentralprocessingunit(CPU)frequenciesandthenumbersof providingcloud-likecomputingattheedgeofmobilenetworks offloaded bits at the users, as well as the time allocation among via integrated MEC servers at wireless APs and base stations users. Under this framework, we address a practical scenario where latency-limited computation is required. In this case, we (BSs) 3, 4. Leveraging MEC, resource-limited wireless develop an optimalresourceallocation scheme thatminimizesthe devices can offload their computation tasks to APs/BSs; then AP’s total energy consumption subject to the users’ individual the integrated MEC servers can compute these tasks remotely. computation latency constraints. Leveraging the state-of-the-art In general, the computation offloading can be implemented optimizationtechniques,wederive theoptimalsolutioninasemi- in two ways, namely binary and partial offloading 3. In closed form. Numerical results demonstrate the merits of the proposed design over alternative benchmark schemes. the binary offloading case, the computation task is not par- titionable and should be offloaded as a whole. In the partial Index Terms—Mobile-edgecomputing,wirelesspowertransfer, offloading case, the task can be partitioned into two parts, and computation offloading, energy beamforming, convex optimiza- tion. only one of them is offloaded. The MEC technique facilitates the real-time implementation of computation-extensive tasks at massive low-power devices, and thus has attracted growing I. INTRODUCTION research interests in both academia and industry 3–7. The recent advancement of Internet of Things (IoT) has On the other hand, how to provide sustainable and cost- motivated various new applications (e.g., autonomous driv- effective energy supply to massive computation-heavydevices ing, virtual reality, augmented reality, and tele-surgery) to is another challenge facing IoT. Radio-frequency (RF) signal based wireless power transfer (WPT) provides a viable solu- Manuscript received May 26, 2017; revised October 16, 2017; ac- cepted December 12, 2017. This work was in part supported by the tion by deploying dedicated energy transmitters to broadcast National Key Research and Development Program of China grant no. energy wirelessly 18. Recently, emerging wireless powered 2017YFB0403402 and the National Natural Science Foundation of China communicationnetworks (WPCNs) and simultaneouswireless grant no. 61671154, by DoDwith grant HDTRA1-13-1-0029, bygrant NSFC- 61328102/61629101, by Shenzhen Fundamental Research Fund under Grant informationand powertransfer(SWIPT) paradigmshave been No. KQTD2015033114415450, and by NSFwith grants DMS-1622433, AST- proposed to achieve ubiquitous wireless communications in 1547436, ECCS-1508051/1659025, and CNS-1343155. Part of this paper was a self-sustainable way 12–15, where WPT and wireless presented in the IEEE International Conference on Communications, Paris, France, May 21–25, 2017 1. The associate editor coordinating the review of communications are combined into a joint design. In order to this paper and approving it for publication was T. Melodia. (Corresponding improvetheWPTefficiencyfromtheenergytransmittertoone author: Jie Xu.) or more energy receivers, transmit energy beamforming has F. Wang and J. Xu are with the School of Information Engineering, Guangdong University of Technology, Guangzhou, China (e-mail: feng- been proposed as a promising solution by deploying multiple wang13gdut.edu.cn, jiexugdut.edu.cn). antennas at energy transmitters 9. By properly adjusting X.WangiswiththeKeyLaboratory forInformation Science ofElectromag- the transmit beamforming vectors, energy transmitters can netic Waves (MoE), the Shanghai Institute for Advanced Communication and Data Science, the Department of Communication Science and Engineering, concentratethe radiativeenergytowardstheintendedreceivers Fudan University, Shanghai, China (e-mail: xwang11fudan.edu.cn). for efficient WPT. Motivated by these approaches, it is ex- S. Cui is with the Department of Electrical and Computer Engineering, pected that the transmit energy beamforming-enabled WPT University of California, Davis, CA, 95616, USA, and also affiliated with Shenzhen Research Institute of Big Data (e-mail: sgcuiucdavis.edu). can also play an important role in facilitating self-sustainable arXiv:1702.00606v4 cs.IT 18 Dec 20172 computing for a large number of IoT devices. A. Related Works To explore benefits of both MEC and WPT in ubiquitous Transmit energy beamforming enabled WPT has been ex- computing, this paper develops a joint MEC-WPT design tensively studied in the literature (see, e.g., 9–24 and by considering a wireless powered multiuser MEC system references therein). By considering a linear energy harvesting that consists of a multi-antenna AP and multiple single- (EH) model, variousprior works have investigatedthe optimal antenna users. The AP employs energy transmit beamforming design of energy beamforming under different setups with to simultaneously charge the users, and each user relies on SWIPT, e.g., in two-user multi-input multi-output (MIMO) its harvested energy to execute the respective computation systems 9, secrecy communications systems 10, multi- task. Suppose that partial offloading is allowed such that each input single-output (MISO) interference channels 11, and user can arbitrarily partition the computation task into two multiuser MISO downlink channels 14–16. Furthermore, independent parts for local computing and offloading, respec- some recent works investigated the transmit power allocation tively. Furthermore, we assume that the downlink WPT and 23 and the transmit waveform optimization 24 for WPT the uplink wireless communication (for computation offload- by taking into account the nonlinear nature of the rectifier in ing) are operated simultaneously over orthogonal frequency EH 20, 21. In addition, the benefit of energy beamforming 1 bands. In addition, a time division multiple access (TDMA) crucially relies on the channel state information (CSI) known protocol is employed to coordinate computation offloading, at the transmitter. The reverse-link channel training 17 and where different users offload their respective tasks to the AP the energy measurement and feedback methods 18, 19 over orthogonal time slots. The main results of this paper are were proposed in WPT systems for the energy transmitter to summarized as follows. practicallylearn theCSI to users. Furthermore,22developed a distributed energy beamforming system for multiple energy • To improve the performance of such a wireless powered transmitters to charge multiple energy receivers simultane- multiuser MEC system, we develop an innovative design ously, with the help of the energy measurement and feedback. framework by jointly optimizing the energy transmit On the other hand, several existing works 25–33 in- beamformingat theAP, thecentralprocessingunit(CPU) 2 vestigated the energy-efficient multiuser MEC design, where frequencies and the numbers of offloaded bits at the each user is powered by fixed energy sources such as battery, users, as well as the offloading time allocation among and the objective is to minimize the energy consumption at users. Note that the number of offloaded bits at each user the users via joint computing and offloading optimization at corresponds to the multiplication of the offloading rate the demand side. For example, 25 provided an overview and allocated offloading time in this block. on the applications and challenges of computation offloading. • Targeting an energy-efficient wireless-powered MEC de- 26 and 27 investigated the dynamic offloading for MEC sign, we minimize the AP’s total energy consumption systems based on the techniques of Markov decision process subject to the users’ individual computation latency andLyapunovoptimization,respectively.28,29considered constraints. Leveraging the state-of-the-art optimization the joint computation and communication resource allocation techniques, we obtain the optimal solution in a semi- in single-user MEC systems, and such designs were extended closed form. It is revealed that at the optimal solution, to multiuser MEC systems in 30–33. Different from these the number of locally computed bits at each user should prior works that studied WPT and MEC separately, this paper be strictly positive; i.e., it is always beneficial for each pursues a joint MEC-WPT design in a wireless powered user to leave certain bits for local computing. It is also multiuser MEC system, by jointly optimizingthe WPT supply shown that the optimal offloading rate (and equivalent at the AP, as well as the local computing and offloading transmit power) at each user critically depends on the demands at the users. channel power gain and the circuit power. It is worth noting that a prior work 34 considered the • Extensive numerical results are provided to gauge the wireless powered single-user MEC system with binary of- performance of the proposed designs with joint WPT, floading, where the user aims to maximize the probability of local computing, and offloading (i.e., task partition per successful computation, by deciding whether a task should user and offloading time allocation among the users) be fully offloaded or not, subject to the computation latency optimization, over benchmark schemes without such a constraint. By contrast, this paper considers a more general joint consideration. It is shown that the proposed design case with more than one user, and allows for more flexible can significantly reduce the energy consumption of the partial offloading to improve the system performance in terms wireless powered MEC systems. of the energy efficiency (i.e., minimizing the total energy consumptionat the AP includingthe radiated energy for WPT 1 and the energy for computing the offloaded tasks). The wireless energy harvesting in the downlink and the information transmission (or offloading) in the uplink can be performed simultaneously The remainder of the paper is organizedas follows. Section over orthogonal frequency bands in one single antenna with a duplexer, as II presents the system model. Section III formulates the com- commonly used in conventional frequency-division-duplexing (FDD) wireless communication transceivers. putationlatencyconstrainedenergyconsumptionminimization 2 The term CPU generally refers to the processing unit and control unit at problem, and develops an efficient algorithm to obtain a well- each user that takes charge of the local computing of computation tasks. The structured optimal solution. Section IV provides numerical CPU frequency, i.e., the frequency of the CPU’s clock pulses, determines the rate at which a CPU executes instructions. results to demonstrate the merits of the proposed design.… … 3 WPT, the computationoffloading,and the local computingfor the K users. User 1 IoT sensor h 1 g 1 Duplexer input bits A. Energy Transmit Beamforming from AP to Users Information h i RF Offloading transceiver signal N×1 Energy g Let s ∈ C denote the energy-bearing transmit signal i DC Rechargeable Rectifier Task signal battery by the AP, which is assumed to be a random signal with Energy its power spectral density satisfying certain regulations on g Local computing K User i input bits h K AP with an MEC server H RF radiation 18. Let Q , Ess  0 denote the energy integrated 2 transmit covariance matrix and Eksk = tr(Q) the transmit User K power at the AP. In general, the AP can use multiple energy IoT sensor beamstodeliverthewirelessenergy,i.e.,Qcanbeofanyrank. Downlink WPT from the AP to the K users Supposing r = rank(Q)≤ N, then a total of r energy beams Computation offloading based on TDMA can be obtained via the eigenvalue decomposition (EVD) of N×1 Q 18. Let h ∈ C denote the channel vector from i Fig. 1. Awireless powered multiuser MECsystem with WPT inthe downlink H and computation offloading in the uplink. the AP to user i ∈ K, and define H , hh , ∀i ∈ K. i i i Accordingly, the received RF power at each user i ∈K is H 2 given byh s . In order to harvest such energy, each user i Finally, Section V concludes this paper. i first converts the received RF signal into a direct current Notations: Boldface letters refer to vectors (lower case) or (DC) signal by a rectifier and then stores the energy of the matrices (uppercase). For a square matrixS, tr(S) denotes its DC signal in its chargeable battery (cf. Fig. 1). Note that the trace, while S 0 means that S is positive semidefinite. For harvested DC power is generally nonlinear with respect to the † H an arbitrary-size matrix M, rank(M), M , and M denote received RF power 20, due to the nonlinear devices such its rank, transpose, and conjugate transpose, respectively. I as the diodes and diode-connected transistors. Moreover, the and 0 denote an identity matrix and an all-zero vector/matrix, nonlinear RF-to-DC conversion greatly depends on the input x×y respectively, with appropriate dimensions. C denotes the power level and the transmit waveform. In the literature, there space of x× y complex matrices; R denotes the set of real have been a handful of recent works on analytic nonlinear EH numbers.E· denotes the statistical expectation.kxk denotes models, which characterize such nonlinear relations between the Euclidean norm of a vector x,z denotes the magnitude the harvested DC power and the input RF power 23 or + of a complex number z, andx , max(x,0). transmit waveform 24. However, there still lacks a generic EH modelthat capturesall practical issues 21.Therefore,for simplicity, we assume that the input RF power is within the II. SYSTEM MODEL linear regime of the rectifier, and consider a linear EH model As shown in Fig. 1, we consider a wireless powered mul- which has been commonly adopted in the WPT literature 9– tiuser MEC system consisting of an N-antenna AP (integrated 12, 14–22. Accordingly, the harvested energy amount by with an MEC server) and a setK , 1,...,K of single- user i over this time block is antennausers. Inthis system, the AP employsRF signal based h i 2 H energytransmit beamformingto chargethe K users. Each user E =TζE h s =Tζtr(QH), (1) i i i i∈K utilizes the harvested energy to execute its computation where 0 ζ≤ 1 denotes the constant EH efficiency per user. task through local computing and offloading. Suppose that the TheharvestedenergyE is usedbyuseri forbothcomputation i downlinkWPTfromtheAPtotheusersandtheuplinkcompu- offloading and local computing. tation offloading are operated simultaneously over orthogonal frequency bands, and the uplink for computation offloading and the downlink for computation result downloading are B. Energy Consumption at Users for Computation operatedoverthesamefrequencyband.Assumeablock-based For each user i ∈K, the computation task with R 0 i model, and we focus on one particular block with length T. computationinputbits is partitionedinto two parts with ℓ ≥ 0 i Here, T is chosen to be no larger than the latency of the and q ≥ 0 bits, which are offloaded to the MEC server at the i MECapplicationandalsonolargerthanthechannelcoherence 4 AP or locally computed, respectively. We assume that such a time,such thatthe wireless channelsremainunchangedduring partition does not incur additional computation input bits, i.e., this block. For simplifying the analysis and better capturing R = ℓ +q , ∀i∈K. i i i the AP’s transmission energy for computation offloading, we 1) Computation Offloading from Users to the AP: In order assume that the AP perfectly knows the CSI from/to the K for the K users to offload their respective bits to the AP for 3 users, as well as their computation requirements. In accor- computation,we adopt a TDMA protocol without interference dancewith such information,the AP coordinatesthe downlink as shown in Fig. 2, where the block is divided into 2K time slots each with a length of t , ∀i∈1,...,2K. In the first K i 3 Whenthe CSIatthe APisnotperfect (e.g.,subject tosomeCSIestimation time slots, the K usersoffloadtheircomputationbits to the AP errors), the WPT and MEC performance may degrade. In this case, robust optimization techniques (see, e.g., 10, 15) may be applied to obtain the 4 energy beamforming vectors. However, the imperfect CSI scenario is out of Each input bit can be treated as the smallest task unit, which includes the scope of this paper. needed program codes and input parameters.4 T (e.g., for CSI feedback), and there generally exists a tradeoff Downloading between such energy consumption at the users versus the CSI Downloading Offloading from Offloading from … … from the AP from the AP user 1 to the AP user K to the AP accuracyattheAP.However,withthetechnicaladvancements, to user 1 to user K the user’s feedback overhead for CSI acquisition could be t t ≈ 0 t ≈ 0 K+1 2K 1 t K made very small. Specifically, there are generally three types of CSI acquisition methods in the literature, namely the chan- Fig. 2. The TDMA protocol for multiuser computation offloading. nel estimation and feedback 8, reverse-link training based on the channel reciprocity 17, and energy measurement and one by one. After the MEC server executes the computation feedback 18, 19. In the energy measurement and feedback tasks on behalf of these users, the AP sends the computation method 18, each user only needs to measure its harvested results to the K users in the last K time slots. Due to the energy level over each block and send one feedback bit to the sufficient CPU capability at the MEC server, the computation AP per block; based on the feedback bits, the AP can sequen- time consumed at the MEC server are relatively small and tially improve the accuracy of CSI estimation; such a one-bit negligible. Therefore, we assume that the users can download feedback is negligible when compared to the user reverse-link thecomputationresultsimmediatelyafterthefirst K offloading traffic for task offloading. Thus it is practically reasonable to timeslots.Furthermore,astheAPisusuallywithhightransmit ignore the feedback overheadand energyconsumptionat each power and the computed results are usually of small size, we user. ignorethe downloadingtime, i.e., t ≈ 0,∀i∈K+1,...,2K, i As for the AP, the energyis mainly consumedfor executing and also ignore the energy consumption for transmitting and the offloadedcomputationtasks and transmitting the computa- receiving the computation results in this paper. tion results back to the users 4. As the AP and its integrated N×1 For computation offloading in time slot i, let g ∈ C i MEC server generally have sufficient communication and denote the uplink channel vector from user i to the AP and 6 computation capacities , it can adopt a large transmit power p the transmit power for offloading. Assume further that the i (accordingly high communication rate) and a high constant AP employs the maximum ratio combining (MRC) receiver CPU frequency to minimize the latency. In this case, the to decode the information. The achievable offloading rate (in AP’s energy consumption is generally proportional to the Í bits/sec) for user i is given by K totally offloaded bits ℓ from the K users. Therefore, we i i=1   adopt a simplified linear energy consumption model for the p g˜ i i r = Blog 1+ , ∀i∈K, (2) i 2 2 computation at the AP as Γσ 2 where B denotes the spectrum bandwidth, g˜ ,kgk denotes K i i Õ 2 the effective channel power gain from user i to the AP, σ E = α ℓ, (5) MEC i is the noise power at the receiver of the AP, and Γ≥ 1 is a i=1 constant accountingfor the gap from the channel capacity due where α denotes the energy consumption per offloaded bit at to a practical coding and modulation scheme. For simplicity, the AP. In practice, α depends on the transceiver structure of Γ = 1 is assumed throughout this paper. As a result, the the AP, the chip structure of the MEC server, and its operated number of offloaded bits ℓ by user i to the AP can be i CPU frequencies, etc. 4. expressed as 2) Local Computing at Users: Consider next the local ℓ = r t, ∀i∈K. (3) i i i computing for executing q input bits at each user i∈K. Let i C denote the number of CPU cycles required for computing Computation offloading incurs energy consumption at both i one input bit at user i. Then the total number of CPU cycles the K users and the AP. Per user i ∈ K, in addition to required for the q bits is C q . By applying dynamic voltage the transmit power p , a constant circuit power p (by the i i i i c,i and frequency scaling (DVFS) techniques 3, 4, user i can digital-to-analog converter (DAC), filter, etc.) is consumed. control the energy consumption for local task execution by The offloading energy consumption at user i is then E = offl,i adjusting the CPU frequency f for each cycle n, where (p + p )t . With simple manipulations based on (2) and (3), i,n i c,i i   max max 1 ℓ i f ∈(0, f , n∈1,...,C q, and f denotes user i’s i,n i i the transmit power p can be expressed as p = β , i i i i g˜ t i i 7 x maximum CPU frequency. With f ’s, the executiontime for i,n 2 B Í where β(x) , σ(2 − 1) is a monotonically increasing and C q i i 1 local computing at user i is . As each user i∈K 5 f n=1 i,n convex function with respect to x. Hence, the offloading energy consumption at user i is   6 In the case when the MEC server’s computing capacity is limited, the t ℓ i i computation offloading protocol needstoberedesigned, bytaking intoaccount E = β + p t . (4) offl,i c,i i g˜ t the computation time at the MEC server as well as the computation resource i i sharing among these different users. Under such a scenario, how to jointly Remark 2.1: Note that in practice, in order for the AP to design the WPT and MEC optimally is out of the scope of this paper. It is an interesting direction to pursue in the future work. acquire the CSI (to the K users) for the energy beamforming 7 Note that in practice, each CPU frequency f can only be an integer i,n design,eachuserneedstoconsumeacertainamountofenergy chosen from a finite set. However, such an integer constraint may make the   design problem a mixed-integer one that is NP-hard in general. To avoid this, ℓ 5 i Note that to avoid the issue of dividing by zero, we define β = 0 we model f as continuous variables to provide a performance upper-bound t i,n i when either ℓ = 0 or t = 0 holds. for the practical cases with discrete CPU frequencies. i i5 † needs to accomplish the task execution within a block, the f , f ,..., f . Mathematically, the latency- 1,1 K,C (R −ℓ ) K K K execution time cannot exceed the block length T, i.e., constrained energy minimization problem is formulated as K Õ C q i i Õ 1 (P1) : min Ttr(Q)+ αℓ (9a) i ≤ T, ∀i∈K. (6) Q0,t,ℓ,f f i,n i=1 n=1 C(R−ℓ) i i i Õ 1 Under the assumption of a low CPU voltage that normally s.t. ≤ T, ∀i∈K (9b) f i,n holds for low-power devices, the consumed energy for local n=1   C(R−ℓ) computing at user i∈K could be expressed as 35 i i i Õ t ℓ i i 2 κ f + β + p t −Tζtr(QH)≤ 0, i c,i i i i,n C q g˜ t i i i i Õ n=1 2 E = κ f , (7) loc,i i i,n ∀i∈K (9c) n=1 K Õ t ≤ T, t ≥ 0, 0≤ ℓ ≤ R, ∀i∈K (9d) where κ is the effective capacitance coefficient that depends i i i i i i=1 on the chip architecture at user i. max 0≤ f ≤ f , ∀n, ∀i∈K. (9e) i,n i Here, the constraints in (9b) and (9c) represent the K users’ C. Energy Harvesting Constraints at Users individuallocal computinglatency and energyharvestingcon- As each user i ∈ K is powered by the WPT from the straints, respectively. Note that due to the non-convex nature AP to achieve self-sustainable operation, the so-called energy of (9b) and (9c), problem(P1) is non-convex in the current harvesting constraint needs to be imposed such that the totally form. However, we can transform it into a convex form and consumed energy at the user cannot exceed the harvested find the well-structured optimal solution, as will be shown in energy E in (1) per block. By combining the computation i the next subsection. offloading energy in (4) and the local computation energy in (7), the total energy consumed by user i within the block is B. Optimal Solution to Problem(P1) 8 E +E . Therefore, we must have per user i∈K: offl,i loc,i In this subsection, we provide the optimal solution to the computationlatencyconstrainedenergyminimizationproblem E + E ≤ E . (8) loc,i offl,i i (P1).Tocopewiththenon-convexconstraintsin(9b)and(9c), we first establish the following lemma. III. COMPUTATION LATENCY CONSTRAINED ENERGY Lemma 3.1: Given the number of offloaded bits ℓ, the MINIMIZATION optimalsolutionofthelocalCPUfrequencies f ’stoproblem i,n (P1) should satisfy A. Problem Formulation f = ...= f = C(R−ℓ)/T, ∀i∈K. (10) i,1 i i i i,C(R−ℓ) i i i Underthe abovesetup, we pursuean energy-efficientMEC- WPT designbyconsideringa computationlatencyconstrained Proof: See Appendix A. energy minimization problem. Suppose that each user i∈K Lemma 3.1 indicates that at each user i ∈ K, the local has a computation task with R 0 input bits, which needs i CPU frequencies for different CPU cycles are identical at to be successfully executed before the end of the block. In the optimality. Hence, problem (P1) can be equivalently this case, the sum of the number of offloaded bits ℓ and the i reformulated as number of locally computedbits q should be equal to R , i.e., i i K Õ we have q = R−ℓ , ∀i∈K. i i i (P1.1) : min Ttr(Q)+ αℓ (11a) i Q0,t,ℓ We aim to minimize the energy consumption at the AP i=1 Í K K (including the energy consumption αℓ in (5) for com- i Õ i=1 putation and Ttr(Q) for WPT) while ensuring the success- s.t. t ≤ T (11b) i i=1 ful execution of the K users’ computation tasks per time   3 3 block. To this end, we jointly optimize the energy transmit κ C(R−ℓ) i i i t ℓ i i i + β + p t −Tζtr(QH)≤ 0 c,i i i covariance matrix Q at the AP, the local CPU frequencies 2 T g˜ t i i f ,..., f , and the numbers of offloaded bits ℓ ’s i,1 i,C(R−ℓ) i i i i ∀i∈K (11c) at the users, as well as the time allocation t ’s among dif- i † † 0≤ ℓ ≤ R, t ≥ 0, ∀i∈K. (11d) i i i ferent users. Let t , t ,...,t , ℓ , ℓ ,...,ℓ , and 1 K 1 K As β(x) is convex as a function of x ≥ 0, its perspective   8 Note that in (8) we consider that the totally consumed energy should not t ℓ i i function β is jointly convex with respect to t ≥ 0 and i g˜ t exceed the totally harvested one, instead of the “energy causality” in conven- i i tional energy harvesting communications (see, e.g., 13). This consideration ℓ ≥ 0 37. As a result, the energy harvesting constraints in i implies that at the beginning of the block each user has sufficiently large (11c) become convex. Furthermore, since the objective func- energy storage, such that the stored energy will never be used up at any time tion in (11a) is affine and the other constraints are all convex, within each block and the energy storage level will be refilled via energy harvesting by the end of each block. problem(P1.1) is convex and can thus be optimally solved6 by standard convex optimization techniques. Nevertheless, to where each subproblem i in (16) is for the user i∈K. Under gain engineering insights, we derive its optimal solution in a the condition ofF(λ) 0, it is evident that the optimal value ∗ semi-closed form by leveraging the Lagrange duality method of problem (15) is zero and its optimal solution Q can be 37. any positive semidefinite matrix in the null space of F(λ). ∗ Let µ ≥ 0 and λ ≥ 0 denote the dual variables associated Without loss of optimality, we simply set Q = 0 for the i with the time-allocationconstraint in (11b)and thei-th energy purpose of obtaining the dual functionΦ(λ,µ ) and computing 9 ∗ harvesting constraint in (11c), ∀i∈K, respectively. Then the the optimal dual solution. Note that Q = 0 is generally partial Lagrangian of(P1.1) is expressed as not the optimal primary solution to(P1.1). As a result, after opt opt finding the optimal dual solution(λ , µ ), we need to use K Õ an additional step to retrieve the optimal primary solution of L (Q,t,ℓ,λ,µ )=Ttr I− ζλH Q − µ T 1 i i opt Q to(P1.1), as will be shown in Section IV-C. i=1 Forthei-th subproblemin (16),it is convexandsatisfies the K 3 3 Õ λ κ C(R−ℓ) i i i i i Slater’s condition. Based on the Karush-Kuhn-Tucker (KKT) + αℓ + i 2 T ∗ ∗ conditions 37, one can obtain the optimal solution(t ,ℓ) to i=1 i i    (16) in a semi-closed form, as stated in the following lemma. λ t ℓ i i i + β +λ p t + µ t , (12) i c,i i i Lemma 3.2: For any given(λ,µ )∈S, the optimal solution g˜ t i i ∗ ∗ (t ,ℓ) to problem (16) can be obtained as follows. i † where λ ,λ ,...,λ . Accordingly, the dual function is 1 K ∗ ∗ • If λ = 0, we have ℓ = 0 and t = 0; i i i given by • If λ 0, we have i Φ(λ,µ )= min L(Q,t,ℓ,λ,µ ) (13) " s +   Q0, t, ℓ ∗ 2 2 r T α σ ln2 i ∗ B ℓ = R− + 2 (17) s.t. 0≤ ℓ ≤ R, t ≥ 0, ∀i∈K. i i i i i 3 λ Bg˜ 3κ C i i i i ∗ ∗ ∗ Consequently, the dual problem of(P1.1) is t = ℓ/r , (18) i i i       (D1.1) : max Φ(λ,µ ) (14a) g˜ µ ∗ B i 1 λ, µ where r , W + p − + 1 denotes 0 2 c,i i ln2 σ e λ e i K Õ the offloadingrate of useri, W(x) is the principalbranch 0 s.t. F(λ),I− ζλH  0 (14b) i i of the Lambert W function defined as the solution for i=1 W(x) 0 W(x)e = x 36, and e is the base of the natural 0 µ ≥ 0, λ ≥ 0, ∀i∈K. (14c) i logarithm. Note that the constraint F(λ) 0 is necessary to ensure the Proof: See Appendix C. ∗ dual functionΦ(λ,µ ) to be boundedfrom below (as provedin By combining Lemma 3.2 and Q = 0, the dual function AppendixB).Wedenotethefeasiblesetof(λ,µ )characterized Φ(λ,µ ) can be evaluated for any given(λ,µ )∈S. by (14b) and (14c) asS. opt opt 2) Obtaining the Optimal λ and µ to Maximize Since problem(P1.1) is convex and satisfies the Slater’s ∗ ∗ ∗ Φ(λ,µ ): Having obtained (Q ,ℓ ,t) for given λ and µ , condition, strong duality holds between(P1.1) and its dual we can next solve the dual problem (D1.1) to maximize problem(D1.1) 37. As a result, we can solve(P1.1) by Φ(λ,µ ). Note that the dual function Φ(λ,µ ) is concave but equivalently solving(D1.1). In the following, we first obtain non-differentiable in general 37. Hence, we use subgradient the dual function Φ(λ,µ ) for any given (λ,µ ) ∈ S, and based methods, e.g., the ellipsoid method 38, to obtain the then find the optimal dual variables λ and µ to maximize opt opt optimal λ and µ for problem(D1.1). The basic idea of Φ(λ,µ ) using the ellipsoid method 38. For convenience of the ellipsoid method is to find a series of ellipsoids to localize ∗ ∗ ∗ presentation, let (Q ,t ,ℓ) denote the optimal solution to opt opt the optimal dual solutionλ and µ 38. To start with, we opt opt opt problem (13) for given λ and µ ,(Q ,t ,ℓ ) denote the choose a given(λ,µ )∈S as the center of the initial ellipsoid opt opt optimal primary solution to(P1.1), and(λ ,µ ) denote the opt opt andsetitsvolumetobesufficientlylargetocontain(λ ,µ ). optimal dual solution to(D1.1). Then, at each iteration, we update the dual variables(λ,µ ) 1) Evaluating the Dual FunctionΦ(λ,µ ): First, we obtain based on the subgradients of both the objective function and the dual function Φ(λ,µ ) in (13) for any given(λ,µ )∈S. the constraint functions, and accordingly construct a new To this end, problem (13) can be decomposed into(K + 1) ellipsoid with reduced volume. When the volume of the subproblems as follows, one for optimizing Q and the other ellipsoidisreducedbelowacertainthreshold,theiterationwill K for jointly optimizing t ’s and ℓ ’s. i i terminate and the center of the ellipsoid is chosen to be the opt opt optimal dual solution(λ ,µ ). More details can be referred min tr(QF(λ)) s.t. Q 0. (15) Q to in 38.   3 3 To implement the ellipsoid method, it remains to determine λ κ C(R−ℓ) i i i i λ t ℓ i i i i min αℓ + + β +λ p t + µ t i i c,i i i the subgradients of both the objective function in (14a) and 2 t ,ℓ T g˜ t i i i i (16a) 9 ∗ Note thatQ =0 is not a unique optimal solution to problem (15) when s.t. 0≤ ℓ ≤ R, t ≥ 0, (16b) i i i F(λ) is rank-deficient, i.e., rank(F(λ)) N.7 the constraint functions in (14b) and (14c). For the objective where function Φ(λ,µ ) in (14a), one subgradient is given by 38 opt g˜ 1 " B µ opt i   3 ∗ 3 ∗ ∗ r , W + p − + 1 (25) 0 c,i κ C(R −ℓ) t ℓ 1 1 i opt 2 1 1 1 1 ∗ ln2 σ e e λ + β + p t ,..., c,1 i ∗ 1 2 T g˜ t 1 1 corresponds to the optimal offloading rate for user i, ∀i∈K. †   K 3 ∗ 3 ∗ ∗ Õ κ C (R −ℓ ) t ℓ K K Proposition 3.1 can be verified by simply combining Lem- K K K K ∗ ∗ + β + p t , t −T . c,K K i ∗ 2 mas 3.1 and 3.2; hence, we omit its detailed proof for g˜ t T K K i=1 conciseness. Note that (24) is an instance of SDP, which can (19a) thus be efficiently solved by off-the-shelf solvers, e.g., CVX As for the constraint F(λ)  0 in (14b), we have the 39. following lemma. Summarizing, we present Algorithm 1 to solve the compu- N×1 Lemma 3.3: Let v ∈ C be the eigenvector corre- tation latency constrained energy minimization problem(P1). sponding to the smallest eigenvalue of F(λ), i.e., v = H argmin ξ F(λ)ξ. Then the constraint F(λ)  0 is kξk=1 Algorithm 1 for Solving the Energy Minimization Problem H equivalent to the constraint of v F(λ)v ≥ 0, and the (P1) H subgradient of v F(λ)v at the given λ and µ is 1: Initialization: Given an ellipsoidE((λ,µ ),A) containing   † H H opt opt ζv H v,...,ζv H v,0 . (20) (λ ,µ ), where(λ,µ ) is the center ofE and A≻ 0 1 K characterizes the volume ofE. Proof: See Appendix D. 2: Repeat: Furthermore, the subgradient of λ ≥ 0 in (14c) is given i ∗ ∗ (K+1)×1 • For each user i ∈K, obtain(t ,ℓ) by Lemma 3.2 by the elementary vector e ∈R (i.e., e is of all zero i i i i under given λ and µ ; i entries except for the i-th entry being one),∀i∈K, while that • Compute the subgradients of the objective function of µ ≥ 0 is e . By using this together with (19) and (20), K+1 and the constraints of(D1.1) as in Section III-B.2; the ellipsoidmethodcanbe appliedto efficientlyupdateλ and opt opt • Update λ and µ using the ellipsoid method 38; µ towards the optimal λ and µ for(D1.1). 3) Finding the Optimal Primary Solution to(P1): With 3: Until λ and µ converge within a prescribed accuracy. opt opt opt opt λ and µ obtained, it remains to determine the optimal 4: Set(λ ,µ )←(λ,µ ). opt opt opt opt primarysolutionto(P1.1)(orequivalently(P1)).Specifically, 5: Output: Obtain(t ,ℓ ),f , and compute Q by i,n opt opt by replacing λ and µ with λ and µ in Lemma 3.2, one (24). opt opt can obtain the optimal(t ,ℓ ) for(P1) in a semi-closed opt form. Furthermore, by substituting ℓ in Lemma 3.1, one Remark 3.1: Proposition 3.1 shows that the optimal joint opt can then obtain the optimal local CPU frequenciesf for i,n computing and offloading design has the following interesting the K users. However, we cannot directly obtain the optimal properties to minimize the energy consumption at the AP. opt energy transmit covariance matrix Q for(P1) from the 1) First, if the energy harvesting constraint is not tight for solution to problem (15), since its solution is non-unique in user i (i.e., user i harvests sufficient wireless energy), opt general. Therefore, we adopt an additional step to obtainQ then no computation offloading is required and user i by solving a semidefinite program (SDP), which corresponds opt should compute all the tasks locally (i.e., ℓ = 0). This opt opt i to solving problem(P1.1) for Q under the given(t ,ℓ ). can be explained based on the complementaryslackness We can then readily establish the following proposition. condition 37, i.e., Proposition 3.1: The optimal solution opt opt opt opt opt opt 3 3   (f ,Q ,t ,ℓ ) for problem(P1) is given by κ C(R−ℓ ) t i i opt opt opt i,n i i i λ + β r + p t c,i s i i i " 2 + T g˜   i  opt  r  2 2 i opt T α σ ln2   B R− + 2 , if λ 0, opt i 3 opt Bg˜ i  3κ C λ i ℓ = i i opt i i −Tζtr Q H = 0, ∀i∈K. (26)  i   opt 0, if λ = 0,  i (21) In this case, if the energy harvesting constraint is not ( opt opt opt opt tightforuseri, then basedon(26)we have λ = 0,and ℓ /r , if λ 0, opt i i i i t = (22) opt opt i accordinglyℓ = 0holdsfrom(21).Thispropertyisin- 0, if λ = 0, i i tuitive:whentheuserhassufficientenergytoaccomplish opt opt opt f = ...= f = C(R−ℓ )/T, ∀i∈K, (23) opt i i i,1 i thetaskslocally,thereis noneedtoemploycomputation i,C(R−ℓ ) i i i offloadingthat incurs additional energyconsumptionfor and the MEC server’s computation at the AP. opt Q = argmin Ttr(Q) 2) Next, it is always beneficial to leave some bits for local Q0 opt computing at each user i ∈K, i.e., ℓ R always i opt opt 3 3 i   κ C(R−ℓ ) t i i opt opt i i i holds (see (21)). In other words, offloading all the bits s.t. + β r + p t c,i 2 i i T g˜ i to the AP is always suboptimal. This is because when opt −Tζtr(QH)≤ 0, ∀i∈K, (24) ℓ → R , the marginal energy consumption of local i i i8 computing is almost zero, and thus it is beneficial to 5 10 leave some bits for local computing in this case. Local computing only 3) Furthermore, it is observed that for each user i, more Full offloading only Joint design w/ isotropic WPT stringent the energy harvesting constraint is (or the opt Separate design 4 associated dual variable λ is larger), more bits should 10 i Proposed joint design be offloaded to the AP with a smaller offloading rate opt r . This property follows based on (21) and (25), in i opt opt 3 which a larger λ admits a larger ℓ and a smaller i i 10 opt r . i opt 4) Finally, the number of offloaded bits ℓ and the of- i opt floading rate r for each user i are affected by the 2 i 10 channel gain g˜ , the block length T, the circuit power i p , and the MEC energy consumption α per offloaded c,i bit in the following way: 1) when the channel condition 1 opt 10 becomes better (i.e., g˜ becomes larger), both ℓ and 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 i i opt Time block length, T (sec) r increase,andthususeri islikelytooffloadmorebits i with a higher offloading rate; 2) a higher circuit power Fig. 3. The average energy consumption at the AP versus the block length opt p at the user leads to a higher offloading rate r ; 3) c,i T. i opt when T or α increases, ℓ reduces and thus fewer bits i are offloaded to the AP. MHz. By considering a Rayleigh fading channel model, the wireless channel from the AP to each user i∈K is set as IV. NUMERICAL RESULTS −3 −3 ¯ h = θ d h , g = θ d g ¯ , (27) i 0 i 0 i i i i In this section, numerical results are provided to gauge ¯ where h ∼ CN(0,I) and g ¯ ∼ CN(0,I), ∀i ∈ K, is the performance of the proposed design with joint WPT, i i an independent and identically distributed (i.i.d.) circularly offloading, and computing optimization, as compared to the symmetric complex Gaussian (CSCG) random vector with following four benchmark schemes. −4 zero mean and covariance I; θ = 6.25× 10 (i.e.,−32 0 1) Local computing only: each user i∈K accomplishes its dB) corresponds to the channel power gain at a reference computation task by only local computing. This scheme distance of one meter; d denotes the distance from the AP i corresponds to solving problem(P1) by setting ℓ = 0, i to user i ∈ K; and the path-loss exponent is assumed to ∀i∈K. be 3. The numerical results are obtained by averaging over 2) Computation offloading only: each user i ∈ K ac- 500 randomized channel realizations. Note that the simulation complishes its computation task by fully offloading the parameters are specifically chosen, but our approaches can be computation bits to the AP. This scheme corresponds to also applied to other system setups. solving(P1) by setting f = 0, ∀n, ∀i∈K, as well as i,n ℓ = R for(P1), ∀i∈K. i i 3) Joint design with isotropic WPT: the N-antenna AP ra- A. Case with Homogeneous Users diates the RF energy isotropically or omni-directionally First, we consider the case with homogeneous users, where over all directions by setting Q = pI, where p ≥ 0 the distances from the AP to all the users are identical with denotesthetransmitpowerateachantenna.Thisscheme d = 5 meters, ∀i∈K. The corresponding average power loss i −6 corresponds to solving problem(P1) by replacingQ as issettobe 5×10 (i.e.,−53dB).Additionally,thenumbersof pI with p being another optimization variable. computationbitsatallusersaresettobeidentical,i.e., R= R , i 4) Separate MEC-WPT design: this scheme separately de- ∀i∈K. Figs. 3–6 show the average energy consumption at signs the computation offloading for MEC and the en- the AP under different system parameters. It is observed that ergybeamformingforWPT 21,32.First, the K users the proposed joint design achieves the lowest average energy minimize their sum-energy consumption subject to the consumption at the AP among all the five schemes. The joint users’ individual computation latency constraints 32. designwithisotropicWPT achievesasuboptimalperformance Then, under the constraints of energy demand at the K due to the loss of multi-antenna energy beamforming gain. users, the AP designs the transmit energy beamforming The suboptimal performance of the separate-design scheme with minimum energy consumption 21. implies the necessity of unified demand-supply optimization In the simulations, the EH efficiency is set as ζ = 0.3. in wireless powered MEC systems. The system parameters are set as (unless stated otherwise): Fig. 3 shows the average energy consumption at the AP 3 the number of the AP antennas N = 4, C = 10 cycles/bit, versus the time block length T, where R = 10 kbits and i −28 −4 κ = 10 ,∀i∈K 32, the circuit power p = 10 Watt K = 10. First, with a small value of T (e.g.,T = 0.05 sec), the i c,i (W), the energy consumption per offloaded bit by the MEC benchmark schemes but the local-computing-only scheme are −4 2 server α = 10 Joule/bit, the receiver noise power σ = observed to achieve a near optimal performance close to that −9 10 W, and the spectrum bandwidth for offloading B = 2 withtheproposedjointdesign,whileasT increases,theenergy Average energy consumption (Joule)9 3 3 10 10 Local computing only Full offloading only Joint design w/ isotropic WPT 2 Separate design 10 Proposed joint design 2 1 10 10 Local computing only 0 Full offloading only 10 Joint design w/ isotropic WPT Separate design Proposed joint design 1 -1 10 10 5 10 15 20 25 2 4 6 8 10 12 14 16 18 20 User number, K Number of computation bits per user, R (kbits) Fig. 4. The average energy consumption at the AP versus the user number Fig. 5. The average energy consumption at the AP versus the number of K. computation bits R at each user. consumption with the local-computing-only scheme signifi- 70 cantly decreases, approaching that with the proposed joint Local computing only design. It is also observed that the energy consumption with Full offloading only 60 Joint design w/ isotropic WPT the full-offloading-only scheme remains almost unchanged Separate design 50 when T≥ 0.1 sec. This is due to the fact that in this case, the Proposed joint design optimal offloading time for all users is fixed to be around 0.1 sec for saving the circuit energy consumption; hence, 40 increasing T cannot further improve the energy efficiency in this case. By contrast, the energy consumption with the local-computing-only scheme decreases monotonically as T 30 increases.ThisisbecauseasT increases,onecanalwayslower down the CPU frequency to save energy for local computing. Finally, it is seen in Fig. 3 both the separate-design and the equal-offloading-time-allocation schemes achieve a very 20 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 similar performance in the interested time block regime. Spectrum bandwith for offloading per user, B (MHz) Fig. 4 depicts the average energy consumption versus the user number K, where R = 10 kbits and T = 0.5 sec. It is Fig. 6. The average energy consumption at the AP versus the spectrum shown that the gain achieved by the proposed joint design bandwidth B for offloading. becomes more significant as the user number K becomes large. The full-loading-only scheme outperforms the local- consumption. Furthermore, the full-offloading-only scheme is computing-only scheme, but with a decreasing gain as K observed to achieve a near optimal performance close to that increases. This is because in the full-offloading-only scheme, with the proposed joint design when R becomes large. This all users share the finite time block and the offloading energy is because the energy consumption per bit for offloading is consumption would increase drastically when K becomes significantly smaller than that for local computing in the large large. It is also observed that the performance of the equal- R case. It is also observed that the separate-design scheme offloading-time-allocation scheme becomes closer to that of outperforms all the other benchmark schemes in this setup. the proposed joint design with larger K≥ 15. This indicates that an equal offloading time is desirable for a large number Fig. 6 shows the average energy consumption versus the of the users in order to minimize the energy consumption at spectrum bandwidth B for offloading, where K = 6, T = the AP. 0.5 sec, and R = 50 kbits. As expected, Fig. 6 shows that Fig. 5 shows the average energy consumption at the AP the energy consumption by the four schemes with offloading versus the number of computation bits R at each user, where decreases as B increases, and the one by the local-computing- K = 2 and T = 0.05 sec. It is shown that the average energy only scheme remains unchanged. This indicates that a large consumption by all the six schemes increases as R becomes value of B not only implies a high offloading rate, but also large, and the full-offloading-only scheme outperforms the helps save the energy consumption in computation offloading. local-computing-only one, especially when R becomes large. Itisalsoobservedthatatsmall Bvalues(e.g.,B≤ 3MHz),the Thisindicatesthatwithlarge Rvalues,itisdesirabletooffload local-computing-onlyscheme outperforms the full-offloading- more computationbits to the AP in order to reduce the energy only scheme, but it does not hold for large B cases. This Average energy consumption (Joule) Average energy consumption (Joule) Average energy consumption (Joule)10 TABLE I OFFLOADEDBITS AND RESIDUALENERGY AT USERS FOR THE PROPOSEDOPTIMALJOINT DESIGNUNDER DIFFERENTd . 2 d (meters) 2 3 4 5 6 7 8 2 opt Near user’s offloaded bits ℓ (kbits) 0.165 0.142 0.124 0.092 0.068 0.028 0.006 1 opt Far user’s offloaded bits ℓ (kbits) 1.798 6.974 11.817 13.682 13.585 12.972 12.162 2 −5 Near user’s residual energy (×10 Joule) 0.007 0.026 0.062 0.531 3.276 9.218 21.105 −5 Far user’s residual energy (×10 Joule) 0.003 0 0 0 0 0 0 TABLE II OFFLOADEDBITS AND RESIDUALENERGY AT USERS FOR THE PROPOSEDOPTIMALJOINT DESIGNUNDER DIFFERENTR . 2 R (kbits) 10 20 30 40 2 opt Near user’s offloaded bits ℓ (kbits) 0.825 0.432 0.239 0.282 1 opt Far user’s offloaded bits ℓ (kbits) 3.586 13.62 23.791 33.264 2 −5 Near user’s residual energy (×10 Joule) 0.426 3.317 6.42 9.545 −5 Far user’s residual energy (×10 Joule) 0 0 0 0 d 3 meters. Furthermore, the proposed joint design is 2 80 observed to achieve a significant performance gain over the separate-design one when d 4 meters. 2 Local computing only 70 Full offloading only To providemore insights, Table I demonstratesthe numbers opt Joint design w/ isotropic WPT 60 of offloaded bits ℓ s at both users and their residual energy i Separate design (i.e., E − E − E ) under different values of d for the Proposed joint design i loc,i offl,i 2 50 proposed joint design. It is observed that the far user prefers offloadingsignificantlymorebits thanthenearuser,especially 40 at a larger d . As d increases, the number of offloaded 2 2 opt 30 bits ℓ by the near user decreases significantly, while that 1 opt by the far user (i.e., ℓ ) increases. This result is generally 20 2 consistent with the first property in Remark 3.1. Furthermore, 10 itis observedthattheresidualenergyatthenearuserincreases dramaticallyas d increases, while the far user always uses up 2 0 2 3 4 5 6 7 8 all its energy when d d . This shows that as d increases, 2 1 2 Distance between the AP and the far user, d (meters) the energy consumptionincrease at the AP in Fig. 7 is mainly 2 to satisfy the energy requirement at the far user. In this case, Fig. 7. The average energy consumption at the AP versus the distance d 2 the near user will harvest a lot of energy. from the AP to the far user. Furthermore, we consider the case when the near and far users have distinct computation task sizes. Fig. 8 depicts the indicates that offloading becomes a better option than local- average energy consumption versus the computation task size computing as B increases. R at the far user, where R = 20 kbits and d = 6 meters. It 2 1 2 is observed that the energy consumption by the six schemes B. Case with Heterogeneous Users increases as R increases, and both the local-computing-only 2 and the separate-design schemes lead to much higher energy Next, we evaluate the performance of the wireless powered consumptionthan the other four schemes when R 20 kbits. 2 MEC system in the case with heterogeneous users. For the This is due to the fact that in the local-computing-only and purpose of illustration, we focus on the scenario with only the separate-design schemes, the far user cannot explore the K = 2 users. It is assumed that the distances from the AP to benefit of task offloading for energy saving. Among the five the two users (namely near and far users) are d = 2 meters 1 benchmark schemes, the full-offloading-onlyscheme achieves and d ≥ 2 meters, respectively. The time block length is set 2 the best performance close to the optimal proposed one. as T = 0.2 sec. Table II presents the numbers of offloaded bits at both users Fig. 7 shows the average energy consumption versus the and their residual energy for the proposed joint design. It is distance d fromtheAP to thefaruser,wherethecomputation 2 opt observedthatas R increases,thenumberofoffloadedbits ℓ 2 task sizes for both users are set as R = R = 20 kbits. It 1 2 1 opt by the near user decreases, while ℓ by the far user increases is observed that as d increases, the energy consumption by 2 2 opt significantly. Similarly as in Table I, ℓ is observed to be the six schemes increases significantly, and the proposed joint 2 opt design achieves the lowest energy consumption among them. significantly larger than ℓ . It is also observed that with R 2 1 The local-computing-only scheme is observed to outperform increasing, the residual energy at the near user becomes more the full-offloading-only scheme when d 3 meters, but significant, while that at the far user is zero. 2 performs inferior to the full-offloading-only scheme when Tables I and II show that when the users are heterogeneous Average energy consumption (Joule)11 are convex functions with respect to x 0, based on Jensen’s 200 inequality 37, it follows that Local computing only C(R−ℓ) i i i 175 Full offloading only Õ Joint design w/ isotropic WPT C(R−ℓ)/f ≤ 1/f (28a) i i i i i,n Separate design 150 n=1 Proposed joint design C(R−ℓ) i i i Õ 125 2 2 C(R−ℓ)κ f ≤ κ f , (28b) i i i i i i i,n 100 n=1 where both the equalities hold if and only if 75 f = ...= f , ∀i∈K. (29) i,1 i,C(R−ℓ) i i i 50 As a result, the optimality of problem(P1) is achieved when 25 (29)holds.Therefore,byreplacing f , f ,∀n,problem(P1) i i,n is equivalently expressed as 0 10 15 20 25 30 35 40 K Õ Computation task size of the far user, R (kbits) 2 min Ttr(Q)+ αℓ (30a) i Q0,t,ℓ,f i i=1 Fig. 8. The average energy consumption at the AP versus the computation task size R kbits. s.t. C(R−ℓ)/f ≤ T, ∀i∈K (30b) i i i i 2   t ℓ i i 2 κ C(R−ℓ)f + β + p t i i i i c,i i i g˜ t i i in locations and/or task sizes, even the optimal joint design −Tζtr(QH)≤ 0, ∀i∈K (30c) i still leads to unbalanced energy demand and supply at these K Õ users. Inparticular,the AP needs touse a largetransmit power t ≤ T, t ≥ 0, 0≤ ℓ ≤ R, ∀i∈K. (30d) i i i i to satisfy the high energy demand of users that are far apart i=1 and/or have heavy computation tasks. At the same time, the For a given(t,ℓ), it is evident that the optimal f ’s for (30) nearby users with light computation tasks can accordingly i (equivalent(P1)) should be as small as possible by (30c). harvest more energy and are likely to have energy surplus. Since f is boundedbelowbyC(R−ℓ)/T in(30b),it follows To better balance the energy demand and surplus, it can be i i i i that the optimal f ’s are viable to enable user cooperation between near and far users, i which is an interesting research direction worth pursuing in f = C(R−ℓ)/T, ∀i∈K. (31) i i i i future work. It then readily follows that, at optimum of(P1), f = ... = i,1 f = C(R−ℓ)/T, ∀i∈K. i,C(R−ℓ) i i i i i i V. CONCLUSION We developed a unified MEC-WPT design framework with B. Proof of F(λ) 0 joint energy beamforming, offloading, and computing opti- F(λ) 0 can be verified by contradiction. Assume that mization in emerging wireless powered multiuser MEC sys- F(λ)isnotpositivesemidefinite.Denotebyξ oneeigenvector tems. In particular, we proposed an efficient wireless powered corresponding to the negative eigenvalue of F(λ). By setting H multiuser MEC design by considering the latency-constrained Q= τξξ  0 with τ going to infinity (which is feasible for computation, for which the AP minimizes the total energy (13)), it follows that consumption subject to the users’ individual computation la- H lim tr(QF(λ))= lim τξ F(λ)ξ=−∞, (32) tency constrains. Leveragingthe Lagrange duality method, we τ→+∞ τ→+∞ obtainedtheoptimalsolutioninasemi-closedform.Numerical which in turn implies that the objective value in (13) is results demonstrated the merits of the proposed joint design unbounded below over Q  0. Therefore, in order for the over alternative benchmark schemes. The proposed unified dual function value Φ(λ,µ ) to be bounded below, we need MEC-WPT design can pave the way to facilitate ubiquitous F(λ) 0. computing for IoT devices in a self-sustainable way. C. Proof of Lemma 3.2 APPENDICES Given(λ,µ ) ∈ S, we solve problem (16) for each user i∈K. When λ = 0, the objective function in (16) becomes i A. Proof of Lemma 3.1 ∗ ∗ αℓ + µ t . It is evident that t = 0 and ℓ = 0 are optimal for i i i i First, consider the case when there exists some user i with (16). ℓ = R , i.e., user i offloads all of its computation task bits to i i For λ 0, the Lagrangian of (16) is given by i the AP. As user i does not perform local computing in this   3 3 λ κ C(R−ℓ) i i i i λ t ℓ i i i i case, the local CPU frequency of user i is evidently zero. L = αℓ + + β +λ p t i i i c,i i 2 T g˜ t We nextconsiderthe nontrivialcase of 0≤ ℓ R ,∀i∈K. i i i i Í C (R −ℓ ) i i i f i,n n=1 2 + µ t +γ(ℓ− R)−ν ℓ−η t , (33) i i i i i i i i Define f , , ∀i∈K. Since that both 1/x and x i C(R−ℓ) i i i Average energy consumption (Joule)12 where γ , ν , and η are the non-negativeLagrangian multipli- where the last equality follows from the affine structure of i i i ers associated with ℓ ≤ R , ℓ ≥ 0, and t ≥ 0, respectively. F(·) in (14b). By the weak subgradient calculus 37, the i i i i BasedontheKKTconditions37,thenecessaryandsufficient subgradient of F(λ) at the given λ and µ is then ∗ ∗ ∗ ∗ ∗ conditions for the optimal primal-dual point(t ,ℓ ,γ ,ν ,η) i i i i i H H † ζv H v,...,ζv H v,0 , (40) 1 K are ∗ ∗ where v is the eigenvector corresponding to the smallest t ≥ 0, 0≤ ℓ ≤ R (34a) i i i ∗ ∗ ∗ eigenvalue of F(λ), and the last zero entry follows from the γ ≥ 0, ν ≥ 0, η ≥ 0 (34b) i i i fact that π(λ) is independent of µ . ∗ ∗ ∗ ∗ ∗ ∗ γ(ℓ − R)= 0, ν ℓ = 0, η t = 0 (34c) i i i i i i i      ∗ ∗ ∗ ℓ ℓ ℓ λ i i i ′ i ∗ β − β +λ p + µ −η = 0 (34d) i c,i REFERENCES i ∗ ∗ ∗ g˜ t t t i i i i   3 ∗ 2 ∗ 1 F. Wang, J. Xu, X. Wang, and S. Cui, “Joint offloading and computing 3λ κ C(R−ℓ) ℓ λ i i i i i i ′ i ∗ ∗ optimization in wireless powered mobile-edge computing systems,” in α− + β +γ −ν = 0, (34e) ∗ i i 2 T g˜ t i Proc. IEEE ICC, Paris, France, May 2017. i 2 M. Chiang and T. Zhang, “Fog and IoT: An overview of research 2 x ′ σ ln2 opportunities,” IEEE Internet Thing J., vol. 3, no. 6, pp. 854–864, Jun. B where β(x) , 2 is the first-order derivative of β(x) B 2016. with respect to x. Note that (34c) denotes the complementary 3 Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “A servey slackness condition, while the left-hand-side terms of (34d) on mobile edge computing: The communication perspective,” to appear IEEE Commun. Survey Tuts. 2017. Online. Available: https://arxiv.org/ and (34e) are the first-order derivatives ofL with respect to i abs/1701.01090 ∗ ∗ ′ t and ℓ , respectively. For the function y = β(x)− xβ(x) of i i 4 S. Barbarossa, S. Sardellitti, and P. D. Lorenzo, “Communicating while x 0, its inverse function can be shown to be 36 computing: Distributed mobile cloud computing over 5G heterogenous     networks,” IEEE Signal Process. Mag., vol. 31, no. 6, pp. 45–55, Nov. B y 1 2014. x = W − − +1 . (35) 0 2 5 E. Cuervo, A. Balasubramanian, D. Cho, A. Wolman, S. Saroiu, R. ln2 σ e e Chandra, and P. Bahl, “MAUI: Making smartphones last longer with ∗ ∗ ∗ ∗ Let r , ℓ/t . From (34b) and (34d), we have β(r)− code offload,” in Proc. ACM MobiSys, San Francisco, USA, Jun. 2010, i i i i   pp. 49–62. µ ∗ ′ ∗ r β(r)=−g˜ + p . Based on (35), it follows that i c,i i i 6 S. Kosta, A. Aucinas, P. Hui, R. Mortier, and X. Zhang, “ThinkAir: λ i Dynamic resource allocation and parallel execution in the cloud for       mobile code offloading,” in Proc. IEEEINFOCOM, Orlando, USA,Mar. B g˜ µ 1 i ∗ r = W + p − +1 . (36) 0 c,i 2012, pp. 945–953. i 2 ln2 λ e σ e i 7 ETSI white paper (2015). Mobile edge computing: A key technol- ogy towards 5G. Online. Available: http://www.etsi.org/images/files/ 1 Since W(x) is a monotonicallyincreasing functionof x≥− 0 e ETSIWhitePapers/etsi_wp11_mec_a_key_technology_towards_5g.pdf 1 ∗ and W(− ) =−1 36, it follows that r 0 with non-zero 8 D. J. Love, R. W. Heath, V. K. N. Lau, D. Gesbert, B. D. Rao, and M. 0 e i Andrews, “An overview of limited feedback in wireless communication p . From (34c) and (34e), it is immediate that c,i systems,” IEEE J. Sel. Areas Commun., vol. 26, no. 8, pp. 1341–1365, " s +   Oct. 2008. ∗ 2 r 1 α σ ln2 i 9 R. Zhang and C. K. Ho, “MIMO broadcasting for simultaneous wireless ∗ B ℓ = R− + 2 . (37) i i 3 information and power transfer,” IEEE Trans. Wireless Commun., vol. λ Bg˜ 3κ C i i i i 12, no. 5, pp. 1989–2001, May 2013. ∗ ∗ 10 D. W. K. Ng, E.S. Lo, and R. Schober, “Robust beamforming for secure With (36) and (37), the optimal t is then obtained as t = i i communication in systems with wireless power transfer,” IEEE Trans. ∗ ∗ ℓ/r . Wireless Commun., vol. 13, no. 8, pp. 4599–4615, Aug. 2014. i i 11 S. Timotheou, I. Krikidis, G. Zheng, and B. Ottersten, “Beamforming forMISOinterference channels with QoSand RFenergy transfer,” IEEE D. Proof of Lemma 3.3 Trans. Wireless Commun., vol. 13, no. 5, pp. 2646–2658, May 2014. 12 S. Bi, R. Zhang, and C. Ho, “Wireless powered communication: Op- The positive semidefinite constraint F(λ)  0 can be portunities and challenges,” IEEE Commun. Mag., vol. 53, no. 4, pp. equivalentlyexpressed as a scalar inequality constraint as 37 117–125, Apr. 2015. 13 H. Li, J. Xu, R. Zhang, and S. Cui, “A general utility optimization H framework for energy harvesting based wireless communications,” IEEE π(λ), min ξ F(λ)ξ≥ 0. (38) kξk=1 Commun. Mag., vol. 53, no. 4, pp. 79–85, Apr. 2015. 14 J. Xu, L. Liu, and R. Zhang, “Multiuser MISO beamforming for † Given a query point λ , λ ,...,λ , one can find 1 1,1 1,K simultaneous wireless information and power transfer,” IEEE Trans. Signal Process., vol. 62, no. 18, pp. 4798–4810, Sep. 2014. the normalized eigenvectorv ofF(λ) corresponding to the 1 1 15 F. Wang, T. Peng, Y. Huang, and X. Wang, “Robust transceiver opti- smallest eigenvalue of F(λ) (i.e., π(λ)). Consequently, we 1 1 mization for power-splitting based downlink MISO SWIPT systems,” can determine the value of the scalar constraint at a query IEEE Signal Process. Lett., vol. 22, no. 9, pp. 1492–1496, Sep. 2015. H 16 F. Wang, C. Xu, Y. Huang, X. Wang, and X.-Q. Gao, “REEL-BF design: point as π(λ) = v F(λ)v . To obtain a subgradient, we 1 1 1 1 Achieving the SDP bound for downlink beamforming with arbitrary have shaping constraints,” IEEE Trans. Signal Process., vol. 65, no. 10, pp. 2672–2685, May 2017. H H π(λ)−π(λ)= min ξ F(λ)ξ−v F(λ)v 1 1 1 1 17 Y. Zeng and R. Zhang, “Optimized training design for wireless energy kξk=1 transfer,” IEEE Trans. Commun., vol. 63, no. 2, pp. 536–550, Feb. 2015. H ≤v (F(λ)−F(λ))v (39a) 18 J.Xuand R.Zhang, “Energybeamforming with one-bit feedback,” IEEE 1 1 1 Trans. Signal Process., vol. 62, no. 20, pp. 5370–5381, Oct. 2014. K Õ  19 J. Xu and R. Zhang, “A general design framework for MIMO wireless H = λ −λ ζv Hv , (39b) 1,i i i 1 1 energy transfer with limited feedback,” IEEE Trans. Signal Process., i=1 vol. 64, no. 10, pp. 2475–2488, May 2016.13 20 C. Valenta and G. Durgin, “Harvesting wireless power: Survey of FengWang(M’16) received the B.Eng. degree from energy-harvester conversion efficiency in far-field, wireless power trans- Nanjing University of Posts and Telecommunica- fer systems,” IEEE Microw. Mag., vol. 15, no. 4, pp. 108–120, Jun. tions, China, in 2009, and the M.Sc. and Ph.D. 2014. degrees, both from Fudan University, China, in 2012 21 Y.Zeng, B.Clerckx, andR. Zhang,“Communications andsignals design and 2016, respectively. He is currently an Assistant for wireless power transmission,” IEEE Trans. Commun., vol. 65, no. 5, Professor with the School of Information Engineer- pp. 2264–2290, May 2017. ing, Guangdong University of Technology, China. 22 S. Lee and R. Zhang, “Distributed wireless power transfer with energy From 2012 to 2013, he was a Research Fellow feedback,” IEEE Trans. Signal Process., vol. 65, no. 7, pp. 1685–1699, with the Department of Communication Technology, Apr. 2017. Sharp Laboratories of China. From January 2017 23 E. Boshkovska, D. W. K. Ng, N. Zlatanov, and R. Schober, “Practical to September 2017, he was a Postdoctoral Research non-linear energy harvesting model and resource allocation for SWIPT Fellow with the Engineering Systems and Design Pillar, Singapore University systems,” IEEE Commun. Lett., vol. 19, no. 12, pp. 2082–2085, Dec. of Technology and Design. His research interests include signal processing 2015. for communications, wireless power transfer, and edge computing. 24 B. Clerckx and E. Bayguzina, “Waveform design for wireless power transfer,” IEEE Trans. Signal Process., vol. 64, no. 23. pp. 6313–6328, Dec. 2016. 25 F. Liu, P. Shu, H. Jin, L. Ding, J. Yu, D. Niu, and B. Li, “Gearing resource-poor mobile devices with powerful clouds: Architectures, chal- lenges, and applications,” IEEE Wireless Commun., vol. 20, no. 3, pp. 14–22, Jul. 2013. 26 J. Liu, Y. Mao, J. Zhang, and K. B. Letaief, “Delay-optimal computation task scheduling for mobile-edge computing systems,” in Proc. IEEE ISIT, Barcelona, Spain, Jun. 2016, pp. 1451–1455. 27 D. Huang, P. Wang, and D. Niyato, “A dynamic offloading algorithm for mobile computing,” IEEE Trans. Wireless Commun., vol. 11, no. 6, pp. 1991–1995, Jun. 2012. Jie Xu (S’12-M’13) received his B.E. and Ph.D. 28 O. Muñoz, A. Pascual-Iserte, and J. Vidal, “Optimization of radio and degrees from University of Science and Technology computational resources for energy efficiency in latency-constrained of China in 2007 and 2012, respectively. He is application offloading,” IEEE Trans. Veh. Tech., vol. 64, no. 10, pp. now with the School of Information Engineering, 4738–4755, Oct. 2015. Guangdong University of Technology, China. From 29 Y. Wang, M. Sheng, X. Wang, L. Wang, and J. Li, “Mobile-edge com- 2012 to 2014, he was a Research Fellow with the puting: Partial computation offloading using dynamic voltage scaling,” Department of Electrical and Computer Engineer- IEEE Trans. Commun., vol. 64, no. 10, pp. 4268–4282, Aug. 2016. ing, National University of Singapore. From 2015 30 M.-H. Chen, B. Liang, and M. Dong, “Joint offloading decision and to 2016, he was a Postdoctoral Research Fellow resource allocation for multi-user multi-task mobile cloud,” in Proc. with the Engineering Systems and Design Pillar, IEEE ICC, Kuala Lumpur, Malaysia, May 2016, pp. 1–6. Singapore University of Technology and Design. 31 X. Chen, L. Jiao, W. Li, and X. Fu, “Efficient multi-user computation His research interests include energy efficiency and energy harvesting in offloading for mobile-edge cloud computing,” IEEE/ACM Trans. Netw., wireless communications, wireless information and power transfer, wireless vol. 24, no. 5, pp. 2795–2808, Oct. 2016. security, UAV communications, and mobile edge computing. He is now 32 C. You, K. Huang, H. Chae, and B. Kim, “Energy-efficient resource al- an Editor for the IEEE WIRELESS COMMUNICATIONS LETTERS, an location for mobile-edge computation offloading,” IEEE Trans. Wireless Associate Editor for the IEEE ACCESS, and a Guest Editor for the IEEE Commun., vol. 16, no. 3, pp. 1397–1411, Mar. 2017. WIRELESS COMMUNICATIONS. He received the IEEE Signal Processing 33 S. Sardellitti, G. Scutari, and S. Barbarossa, “Joint optimization of radio Society Young Author Best Paper Award in 2017. and computational resources for multicell mobile-edge computing,” IEEE Trans. Signal Inf. Process. Netw., vol. 1, no. 2, pp. 89–103, Jun. 2015. 34 C. You, K. Huang, and H. Chae, “Energy efficient mobile cloud computing powered by wireless energy transfer,” IEEE J. Sel. Areas Commun., vol. 34, no. 5, pp. 1757–1770, May 2016. 35 T. D. Burd and R. W. Brodersen, “Processor design for portable systems,” Kluwer J. VLSI Signal Process. Syst., vol. 13, no. 2, pp. 203– 221, Aug. 1996. 36 R. Corless, G. Gonnet, D. Hare, D. Jeffrey, and D. Knuth, “On the Lambert W function,” Adv. Comput. Math., vol. 5, no. 1, pp. 329–359, Dec. 1996. 37 S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, Mar. 2004. Xin Wang (SM’09) received the B.Sc. and M.Sc. 38 S. Boyd, “Ellipsoid method,” Stanford University, California, USA. degrees from Fudan University, Shanghai, China, in Online. Available: http://stanford.edu/class/ee364b/lectures/ellipsoid_ 1997 and 2000, respectively, and the Ph.D. degree method_notes.pdf fromAuburnUniversity, Auburn, AL,USA,in2004, 39 M. Grant, S. Boyd, and Y. Ye, “CVX: Matlab software for disciplined all in electrical engineering. convex programming,” 2009. Online. Available: http://cvxr.com/cvx/ From September 2004 to August 2006, he was a Postdoctoral Research Associate with the Depart- ment of Electrical and Computer Engineering, Uni- versity of Minnesota, Minneapolis. In August 2006, hejoined theDepartment ofComputerandElectrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL, USA, as an Assistant Professor, and then an Associate ProfessorfromAugust2010. Heiscurrently aDistinguished Profes- sor with the Department of Communication Science and Engineering, Fudan University. His research interests include stochastic network optimization, energy-efficient communications, cross-layer design, and signal processing for communications. He served as an Associate Editor for the IEEE Signal Processing Letters. He currently serves as an Associate Editor for the IEEE Transactions on Signal Processing and as an Editor for the IEEE Transactions on Vehicular Technology.14 Shuguang Cui (S’99-M’05-SM’12-F’14) received his Ph.D in Electrical Engineering from Stanford University, California, USA, in 2005. Afterwards, he has been working as assistant, associate, and full professor in Electrical and Computer Engineering at the Univ. of Arizona and Texas A&MUniversity. He iscurrently aChildFamily EndowedChairProfessor in Electrical and Computer Engineering at the Univ. of California-Davis. His current research interests focus on data driven large-scale system control and resource management, large data set analysis, IoT system design, energy harvesting based communication system design, and cognitive network optimization. He was selected as the Thomson Reuters Highly Cited Researcher and listed in the Worlds’ Most Influential Scientific Minds by ScienceWatch in 2014. He was the recipient of the IEEE Signal Processing Society 2012 Best Paper Award. He has served as the general co-chair and TPC co-chairs for many IEEE conferences. He has also been serving as the area editor for IEEE Signal Processing Magazine, and associate editors for IEEE Transactions on Big Data, IEEE Transactions on Signal Processing, IEEE JSAC Series on Green Communications and Networking, and IEEE Transactions on Wireless Communications. He has been the elected member for IEEE Signal Processing Society SPCOM Technical Committee (2009∼2014) and the elected Chair for IEEE ComSoc Wireless Technical Committee (2017∼2018). He is a member of the Steering Committee for both IEEE Transactions on Big Data and IEEE Transactions on Cognitive Communications and Networking. He is also a member of the IEEE ComSoc Emerging Technology Committee. He was elected as an IEEE Fellow in 2013 and an IEEE ComSoc Distinguished Lecturer in 2014.