One reason why capacity lower bounds are so useful is that they are accurate proxies for link-level performance with modern coding. But this fact, well known to information and coding theorists, is often contested by practitioners. I will discuss some possible reasons for that here.
The recipe is to compute the capacity bound and then, depending on the code block length, add one or a few dB to the required SNR. That gives the link performance prediction. The coding literature is full of empirical results showing how far from capacity a code of a given block length is on the AWGN channel, and this gap is usually not very different for other channel models, although one should always check this.
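The recipe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the capacity bound is taken to have the AWGN form log2(1 + SNR), and the 1.5 dB gap is a hypothetical placeholder, since the actual gap depends on the code and block length and has to be taken from the coding literature or from simulations. The function name required_snr_db is made up for this example.

```python
import math

def required_snr_db(spectral_efficiency, gap_db=1.5):
    """Predicted SNR (in dB) needed to operate at a given spectral efficiency.

    spectral_efficiency: target rate in bit/s/Hz.
    gap_db: assumed gap to capacity of the practical code (hypothetical value).
    """
    # SNR at which the AWGN capacity log2(1 + SNR) equals the target rate.
    shannon_snr = 2.0 ** spectral_efficiency - 1.0
    # Add the coding gap on the dB scale.
    return 10.0 * math.log10(shannon_snr) + gap_db

for rate in (1.0, 2.0, 4.0):
    print(f"{rate:.0f} bit/s/Hz -> {required_snr_db(rate):.2f} dB")
```

For other channel models, the first step would be replaced by inverting the relevant capacity bound numerically.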
But there are three main caveats with this:
First, the capacity bound, or the “SINR” that it often contains, must be information-theoretically correct. A great many papers get this wrong. Emil explained some common errors in his blog post last week. The recommended approach is to map the channel onto one of the canonical cases in Figure 2.9 in Fundamentals of Massive MIMO, verify that the technical conditions are satisfied, and use the corresponding formula.
When computing expressions of the type E[log(1+“SINR”)], the average should be taken over all quantities that are random within the duration of a codeword. Typically, this means averaging over the randomness incurred by the noise, channel estimation errors, and in many cases the small-scale fading. All other parameters must be kept fixed. Typically, user positions, path losses, shadow fading, scheduling, and pilot assignments are fixed, so the expectation is conditional on those. (Yet, the interference statistics may vary substantially if other users are dropping in and out of the system.) This in turn means that many “drops” have to be generated, where these parameters are drawn at random, and then CDF curves with respect to that second level of randomness need to be computed (numerically). Think of the expectation E[log(1+“SINR”)] as a “link simulation”. Every codeword sees many independent noise realizations, and typically small-scale fading realizations, but the same realization of the user positions. Also, neat (and tight) closed-form bounds on E[log(1+“SINR”)] are often available.
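The two levels of randomness can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: a single user without interference, Rayleigh small-scale fading with perfect CSI, and log-normal shadow fading with an 8 dB standard deviation as the only large-scale parameter that changes between drops. The inner average plays the role of the “link simulation”, and the outer loop generates the drops.

```python
import numpy as np

rng = np.random.default_rng(0)

def ergodic_rate_one_drop(snr_db, num_fading=2000):
    """E[log2(1 + SINR)] over small-scale fading, with large-scale parameters fixed."""
    snr = 10.0 ** (snr_db / 10.0)
    # Rayleigh fading: the channel power gain |h|^2 is exponentially distributed.
    gain = rng.exponential(1.0, size=num_fading)
    return np.mean(np.log2(1.0 + snr * gain))

def rates_over_drops(num_drops=500, mean_snr_db=10.0, shadow_std_db=8.0):
    """One ergodic rate per drop; each drop fixes the large-scale parameters."""
    # Outer level of randomness: log-normal shadow fading (assumed model).
    snr_db = mean_snr_db + shadow_std_db * rng.standard_normal(num_drops)
    return np.array([ergodic_rate_one_drop(s) for s in snr_db])

rates = np.sort(rates_over_drops())
cdf = np.arange(1, rates.size + 1) / rates.size  # empirical CDF over the drops
print("5% quantile:", np.quantile(rates, 0.05), "median:", np.median(rates))
```

Plotting cdf against rates gives the CDF curve with respect to the second level of randomness. In a real study, the inner expectation would be replaced by the appropriate capacity bound or its closed form.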
Care is advised when working with relatively short blocks (fewer than a few hundred bits) and at rates close to the constrained capacity with the foreseen modulation format. In this case, many of the “standard” capacity bounds become overoptimistic. As a rule of thumb, compare the capacity of an AWGN channel with the constrained capacity of the chosen modulation at the spectral efficiency of interest, and if the gap is small, the capacity bounds will be useful. If not, then reconsider the choice of modulation format! (See also homework problem 1.4.)
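The rule of thumb can be checked numerically. The sketch below is a minimal example that assumes equiprobable QPSK on a complex AWGN channel and estimates the constrained capacity by Monte Carlo. It uses the fact that QPSK consists of two independent BPSK streams (in-phase and quadrature), each with the same ratio of signal power to noise variance as the complex channel. The function names are made up for this example.

```python
import numpy as np

def awgn_capacity(snr):
    """Capacity of the complex AWGN channel in bit per channel use."""
    return np.log2(1.0 + snr)

def qpsk_constrained_capacity(snr, num_samples=200_000, seed=0):
    """Monte Carlo estimate of the mutual information with equiprobable QPSK."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(num_samples)
    # Log-likelihood ratio of one BPSK stream, given that +1 was transmitted.
    llr = 2.0 * snr + 2.0 * np.sqrt(snr) * z
    # I(X;Y) = 1 - E[log2(1 + exp(-LLR))], computed in a numerically stable way.
    bpsk_mi = 1.0 - np.mean(np.logaddexp(0.0, -llr)) / np.log(2.0)
    return 2.0 * bpsk_mi  # two real dimensions

for snr_db in (0.0, 3.0, 6.0, 10.0):
    snr = 10.0 ** (snr_db / 10.0)
    print(f"{snr_db:4.1f} dB: AWGN {awgn_capacity(snr):.2f}, "
          f"QPSK {qpsk_constrained_capacity(snr):.2f} bit/channel use")
```

Where the two curves are close, the Gaussian-signaling bounds are a reasonable proxy. Where the QPSK curve saturates towards 2 bit per channel use, they are not.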
How far are the bounds from the actual capacity typically? Nobody knows, but there are good reasons to believe they are extremely close. Here (Figure 1) is a nice example that compares the bounds with the rate of a decoder that uses the measured channel likelihood instead of assuming a Gaussian one (which is implied by the typical bounding techniques). From correspondence with one of the authors: “The dashed and solid lines are the lower bound obtained by Gaussianizing the interference, while the circles are the rate achievable by a decoder exploiting the non-Gaussianity of the interference, painfully computed through days-long Monte-Carlo. (This is not exactly the capacity, because the transmit signals here are Gaussian, so one could deviate from Gaussian signaling and possibly do slightly better, but the difference is imperceptible in all the experiments we’ve done.)”
Concerning Massive MIMO and its capacity bounds, I have long been met with arguments that these capacity formulas aren’t useful estimates of actual performance. But in fact, they are: in one simulation study we were less than one dB from the capacity bound when using QPSK and a standard LDPC code (albeit with fairly long blocks). This bound accounts for noise and channel estimation errors. Such examples are in Chapter 1 of Fundamentals of Massive MIMO, and also in the ten-myth paper:
(I wrote the simulation code, and can share it, in case anyone would want to reproduce the graphs.)
So, in summary: capacity bounds are sometimes done wrong, but when done right they give pretty good estimates of actual link performance with modern coding.
Many radio resource allocation tasks are combinatorial in nature. It might be to associate a user equipment (UE) with a base station (BS) from a set of BSs, to select a set of time-frequency resources for transmission to a particular UE, or to assign pilot sequences to a set of users. The unfortunate thing with discrete combinatorial optimization problems is that the number of combinations grows very rapidly with the number of UEs and the number of discrete choices that can be made for each of them. For example, suppose there are K UEs and you have to pick one out of D options for each of them; then there are D^K different combinations. Hence, the worst-case computational complexity grows exponentially with K.
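To get a feeling for how quickly D^K grows, here is a small Python illustration. The values of K and D are arbitrary, and brute-force enumeration with itertools.product is only shown to confirm the count for a tiny instance.

```python
from itertools import product

def count_assignments(num_ues, num_options):
    """Number of ways to pick one of D options for each of K UEs, i.e. D**K."""
    return num_options ** num_ues

# Brute-force enumeration is only feasible for tiny problems.
K, D = 3, 4
all_assignments = list(product(range(D), repeat=K))
assert len(all_assignments) == count_assignments(K, D)

for num_ues in (5, 10, 20, 40):
    print(f"K={num_ues:2d}, D=4: {count_assignments(num_ues, 4):.3e} combinations")
```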
Interestingly, some radio resource allocation problems that appear to have exponential complexity can be relaxed to a form that is much easier to solve – this is what I call “relax and conquer”. In optimization theory, relaxation means that you widen the set of permissible solutions to the problem, which in this context means that the discrete optimization variables are replaced with continuous optimization variables. In many cases, it is easier to solve optimization problems with variables that take values in continuous sets than problems with a mix of continuous and discrete variables.
A basic example of this principle arises when communicating over a single-user MIMO channel. To maximize the achievable rate, you first need to select how many data streams to spatially multiplex and then determine the precoding and power allocation for these data streams. This appears to be a mixed-integer optimization problem, but Telatar showed in his seminal paper that it can be solved by the water-filling algorithm. More precisely, you relax the problem by assuming that the maximum number of data streams is transmitted and then you let the solution to a convex optimization problem determine how many of the data streams are assigned non-zero power; this is the optimal number of data streams. Despite the relaxation, the global optimum to the original problem is obtained.
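The water-filling step can be sketched as follows. This is a minimal textbook implementation rather than code from any of the cited papers. It assumes that the channel has already been diagonalized by the singular value decomposition, so that the stream gains are the squared singular values normalized by the noise power, and that all gains are strictly positive. The 2x2 channel matrix is a hypothetical example.

```python
import numpy as np

def water_filling(gains, total_power):
    """Maximize sum(log2(1 + g_i * p_i)) subject to sum(p_i) = total_power, p_i >= 0.

    gains: strictly positive stream gains g_i.
    Returns the powers p_i; zero entries correspond to unused streams.
    """
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]            # strongest stream first
    inv_sorted = 1.0 / gains[order]
    powers = np.zeros_like(gains)
    # Try to activate the n strongest streams, for n = N, N-1, ..., 1.
    for n in range(gains.size, 0, -1):
        water_level = (total_power + inv_sorted[:n].sum()) / n
        if water_level > inv_sorted[n - 1]:    # weakest active stream gets positive power
            powers[order[:n]] = water_level - inv_sorted[:n]
            break
    return powers

H = np.array([[1.0, 0.2], [0.1, 0.05]])        # hypothetical 2x2 channel matrix
gains = np.linalg.svd(H, compute_uv=False) ** 2
p = water_filling(gains, total_power=1.0)
print("powers:", p, "active streams:", np.count_nonzero(p))
```

The number of non-zero entries in the returned vector is the number of data streams, so the discrete choice falls out of the continuous solution.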
There are other, less known examples of the “relax and conquer” method. Some years ago, I came across the paper “Jointly optimal downlink beamforming and base station assignment”, which has received much less attention than it deserves. The UE-BS association problem, considered in this paper, is non-trivial since some BSs might have many more UEs in their vicinity than other BSs. Nevertheless, the paper shows that one can solve the problem by first relaxing it so that all BSs transmit to all the UEs. The author formulates a relaxed optimization problem where the beamforming vectors (including power allocation) are selected to satisfy each UE’s SINR constraint, while minimizing the total transmit power. This problem is solved by convex optimization and, importantly, the optimal solution is always such that each UE only receives a non-zero signal power from one of the BSs. Hence, the seemingly difficult combinatorial UE-BS association problem is relaxed to a convex optimization problem, which provides the optimal solution to the original problem!
I have reused this idea in several papers. The first example is “Massive MIMO and Small Cells: Improving Energy Efficiency by Optimal Soft-cell Coordination”, which considers a similar setup but with a maximum transmit power per BS. The consequence of including this practical constraint is that some UEs might be served by multiple BSs at the optimal solution. These BSs send different messages to the UE, which decodes them by successive interference cancellation, so the solution is still practically achievable.
One practical weakness with the two aforementioned papers is that they take small-scale fading realizations into account in the optimization, thus the problem must be solved once per coherence interval, requiring extremely high computational power. More recently, in the paper “Joint Power Allocation and User Association Optimization for Massive MIMO Systems”, we applied the same “relax and conquer” method to Massive MIMO, but targeting lower bounds on the downlink ergodic capacity. Since the capacity bounds are valid as long as the channel statistics are fixed (and the same UEs are active), our optimized BS-UE association can be utilized for a relatively long time period. This makes the proposed algorithm practically relevant, in contrast to the prior works that are more of academic interest.
Another example of the “relax and conquer” method is found in the paper “Joint Pilot Design and Uplink Power Allocation in Multi-Cell Massive MIMO Systems”. We consider the assignment of orthogonal pilot sequences to users, which appears to be a combinatorial problem. Instead of assigning a pilot sequence to each UE and then allocating power, we relax the problem by allowing each user to design its own pilot sequence, which is a linear combination of the original orthogonal sequences. Hence, a pair of UEs might have partially overlapping sequences, instead of either identical or orthogonal sequences (as in the original problem). The relaxed problem even allows for pilot contamination within a cell. The sequences are then optimized to maximize the minimum performance among the users (max-min fairness). The resulting problem is non-convex, but the combinatorial structure has been relaxed so that there are only optimization variables from continuous sets. A local optimum to the joint pilot assignment and power control problem is found with polynomial complexity, using standard methods from the optimization literature. The optimization might not lead to a set of orthogonal pilot sequences, but the solution is practically implementable and gives better performance.
We are used to measuring performance in terms of the signal-to-interference-and-noise ratio (SINR), but this is seldom the actual performance metric in communication systems. In practice, we might be interested in a function of the SINR, such as the data rate (a.k.a. spectral efficiency), bit-error-rate, or mean-squared error in the data detection. When the receiver has perfect channel state information (CSI), the aforementioned metrics are all functions of the same SINR expression, where the power of the received signal is divided by the power of the interference plus noise. Details can be found in Examples 1.6-1.8 of the book Optimal Resource Allocation in Coordinated Multi-Cell Systems.
In most cases, the receiver only has imperfect CSI and then it is harder to measure the performance. In fact, it took me years to understand this properly. To explain the complications, consider the uplink of a single-cell Massive MIMO system with $K$ single-antenna users and $M$ antennas at the base station. The received $M$-dimensional signal is
$$\mathbf{y} = \sum_{i=1}^{K} \mathbf{h}_i x_i + \mathbf{n}$$
where $x_i$ is the unit-power information signal from user $i$, $\mathbf{h}_i \in \mathbb{C}^M$ is the fading channel from this user, and $\mathbf{n} \in \mathbb{C}^M$ is unit-power additive Gaussian noise. In general, the base station will only have access to an imperfect estimate $\hat{\mathbf{h}}_i$ of $\mathbf{h}_i$, for $i = 1, \ldots, K$.
Suppose the base station uses $\hat{\mathbf{h}}_1, \ldots, \hat{\mathbf{h}}_K$ to select a receive combining vector $\mathbf{v}_k \in \mathbb{C}^M$ for user $k$. The base station then multiplies it with $\mathbf{y}$ to form a scalar $\mathbf{v}_k^H \mathbf{y}$ that is supposed to resemble the information signal $x_k$:
$$\mathbf{v}_k^H \mathbf{y} = \mathbf{v}_k^H \mathbf{h}_k x_k + \sum_{i \neq k} \mathbf{v}_k^H \mathbf{h}_i x_i + \mathbf{v}_k^H \mathbf{n}.$$
From this expression, a common mistake is to directly say that the SINR is
$$\mathrm{SINR}_k^{\mathrm{wrong}} = \frac{|\mathbf{v}_k^H \mathbf{h}_k|^2}{\sum_{i \neq k} |\mathbf{v}_k^H \mathbf{h}_i|^2 + \|\mathbf{v}_k\|^2},$$
which is obtained by computing the power of each of the terms (averaged over the signal and noise), and then claim that $\mathbb{E}\{\log_2(1 + \mathrm{SINR}_k^{\mathrm{wrong}})\}$ is an achievable rate (where the expectation is with respect to the random channels). You can find this type of argument in many papers, without proof of the information-theoretic achievability of this rate value. Clearly, $\mathrm{SINR}_k^{\mathrm{wrong}}$ is an SINR, in the sense that the numerator contains the total signal power and the denominator contains the interference power plus noise power. However, this doesn’t mean that you can plug $\mathrm{SINR}_k^{\mathrm{wrong}}$ into “Shannon’s capacity formula” and get something sensible. This will only yield a correct result when the receiver has perfect CSI.
A basic (but non-conclusive) test of the correctness of a rate expression is to check that the receiver can compute the expression based on its available information (i.e., estimates of random variables and deterministic quantities). Any expression containing $\mathrm{SINR}_k^{\mathrm{wrong}}$ fails this basic test since you need to know the exact channel realizations $\mathbf{h}_1, \ldots, \mathbf{h}_K$ to compute it, although the receiver only has access to the estimates.
What is the right approach?
Remember that the SINR is not important by itself; we should start from the performance metric of interest, and then we might eventually interpret a part of the expression as an effective SINR. In Massive MIMO, we are usually interested in the ergodic capacity. Since the exact capacity is unknown, we look for rigorous lower bounds on the capacity. There are several bounding techniques to choose between, of which I will describe the two most common ones.
The first lower bound on the uplink capacity can be applied when the channels $\mathbf{h}_i$ are Gaussian distributed and $\hat{\mathbf{h}}_i$ are the MMSE estimates with the corresponding estimation error covariance matrices $\mathbf{C}_i$, for $i = 1, \ldots, K$. The ergodic capacity of user $k$ is then lower bounded by
$$R_k^{(1)} = \mathbb{E}\left\{ \log_2\left( 1 + \frac{|\mathbf{v}_k^H \hat{\mathbf{h}}_k|^2}{\sum_{i \neq k} |\mathbf{v}_k^H \hat{\mathbf{h}}_i|^2 + \sum_{i=1}^{K} \mathbf{v}_k^H \mathbf{C}_i \mathbf{v}_k + \|\mathbf{v}_k\|^2} \right) \right\}.$$
Note that this expression can be computed at the receiver using only the available channel estimates (and deterministic quantities). The ratio inside the logarithm can be interpreted as an effective SINR, in the sense that the rate is equivalent to that of a fading channel where the receiver has perfect CSI and an SNR equal to this effective SINR. A key difference from the incorrect SINR expression above is that only the part of the desired signal that is received along the estimated channel appears in the numerator of the SINR, while the rest of the desired signal appears as $\mathbf{v}_k^H \mathbf{C}_k \mathbf{v}_k$ in the denominator. This is the price to pay for having imperfect CSI at the receiver, according to this capacity bound, which has been used by Hoydis et al. and Ngo et al., among others.
The second lower bound on the uplink capacity is
$$R_k^{(2)} = \log_2\left( 1 + \frac{|\mathbb{E}\{\mathbf{v}_k^H \mathbf{h}_k\}|^2}{\sum_{i=1}^{K} \mathbb{E}\{|\mathbf{v}_k^H \mathbf{h}_i|^2\} - |\mathbb{E}\{\mathbf{v}_k^H \mathbf{h}_k\}|^2 + \mathbb{E}\{\|\mathbf{v}_k\|^2\}} \right),$$
which can be applied for any channel fading distribution. This bound provides a value close to $R_k^{(1)}$ when there is substantial channel hardening in the system, while $R_k^{(2)}$ will greatly underestimate the capacity when $\mathbf{v}_k^H \mathbf{h}_k$ varies a lot between channel realizations. The reason is that to obtain this bound, the receiver detects the signal as if it is received over a non-fading channel with gain $\mathbb{E}\{\mathbf{v}_k^H \mathbf{h}_k\}$ (which is deterministic and thus known in theory, and easy to measure in practice), but there are no approximations involved so $R_k^{(2)}$ is always a valid bound.
Since all the terms in $R_k^{(2)}$ are deterministic, the receiver can clearly compute it using its available information. The main merit of $R_k^{(2)}$ is that the expectations in the numerator and denominator can sometimes be computed in closed form; for example, when using maximum-ratio and zero-forcing combining with i.i.d. Rayleigh fading channels or maximum-ratio combining with correlated Rayleigh fading. Two early works that used this bound are by Marzetta and by Jose et al.
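The two bounds can be compared numerically with a short Monte Carlo sketch in Python. The setup is an assumption made for illustration and is not taken from the cited books: M = 64 antennas, K = 8 users, i.i.d. Rayleigh fading with variance beta_i = 1 per antenna, unit transmit and noise powers, orthogonal pilots of length tau_p = 8, and maximum-ratio combining where the combining vector of user k is its own channel estimate. Under these assumptions, the MMSE estimate of user i has variance gamma_i = tau_p beta_i^2 / (tau_p beta_i + 1) per antenna and the estimation error covariance matrix is (beta_i - gamma_i) times the identity matrix. The second bound then has the closed-form effective SINR M gamma_k / (beta_1 + ... + beta_K + 1), which the simulation should reproduce.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, tau_p, num_real = 64, 8, 8, 2000      # antennas, users, pilot length, realizations
beta = np.ones(K)                            # assumed large-scale fading coefficients
gamma = tau_p * beta**2 / (tau_p * beta + 1.0)   # variance of the MMSE estimates
k = 0                                        # user of interest

def cn(var, size):
    """Circularly symmetric complex Gaussian samples with per-entry variance var."""
    return np.sqrt(var / 2.0) * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

# Arrays of shape (num_real, M, K); MMSE estimate and error are independent.
h_hat = cn(gamma, (num_real, M, K))
h = h_hat + cn(beta - gamma, (num_real, M, K))
v = h_hat[:, :, k]                           # maximum-ratio combining vector

vh_hat = np.einsum("nm,nmi->ni", v.conj(), h_hat)   # v_k^H h_hat_i for all i
vh = np.einsum("nm,nmi->ni", v.conj(), h)           # v_k^H h_i for all i
v_norm2 = np.sum(np.abs(v) ** 2, axis=1)            # ||v_k||^2

# First bound: expectation of the log of an instantaneous effective SINR.
signal = np.abs(vh_hat[:, k]) ** 2
interference = np.sum(np.abs(vh_hat) ** 2, axis=1) - signal
error_and_noise = (np.sum(beta - gamma) + 1.0) * v_norm2
rate_1 = np.mean(np.log2(1.0 + signal / (interference + error_and_noise)))

# Second bound: all expectations are taken inside the logarithm.
num = np.abs(np.mean(vh[:, k])) ** 2
den = np.sum(np.mean(np.abs(vh) ** 2, axis=0)) - num + np.mean(v_norm2)
rate_2 = np.log2(1.0 + num / den)
rate_2_closed = np.log2(1.0 + M * gamma[k] / (np.sum(beta) + 1.0))

print(f"bound 1: {rate_1:.3f}, bound 2: {rate_2:.3f}, closed form: {rate_2_closed:.3f}")
```

Note that the first bound only uses quantities built from the estimates and the error statistics, while the second bound uses the true channels but only inside expectations, which is why both pass the basic test described above.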
The two uplink rate expressions can be proved using capacity bounding techniques that have been floating around in the literature for more than a decade; the main principle for computing capacity bounds for the case when the receiver has imperfect CSI is found in a paper by Medard from 2000. The first concise description of both bounds (including all the necessary conditions for using them) is found in Fundamentals of Massive MIMO. The expressions that are presented above can be found in Section 4 of the new book Massive MIMO Networks. In these two books, you can also find the right ways to compute rigorous lower bounds on the downlink capacity in Massive MIMO.
In conclusion, to avoid mistakes, always start with rigorously computing the performance metric of interest. If you are interested in the ergodic capacity, then you start from one of the canonical capacity bounds in the above-mentioned books and verify that all the required conditions are satisfied. Then you may interpret part of the expression as an SINR.
I never thought it would happen so fast. When I started to work on Massive MIMO in 2009, the general view was that fully digital, phase-coherent operation of so many antennas would be infeasible, and that power consumption of digital and analog circuitry would prohibit implementations for the foreseeable future. More seriously, reservations were voiced that reciprocity-based beamforming would not work, or that operation in mobile conditions would be impossible.
These arguments all turned out to be wrong. In 2017, Massive MIMO was the main physical-layer technology under standardization for 5G, and it is unlikely that any serious future cellular wireless communications system would not have Massive MIMO as a main technology component.
But Massive MIMO is more than a groundbreaking technology for wireless communications: it is also an elegant and mathematically rigorous approach to teaching wireless communications. In the moderately-large number-of-antennas regime, our closed-form capacity bounds become convenient proxies for the link performance achievable with practical coding and modulation.
These expressions take into account the effects of all significant physical phenomena: small-scale and large-scale fading, intra- and inter-cell interference, channel estimation errors, pilot reuse (also known as pilot contamination) and power control. A comprehensive analytical understanding of these phenomena simply has not been possible before, as the corresponding information theory has been too complicated for any practical use.
The intended audiences of Fundamentals of Massive MIMO are engineers and students. I anticipate that as graduate courses on the topic become commonplace, our extensive problem set (with solutions) available online will serve as a useful resource to instructors. While other books and monographs will likely appear down the road, focusing on trendier and more recent research, Fundamentals of Massive MIMO distills the theory and facts that will prevail for the foreseeable future. This, I hope, will become its most lasting impact.
To read the preface of Fundamentals of Massive MIMO, click here. You can also purchase the book here.
On January 17, I will give a 1-hour webinar in the IEEE 5G Webinar Series. I was asked to talk about “Massive MIMO for 5G below 6 GHz” since there has been a lot of focus on mmWave frequencies in the 5G discussions, although the primary band for 5G seems to be in the range 3.4-3.8 GHz, according to Ericsson.
Our recent guest post about the combination of Massive MIMO and drones has received a lot of interest on social media. The use of unmanned aerial vehicles (UAVs) for wireless communications is certainly an emerging topic that deserves further attention!
While the previous blog post focused on Massive MIMO aspects of UAV communications, other theoretical research findings are reviewed in this tutorial by Walid Saad and Mehdi Bennis:
Furthermore, the team of the ERC Advanced PERFUME project, led by Prof. David Gesbert, has recently demonstrated what appears to be the world’s first autonomous flying base station relays. This exciting achievement is demonstrated in the following video:
Recently, there has been a lot of hype around the use of drones (also called unmanned aerial vehicles, UAVs) for civilian and military applications. In the coming decades especially, lightweight miniature drones are expected to play a major role in society. Nowadays, small drones are available in toy shops, so anyone can buy one for personal uses such as aerial videography. However, for security reasons, the personal use of drones is limited to low altitudes (up to 120 m in most countries) and visual line-of-sight. On the other hand, it is most likely that, in many countries, government agencies and commercial firms will be allowed to use drones for a variety of services. (See: link 1 and link 2.)
There are many foreseen applications that involve a large number of drones in a limited area such as disaster management, traffic monitoring, crowd management, and crop monitoring. The major communication requirements of most of the drone networks are: several tens of Mbps throughput for streaming high-resolution video, low latency for command and control, highly reliable connectivity in a three-dimensional coverage area, high-mobility support, and simultaneous support for a large number of drones.
The existing wireless systems are unsuitable for communicating with a large number of drones in long-range, high throughput, and high-altitude applications for the following reasons:
In many drone communication scenarios, the mobility and traffic patterns of drones are different from the ground users. For example, in aerial surveillance applications, the uplink traffic is much higher than the downlink traffic. Depending on the application, the drones will fly at high speed (10-50 m/s) in a 3D space.
The propagation environment in drone communication scenarios will be line-of-sight, even under high mobility.
The terrestrial wireless communication networks are optimized for indoor, short range, low mobility (e.g. WiFi), and low altitude (e.g. LTE).
In LTE, since the base station antennas are tilted towards the ground, coverage is possible only if the drones fly below 100 m altitude. Apart from coverage, the co-channel interference generated from the neighboring cells will be a major problem in satisfying the high throughput requirements of drones.
The MAC layer protocols of the existing systems have to be redesigned according to the drones’ requirements, especially regarding the re-transmission protocols which are related to latency and crucial for drone control.
Since the existing wireless systems are connected to the power grid, they might not be available during emergency situations such as earthquakes, massive flooding, and tsunamis. Further, in mountain and sea environments, cellular networks are not widely available. This problem can be overcome by deploying flying UAV base stations in the sky.
For the above-mentioned reasons, instead of borrowing from existing wireless technologies, it would be better to develop a new technology, considering the specific drone networks’ requirements and propagation characteristics. As of now, spectrum allocation and standardization efforts for drone communication networks are in the initial stage of development. This is where Massive MIMO can play a key role. The attractive features of Massive MIMO, such as spatial multiplexing and range extension, can be exploited to design flexible and efficient drone communication systems. 5G is based on the concept of network slicing, where the network can be configured differently depending on the use case. Therefore, it is possible to deploy a variation of 5G for drone communications along with appropriately tilted antenna arrays to provide connectivity to the drones flying at high altitudes.
In our recent papers (1 and 2), we illustrated the use of Massive MIMO for drone communications. From these papers, we make the following observations:
The Massive MIMO performance in rich scattering is well understood by the use of ergodic rate bounds that are available in closed form. In line-of-sight, the ergodic rate performance depends on the relative positions of the drones as they move very quickly in 3D space. Interestingly, in the case of line-of-sight, the uplink ergodic rate bounds (with an MRC receiver) are available in closed form for some specific cases, for example, for drone positions that are uniformly distributed within a spherical volume. However, more work is needed to understand the ergodic rate performance with arbitrary drone distributions.
The element spacing in the ground station array affects the rate performance depending on the distribution of the drones. For a given distribution of the drone positions, the ground station array has to be optimized to maximize the ergodic rate.
The probability of outage due to polarization mismatch can be made negligible by appropriately selecting the orientation and polarization of the individual array elements. For example, circularly polarized cross-dipole antenna elements perform much better when compared to linearly polarized dipoles. (For more details, see this paper.) This means that the use of simple antenna elements, such as cross-dipoles, reduces the concerns of antenna pattern designs. Further, the drones can be equipped with a single cross-dipole.
The range extension due to the increased number of antennas can eliminate the need for multi-hop solutions in many drone communication scenarios.
TDD based Massive MIMO can be used for simultaneously supporting several tens of drones both at μ-wave and mm-wave frequencies.
TDD based Massive MIMO can support high-mobility drone communications. In some scenarios (e.g., deterministic trajectories), the channel can be extrapolated without sending pilot symbols.
Below are some examples of use cases of Massive MIMO enabled drone communication systems. The technical details of Massive MIMO based system design can be found in this paper. The Massive MIMO design parameters for some of the use cases can be found in this paper.
Drone racing: In recent years, drone racing, also called “the sport of the future”, has become popular around the world. In drone racing, low latency is important for drone control, because even a few tens of milliseconds of delay might crash the drone when it moves at a speed of 40-50 m/s. Interestingly, in our digital world, analog transmission is used for sending videos from racing drones to the pilots. The reason is that, unlike digital transmission, analog transmission does not incur any processing delay and the overall latency is only about 15 ms. Currently, the 5.8 GHz band (5650 MHz to 5925 MHz) is used for drone racing. The transmitter and receiver use frequency modulation and require 40 MHz of frequency separation to avoid crosstalk between neighboring channels. As a result, the number of simultaneous drones in a contest is limited to eight. The video quality is also poor. By using Massive MIMO, several tens of drones can simultaneously participate in a contest and the pilots can enjoy low-latency, high-quality video transmission.
Sports streaming: Utilizing drones for sports streaming will change the way we view sports events. High-resolution 4K 360-degree videos taken by multiple drones at different angles can be broadcast to give the viewers an entirely new experience. If there are 20 drones covering a sports event, the required sum throughput will be on the order of 10 Gbps. Massive MIMO in the mm-wave frequency band can be used to achieve this high throughput. This can become reality, as there are already signs of drones being used for covering sports events. For instance, during the 2018 Winter Olympics, drones will be extensively used.
Surveillance/Search and Rescue/Disaster management: During natural disasters, a network of drones can be quickly deployed to enable the rescue teams to assess the situation in real-time via high-resolution video streaming. Depending on the area to be covered and desired video quality, the sum throughput requirement will be on the order of Gbps. A Massive MIMO array deployed over a ground vehicle or a large aerial vehicle can be used for serving a swarm of drones.
Aerial survey: A swarm of drones can be used for high-resolution aerial imagery of several kilometers of landscape. There are many uses of aerial survey, including state governance, city planning, 3D cartography, and crop monitoring. Massive MIMO can be an enabler for such high throughput and long-range applications.
Backhaul for flying base stations: During emergency situations and heavy traffic conditions, UAVs could be used as flying base stations to provide wireless connectivity to the cellular users. A Massive MIMO base station can act as a high-capacity backhaul to a large number of flying base stations.
Space exploration: Currently, it takes several hours to receive a photo taken by the Curiosity Mars rover. It is possible to use Massive MIMO to reduce the overall transmission delay. For example, by using a massive antenna array deployed in an orbiter (see the above figure), a swarm of drones and rovers roaming on the surface of another planet can send videos and images to Earth. The array can be used to spatially multiplex the uplink transmission from the drones (and possibly the rovers) to the orbiter. Note that the distance between the Mars surface and the orbiter is about 400 km. If the drones fly at an altitude of a few hundred meters and spread out over a region with a radius of a few hundred kilometers, the angular resolution of the array is sufficient for spatial multiplexing. The array can be used to transmit the collected images and videos to Earth by exploiting the array gain. This might sound like science fiction, but NASA is already developing a 256-element antenna array for future Mars rovers to enable direct communication with Earth.