There has been a long-standing debate on the relative performance of reciprocity-based (TDD) Massive MIMO and FDD solutions based on grid-of-beams or hybrid beamforming architectures. The matter was, for example, the subject of a heated debate in the 2015 Globecom industry panel “Massive MIMO vs FD-MIMO: Defining the next generation of MIMO in 5G”, where, on the one hand, the commercial arguments for grid-of-beams solutions were clear but, on the other hand, their real potential for high-performance spatial multiplexing was strongly contested.
While it is known that grid-of-beams solutions perform poorly in isotropic scattering, no prior experimental results are known. This new paper:
answers this performance question through the analysis of real Massive MIMO channel measurement data obtained in the 2.6 GHz band. Except in certain line-of-sight (LOS) environments, the original reciprocity-based TDD Massive MIMO represents the only effective implementation of Massive MIMO at the frequency bands under consideration.
Which is worth more: 1 MHz of bandwidth at a 100 MHz carrier frequency, or 10 MHz of bandwidth at a 1 GHz carrier? Conventional wisdom has it that higher carrier frequencies are more valuable because “there is more bandwidth there”. In this post, I will explain why that is not entirely correct.
The basic presumption of TDD/reciprocity-based Massive MIMO is that all activity, comprising the transmission of uplink pilots, uplink data and downlink data, takes place inside of a coherence interval:
At fixed mobility, in meters per second, the dimensionality of the coherence interval is proportional to the wavelength, because the Doppler spread is proportional to the carrier frequency.
In a single cell, with max-min fairness power control (for uniform quality-of-service provision), the sum-throughput of Massive MIMO can be computed analytically and is given by the following formula:
In this formula,
$B$ = bandwidth in Hertz (split equally between uplink and downlink)
$M$ = number of base station antennas
$K$ = number of multiplexed terminals
$B_c$ = coherence bandwidth in Hertz (independent of carrier frequency)
$T_c$ = coherence time in seconds (inversely proportional to carrier frequency)
SNR = signal-to-noise ratio (“normalized transmit power”)
$\beta_k$ = path loss for the k:th terminal
$c$ = constant, close to 1 with sufficient pilot power
This formula assumes independent Rayleigh fading, but the general conclusions remain under other models.
The factor that pre-multiplies the logarithm depends on $K$.
The pre-log factor is maximized when $K = B_c T_c/2$. The maximal value is $B B_c T_c/8$, which is proportional to $T_c$, and therefore proportional to the wavelength. Due to the multiplication $B T_c$, one can get the same pre-log factor using a smaller bandwidth by instead increasing the wavelength, i.e., reducing the carrier frequency. At the same time, assuming appropriate scaling of the number of antennas, $M$, with the number of terminals, $K$, the quantity inside of the logarithm is a constant.
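As a hedged illustration of where the optimum comes from, assume the pre-log factor has the form $f(K)$ below: half the bandwidth is used per direction, $K$ of the $B_c T_c$ samples in each coherence interval are spent on uplink pilots, and $K$ terminals are multiplexed. This form is an assumption made for the sketch, not a quotation of the formula referred to above.

```latex
\[
f(K) = \frac{B}{2}\left(1 - \frac{K}{B_c T_c}\right) K ,
\qquad
\frac{df}{dK} = \frac{B}{2}\left(1 - \frac{2K}{B_c T_c}\right) = 0
\;\Longrightarrow\;
K^\star = \frac{B_c T_c}{2},
\qquad
f(K^\star) = \frac{B\, B_c T_c}{8}.
\]
```

Under this assumption, halving the carrier frequency doubles $T_c$ at fixed mobility and therefore doubles $f(K^\star)$ for the same bandwidth $B$.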
In conclusion, the sum spectral efficiency (in b/s/Hz) can easily double for every doubling of the wavelength: a megahertz of bandwidth at a 100 MHz carrier is worth ten times more than a megahertz of bandwidth at a 1 GHz carrier. So while there is more bandwidth available at higher carriers, the potential multiplexing gains are correspondingly smaller.
In this example,
all three setups give the same sum-throughput; however, the throughput per terminal is vastly different.
The cellular network that my smartphone connects to normally delivers 10-40 Mbit/s. That is sufficient for video-streaming and other applications that I might use. Unfortunately, I sometimes have poor coverage and then I can barely download emails or make a phone call. That is why I think that providing ubiquitous data coverage is the most important goal for 5G cellular networks. It might also be the most challenging 5G goal, because the area coverage has been an open problem since the first generation of cellular technology.
It is physics that makes it difficult to provide good coverage. The transmitted signals spread out and only a tiny fraction of the transmitted power reaches the receive antenna (e.g., one part in a billion). In cellular networks, the received signal power decreases roughly as the propagation distance to the power of four. This results in the following data rate coverage behavior:
This figure considers an area covered by nine base stations, which are located at the middle of the nine peaks. Users that are close to one of the base stations receive the maximum downlink data rate, which in this case is 60 Mbit/s (e.g., spectral efficiency 6 bit/s/Hz over a 10 MHz channel). As a user moves away from a base station, the data rate drops rapidly. At the cell edge, where the user is equally distant from multiple base stations, the rate is nearly zero in this simulation. This is because the received signal power is low as compared to the receiver noise.
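The rapid drop in rate with distance can be sketched numerically. The example below is a minimal noise-limited model under stated assumptions: the reference SNR of 60 dB at 10 m is a hypothetical value chosen for illustration, interference is ignored, and the spectral efficiency is capped at the 6 bit/s/Hz over 10 MHz mentioned above. It is not the simulation behind the figure.

```python
import math


def downlink_rate_mbps(distance_m, snr_ref_db=60.0, ref_distance_m=10.0,
                       pathloss_exponent=4.0, bandwidth_mhz=10.0,
                       max_spectral_efficiency=6.0):
    """Noise-limited downlink rate at a given distance from the base station.

    snr_ref_db and ref_distance_m are hypothetical reference values;
    interference from other base stations is ignored.
    """
    # Received power falls as distance^(-pathloss_exponent), i.e. by
    # 10 * exponent * log10(distance ratio) in dB.
    loss_db = 10.0 * pathloss_exponent * math.log10(distance_m / ref_distance_m)
    snr = 10.0 ** ((snr_ref_db - loss_db) / 10.0)
    # Shannon spectral efficiency, capped by the highest supported rate.
    spectral_efficiency = min(math.log2(1.0 + snr), max_spectral_efficiency)
    return bandwidth_mhz * spectral_efficiency


for d in (10, 100, 300, 1000):
    print(f"{d:5d} m: {downlink_rate_mbps(d):6.2f} Mbit/s")
```

With an exponent of four, every doubling of the distance costs about 12 dB of received power, which is why the rate collapses towards the cell edge in this model.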
What can be done to improve the coverage?
One possibility is to increase the transmit power. This is mathematically equivalent to densifying the network, so that the area covered by each base station is smaller. The figure below shows what happens if we use 100 times more transmit power:
There are some visible differences as compared to Figure 1. First, the region around the base station that gives 60 Mbit/s is larger. Second, the data rates at the cell edge are slightly improved, but there are still large variations within the area. However, it is no longer the noise that limits the cell-edge rates—it is the interference from other base stations.
Ideally, we would like to increase only the power of the desired signals, while keeping the interference power fixed. This is what transmit precoding from a multi-antenna array can achieve; the transmitted signals from the multiple antennas at the base station add constructively only at the spatial location of the desired user. More precisely, the signal power is proportional to M (the number of antennas), while the interference power caused to other users is independent of M. The following figure shows the data rates when we go from 1 to 100 antennas:
Figure 3 shows that the data rates are increased for all users, but particularly for those at the cell edge. In this simulation, everyone is now guaranteed a minimum data rate of 30 Mbit/s, while 60 Mbit/s is delivered in a large fraction of the coverage area.
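The scaling claim (signal power proportional to M, interference power independent of M) can be checked with a short Monte Carlo sketch. The assumptions here are mine: i.i.d. Rayleigh fading with unit variance, perfect channel knowledge, maximum-ratio precoding, and unit total transmit power. The function name `mean_powers` is hypothetical.

```python
import numpy as np


def mean_powers(num_antennas, num_trials=4000, seed=0):
    """Average received signal and interference power with MR precoding.

    h is the channel to the desired user, g the channel to another user;
    both are i.i.d. CN(0, 1) per antenna (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    shape = (num_trials, num_antennas)
    h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    g = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    # Maximum-ratio precoder with unit norm, so total transmit power is fixed.
    w = h / np.linalg.norm(h, axis=1, keepdims=True)
    signal = np.abs(np.sum(np.conj(h) * w, axis=1)) ** 2        # |h^H w|^2
    interference = np.abs(np.sum(np.conj(g) * w, axis=1)) ** 2  # |g^H w|^2
    return signal.mean(), interference.mean()


for m in (1, 10, 100):
    s, i = mean_powers(m)
    print(f"M={m:3d}: signal={s:7.2f}, interference={i:5.2f}")
```

In this model the average signal power grows linearly with M, while the power leaked to the other user stays near one regardless of the array size.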
In practice, the propagation losses are not only distance-dependent, but also affected by other large-scale effects, such as shadowing. The properties described above remain nevertheless. Coherent precoding from a base station with many antennas can greatly improve the data rates for the cell edge users, since only the desired signal power (and not the interference power) is increased. Higher transmit power or smaller cells will only lead to an interference-limited regime where the cell-edge performance remains poor. A practical challenge with coherent precoding is that the base station needs to learn the user channels, but reciprocity-based Massive MIMO provides a scalable solution to that. That is why Massive MIMO is the key technology for delivering ubiquitous connectivity in 5G.
The main selling point of millimeter-wave communications is the abundant bandwidth available in such frequency bands; for example, 2 GHz of bandwidth instead of 20 MHz as in conventional cellular networks. The underlying argument is that the use of much wider bandwidths immediately leads to much higher capacities, in terms of bit/s, but the reality is not that simple.
To look into this, consider a communication system operating over a bandwidth of $B$ Hz. By assuming an additive white Gaussian noise channel, the capacity becomes
$$C = B \log_2\left(1 + \frac{P\beta}{N_0 B}\right)$$
where $P$ W is the transmit power, $\beta$ is the channel gain, and $N_0$ W/Hz is the power spectral density of the noise. The fraction inside the logarithm is referred to as the signal-to-noise ratio (SNR).
Since the bandwidth appears in front of the logarithm, it might seem that the capacity grows linearly with the bandwidth. This is not the case, since the noise term in the SNR also grows linearly with the bandwidth. This fact is illustrated by Figure 1 below, where we consider a system that achieves an SNR of 0 dB at a reference bandwidth of 20 MHz. As we increase the bandwidth towards 2 GHz, the capacity grows only modestly. Despite the 100 times more bandwidth, the capacity only improves by a factor of about 1.44, which is far from the factor of 100 that a linear increase would give.
The reason for this modest capacity growth is that the SNR decreases in inverse proportion to the bandwidth. One can show that
$$C \to \frac{P\beta}{N_0} \log_2(e) \quad \text{as } B \to \infty.$$
The convergence to this limit is seen in Figure 1 and is relatively fast since $\log_2(1+x) \approx x \log_2(e)$ for small $x$.
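The numbers in this example can be reproduced with a few lines of Python. This is a minimal sketch of the AWGN capacity with fixed transmit power, using the 0 dB SNR at 20 MHz reference point from the text; the function names are hypothetical.

```python
import math


def capacity_mbps(bandwidth_mhz, snr_ref_db=0.0, ref_bandwidth_mhz=20.0):
    """AWGN capacity in Mbit/s when the transmit power is held fixed.

    The SNR equals snr_ref_db at ref_bandwidth_mhz and shrinks in inverse
    proportion to the bandwidth, because the noise power grows with it.
    """
    snr_ref = 10.0 ** (snr_ref_db / 10.0)
    snr = snr_ref * ref_bandwidth_mhz / bandwidth_mhz
    return bandwidth_mhz * math.log2(1.0 + snr)


def capacity_limit_mbps(snr_ref_db=0.0, ref_bandwidth_mhz=20.0):
    """Limit of the capacity as the bandwidth grows without bound."""
    snr_ref = 10.0 ** (snr_ref_db / 10.0)
    # The product SNR * bandwidth is constant and equals P*beta/N0
    # (expressed in MHz here, so the result is in Mbit/s).
    return snr_ref * ref_bandwidth_mhz * math.log2(math.e)


for b in (20, 200, 2000):
    print(f"{b:5d} MHz: {capacity_mbps(b):6.2f} Mbit/s")
print(f"limit    : {capacity_limit_mbps():6.2f} Mbit/s")
```

Going from 20 MHz to 2 GHz raises the capacity from 20 Mbit/s to just under 29 Mbit/s in this model, which is already close to the infinite-bandwidth limit.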
To achieve a linear capacity growth, we need to keep the SNR fixed as the bandwidth increases. This can be achieved by increasing the transmit power proportionally to the bandwidth, which entails using more power when operating over a wider bandwidth. This might not be desirable in practice, at least not for battery-powered devices.
An alternative is to use beamforming to improve the channel gain. In a Massive MIMO system, the effective channel gain is $\beta M$, where $M$ is the number of antennas and $\beta$ is the gain of a single-antenna channel. Hence, we can increase the number of antennas proportionally to the bandwidth to keep the SNR fixed.
Figure 2 considers the same setup as in Figure 1, but now we also let either the transmit power or the number of antennas grow proportionally to the bandwidth. In both cases, we achieve a capacity that grows proportionally to the bandwidth, as we initially hoped for.
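The two scaling options can be compared in a small sketch. It assumes an ideal beamforming gain equal to the number of antennas and the same 0 dB SNR at 20 MHz reference as before; `scaled_capacity_mbps`, `num_antennas`, and `power_scale` are hypothetical names introduced for illustration.

```python
import math


def scaled_capacity_mbps(bandwidth_mhz, num_antennas=1, power_scale=1.0,
                         snr_ref=1.0, ref_bandwidth_mhz=20.0):
    """AWGN capacity in Mbit/s with optional power or antenna scaling.

    snr_ref is the linear SNR of a single-antenna link at ref_bandwidth_mhz.
    An ideal array gain of num_antennas is assumed (a simplification).
    """
    snr = (snr_ref * power_scale * num_antennas
           * ref_bandwidth_mhz / bandwidth_mhz)
    return bandwidth_mhz * math.log2(1.0 + snr)


bandwidth = 2000.0                   # MHz
growth = bandwidth / 20.0            # 100 times more bandwidth
print("fixed power, 1 antenna :", scaled_capacity_mbps(bandwidth))
print("100x transmit power    :", scaled_capacity_mbps(bandwidth, power_scale=growth))
print("100 antennas           :", scaled_capacity_mbps(bandwidth, num_antennas=int(growth)))
```

Both scaled cases keep the SNR at its reference value, so the capacity grows by the same factor as the bandwidth.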
In conclusion, to make efficient use of more bandwidth we require more transmit power or more antennas at the transmitter and/or receiver. It is worth noting that these requirements are purely due to the increase in bandwidth. In addition, for any given bandwidth, the operation at millimeter-wave frequencies requires much more transmit power and/or more antennas (e.g., additional constant-gain antennas or one constant-aperture antenna) just to achieve the same SNR as in a system operating at conventional frequencies below 5 GHz.
One of the main impairments in wireless communications is small-scale channel fading. This refers to random fluctuations in the channel gain, which are caused by microscopic changes in the propagation environments. The fluctuations make the channel unreliable, since occasionally the channel gain is very small and the transmitted data is then received in error.
The diversity achieved by sending a signal over multiple channels with independent realizations is key to combating small-scale fading. Spatial diversity is particularly attractive, since it can be obtained by simply having multiple antennas at the transmitter or the receiver. Suppose the probability of a bad channel gain realization is $p$. If we have $M$ antennas with independent channel gains, then the risk that all of them are bad is $p^M$. For example, with $p = 0.1$, there is a 10% risk of getting a bad channel in a single-antenna system and a 0.000001% risk in an 8-antenna system. This shows that just a few antennas can be sufficient to greatly improve reliability.
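The diversity arithmetic is easy to verify. The sketch below computes $p^M$ directly and cross-checks it with a simple simulation that assumes independent antennas; the function names are hypothetical.

```python
import numpy as np


def all_bad_probability(p, num_antennas):
    """Probability that every one of the independent antennas is bad."""
    return p ** num_antennas


def simulated_all_bad(p, num_antennas, num_trials=200_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = np.random.default_rng(seed)
    # Each antenna is independently "bad" with probability p.
    bad = rng.random((num_trials, num_antennas)) < p
    return bad.all(axis=1).mean()


for m in (1, 2, 4, 8):
    print(f"M={m}: risk = {100 * all_bad_probability(0.1, m):.6f} %")
print("simulated, M=2:", simulated_all_bad(0.1, 2))
```

The simulation is only practical for small M, since the event becomes too rare to observe; the closed-form value is what matters for larger arrays.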
In Massive MIMO systems, with a “massive” number of antennas at the base station, the spatial diversity also leads to something called “channel hardening”. This terminology was used already in a paper from 2004:
In short, channel hardening means that a fading channel behaves as if it were a non-fading channel. The randomness is still there but its impact on the communication is negligible. In the 2004 paper, the hardening is measured by dividing the instantaneous supported data rate by the fading-averaged data rate. If the relative fluctuations are small, then the channel has hardened.
Since Massive MIMO systems contain random interference, it is usually the hardening of the channel that the desired signal propagates over that is studied. If the channel is described by a random $M$-dimensional vector $h$, then the ratio $\|h\|^2 / E\{\|h\|^2\}$ between the instantaneous channel gain and its average is considered. If the fluctuations of the ratio are small, then there is channel hardening. With an independent Rayleigh fading channel, the variance of the ratio reduces with the number of antennas as $1/M$. The intuition is that the channel fluctuations average out over the antennas. A detailed analysis is available in a recent paper.
The figure above shows how the variance of $\|h\|^2 / E\{\|h\|^2\}$ decays with the number of antennas. The convergence towards zero is gradual and so is the channel hardening effect. I personally think that you need at least $M = 50$ to truly benefit from channel hardening.
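The 1/M decay can be reproduced with a short simulation. This sketch assumes i.i.d. Rayleigh fading with unit variance per antenna, so that the average channel gain equals the number of antennas; `hardening_variance` is a hypothetical name.

```python
import numpy as np


def hardening_variance(num_antennas, num_trials=10_000, seed=0):
    """Sample variance of ||h||^2 / E{||h||^2} under i.i.d. Rayleigh fading."""
    rng = np.random.default_rng(seed)
    shape = (num_trials, num_antennas)
    h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    gain = np.sum(np.abs(h) ** 2, axis=1)
    # With unit-variance entries, E{||h||^2} equals the number of antennas.
    ratio = gain / num_antennas
    return ratio.var()


for m in (1, 10, 50, 100):
    print(f"M={m:3d}: variance = {hardening_variance(m):.4f}, 1/M = {1 / m:.4f}")
```

The estimates track 1/M closely, which illustrates why the hardening effect builds up gradually rather than appearing at a sharp threshold.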
Many misconceptions float around about the pilot contamination phenomenon. While it exists in any multi-cellular system, its effect tends to be particularly pronounced in Massive MIMO due to the presence of coherent interference, which scales proportionally to the coherent beamforming gain. (Chapter 4 in Fundamentals of Massive MIMO gives the details.)
A good system design definitely must not ignore pilot interference. While it is easily removed “on the average” through greater-than-one reuse, the randomness present in wireless communications, especially the shadow fading, will occasionally cause a few terminals to be severely hit by pilot contamination and bring down their performance. This is problematic whenever we are concerned about the provision of uniformly great service in the cell, and that is one of the principal selling arguments for Massive MIMO. Notwithstanding, the impact of pilot contamination can be reduced significantly in practice by appropriate pilot reuse and judicious power control. (Chapters 5-6 in Fundamentals of Massive MIMO give many details.)
A more fundamental question is whether pilot contamination could be entirely overcome: Does there exist an upper bound on capacity that saturates as the number of antennas, M, is increased indefinitely? Some have speculated that it cannot be overcome, much in line with known capacity upper bounds for cellular base station cooperation. While this question may be of more academic than practical interest, it has long been open except in some trivial special cases: if the channels of two terminals lie in non-overlapping subspaces and Bayesian channel estimation is used, the channel estimates will not be contaminated, and capacity grows as log(M) when M increases without bound.
A much deeper result is established in this recent paper: the subspaces of the channel covariances may overlap, yet capacity grows as log(M). Technically, Rayleigh fading with spatial correlation is assumed, and the correlation matrices for the contaminating terminals need only be linearly independent as M goes to infinity (exact conditions in the paper). In retrospect, this is not unreasonable given the substantial a priori knowledge exploited by the Bayesian channel estimator, but I found it amazing how weak the required conditions on the correlation matrices are. It remains unclear whether the result generalizes to the case of a growing number of interferers: letting the number of antennas go to infinity and then growing the network is not the same thing as taking an “infinite” (scalable) network and increasing the number of antennas. But this paper elegantly and rigorously answers a long-standing question that has been the subject of much debate in the community, and it is a recommended read for anyone interested in the fundamental limits of Massive MIMO.
One term that is tightly connected with Massive MIMO is pilot contamination. This is a phenomenon that can appear in any communication system that operates under interference, but in this post, I will describe its basic properties in Massive MIMO.
The base station wants to know the channel responses of its user terminals and these are estimated in the uplink by sending pilot signals. Each pilot signal is corrupted by inter-cell interference and noise when received at the base station. For example, consider the scenario illustrated below where two terminals are transmitting simultaneously, so that the base station receives a superposition of their signals—that is, the desired pilot signal is contaminated.
When estimating the channel from the desired terminal, the base station cannot easily separate the signals from the two terminals. This has two key implications:
First, the interfering signal acts as colored noise that reduces the channel estimation accuracy.
Second, the base station unintentionally estimates a superposition of the channel from the desired terminal and from the interferer. Later, the desired terminal sends payload data and the base station wishes to coherently combine the received signal, using the channel estimate. It will then unintentionally and coherently combine part of the interfering signal as well. This is particularly poisonous when the base station has M antennas, since the array gain from the receive combining increases both the signal power and the interference power proportionally to M. Similarly, when the base station transmits a beamformed downlink signal towards its terminal, it will unintentionally direct some of the signal towards the interferer. This is illustrated below.
In the academic literature, pilot contamination is often studied under the assumption that the interfering terminal sends the same pilot signal as the desired terminal, but in practice any non-orthogonal interfering signal will cause the two effects described above.
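The second effect can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions of my own: both terminals have equally strong i.i.d. Rayleigh channels, they send the same pilot, the base station uses the despread pilot signal directly as a least-squares channel estimate, and it applies maximum-ratio combining. The names `signal_to_interference`, `contaminated`, and `noise_var` are hypothetical.

```python
import numpy as np


def crandn(rng, shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


def signal_to_interference(num_antennas, contaminated, noise_var=0.1,
                           num_trials=2000, seed=0):
    """Ratio of average signal to average interference power after combining.

    h is the desired channel and g the interferer's channel. If contaminated
    is True, the interferer used the same pilot, so the estimate contains g.
    """
    rng = np.random.default_rng(seed)
    shape = (num_trials, num_antennas)
    h = crandn(rng, shape)
    g = crandn(rng, shape)
    noise = np.sqrt(noise_var) * crandn(rng, shape)
    # Least-squares estimate: the received pilot signal itself.
    h_hat = h + noise + (g if contaminated else 0)
    # Maximum-ratio combining applies h_hat^H to the received data signal.
    signal = np.abs(np.sum(np.conj(h_hat) * h, axis=1)) ** 2
    interference = np.abs(np.sum(np.conj(h_hat) * g, axis=1)) ** 2
    return signal.mean() / interference.mean()


for m in (10, 100):
    clean = signal_to_interference(m, contaminated=False)
    dirty = signal_to_interference(m, contaminated=True)
    print(f"M={m:3d}: clean pilot ratio = {clean:6.1f}, contaminated ratio = {dirty:4.2f}")
```

With a clean pilot, the signal-to-interference ratio grows with the number of antennas. With a contaminated pilot, the interference is combined coherently as well, so in this equal-strength model the ratio stays near one no matter how many antennas are added.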