Ever since I finished the writing of the book Massive MIMO Networks: Spectral, Energy, and Hardware Efficiency, I have felt that I’m somewhat done with my research on conventional Massive MIMO. The spectral efficiency, energy efficiency, resource allocation, and pilot contamination phenomenon are well understood by now. This is not a bad thing—as researchers, we are supposed to solve the problems we are analyzing. But it means that this is a good time to look for new research directions. It should preferably be something where we can utilize our skills as Massive MIMO researchers to do something new and exciting!
With this in mind, I gathered a team consisting of myself, Luca Sanguinetti, Henk Wymeersch, Jakob Hoydis, and Thomas L. Marzetta. Each one of us has written about one promising new direction of research related to antenna arrays and MIMO, including the background of the topic, our long-term vision, and pertinent open problems. This resulted in the paper:
Massive MIMO is a Reality – What is Next? Five Promising Research Directions for Antenna Arrays
You can find the preprint on arXiv.org or by clicking on the name of the paper. I hope that you will find it as interesting to read as it was for us to write!
]]>She gives a webinar on state-of-the-art circuit implementations of Massive MIMO, and outlines future research challenges. The webinar is based on, among others, this paper.
In more detail, the webinar will summarize the fundamental technical contributions to efficient digital signal processing for Massive MIMO. The opportunities and constraints on operating on low-complexity RF and analog hardware chains are clarified. It will explain how terminals can benefit from improved energy efficiency. The status of technology and real-life prototypes will be discussed. Open challenges and directions for future research are suggested.
Listen to the webinar by following this link.
]]>I haven’t seen a list price, and I don’t know what exotic metals or licensing costs its manufacture requires, but let’s ponder the possibility that a CSAC could be manufactured for the mass market for a few dollars each. What new applications would then become viable in wireless?
The answer is mostly (or entirely) speculation. One potential application that might become more practical is positioning using distributed arrays. Another is distributed multipair relaying. Here and here are some specific ideas that are communication-theoretically beautiful, and probably powerful, but that seem to be perceived as unrealistic because of synchronization requirements. Perhaps CoMP and distributed MIMO, a.k.a. “cell-free Massive MIMO”, applications could also benefit.
Other applications might be found in IoT, for example, where a device only sporadically transmits information and wants to stay synchronized (perhaps there is no downlink, hence no way of reliably obtaining synchronization information). If a timing offset (or frequency offset for that matter) is unknown but constant over a very long time, it may be treated as a deterministic unknown and estimated. The difficulty with unknown time and frequency offsets is not their existence per se, but the fact that they change quickly over time.
It’s often said (and true) that the “low” speed of light is the main limiting factor in wireless. (Because channel state information is the main limiting factor of wireless communications. If light were faster, then channel coherence would be longer, so acquiring channel state information would be easier.) But maybe the unavailability of a ubiquitous, reliable time reference is another, almost as important, limiting factor. Can CSAC technology change that? I don’t know, but perhaps we ought to take a closer look.
]]>When an antenna array is used to focus a transmitted signal on a receiver, we call this beamforming (or precoding) and we usually illustrate it as shown to the right. This cartoonish illustration is only applicable when the antennas are gathered in a compact array and there is a line-of-sight channel to the receiver.
If we want to deploy very many antennas, as in Massive MIMO, it might be preferable to distribute the antennas over a larger area. One such deployment concept is called Cell-free Massive MIMO. The basic idea is to have many distributed antennas that are transmitting phase-coherently to the receiving user. In other words, the antennas’ signal components add constructively at the location of the user, just as when using a compact array for beamforming. It is therefore convenient to call it beamforming in both cases—algorithmically it is the same thing!
The question is: How can we illustrate the beamforming effect when using a distributed array?
The figure below shows how to do it. I consider a toy example with 80 star-marked antennas deployed along the sides of a square and these antennas are transmitting sinusoids with equal power, but different phases. The phases are selected to make the 80 sine-components phase-aligned at one particular point in space (where the receiving user is supposed to be):
Clearly, the “beamforming” from a distributed array does not give rise to a concentrated signal beam, but the signal amplification is confined to a small spatial region (where the color is blue and the values on the vertical axis are close to one). This is where the signal components from all the antennas are coherently combined. There are minor fluctuations in channel gain at other places, but the general trend is that the components are non-coherently combined everywhere except at the receiving user. (Roughly the same will happen in a rich multipath channel, even if a compact array is used for transmission.)
By looking at a two-dimensional version of the figure (see below), we can see that the coherent combination occurs in a circular region that is roughly half a wavelength in diameter. At the carrier frequencies used for cellular networks, this region will only be a few centimeters or millimeters wide. It is almost magical how this distributed array can amplify the signal in such a tiny spatial region! This spatial region is probably what the company Artemis is calling a personal cell (pCell) when marketing their distributed MIMO solution.
If you are into the details, you might wonder why I simulated a square region that is only a few wavelengths wide, and why the antenna spacing is only a quarter of a wavelength. These assumptions were made only for illustrative purposes. If the physical antenna locations were fixed but we reduced the wavelength, the circular region would shrink and the ripples would become more frequent. Hence, we would need to compute the channel gain at many more spatial sample points to produce a smooth plot.
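The toy example can be approximated in a few lines of NumPy. This is only a minimal sketch under stated assumptions, not the code from the GitHub repository: distances are measured in wavelengths, each antenna contributes a unit-amplitude phasor (distance-dependent attenuation is ignored), and the user location and evaluation grid are hypothetical choices.

```python
import numpy as np

def perimeter_antennas(per_side=20, spacing=0.25):
    """Antenna positions (in wavelengths) along the four sides of a square."""
    side = per_side * spacing
    t = np.arange(per_side) * spacing
    zeros, full = np.zeros(per_side), np.full(per_side, side)
    return np.concatenate([
        np.column_stack([t, zeros]),         # bottom side
        np.column_stack([full, t]),          # right side
        np.column_stack([side - t, full]),   # top side
        np.column_stack([zeros, side - t]),  # left side
    ])

def normalized_gain(antennas, target, points):
    """|sum of unit phasors|^2 / N^2 at each point, phase-aligned at target."""
    d_target = np.linalg.norm(antennas - target, axis=1)
    d_points = np.linalg.norm(points[:, None, :] - antennas[None, :, :], axis=2)
    # Each antenna pre-compensates its own propagation phase to the target,
    # so all phasors have zero phase at the target and add coherently there.
    field = np.exp(2j * np.pi * (d_target[None, :] - d_points)).sum(axis=1)
    return np.abs(field) ** 2 / len(antennas) ** 2

antennas = perimeter_antennas()    # 80 antennas, quarter-wavelength spacing
target = np.array([2.5, 2.5])      # hypothetical user location (wavelengths)
xs = np.linspace(0.1, 4.9, 97)     # evaluation grid inside the square
grid = np.array([[x, y] for x in xs for y in xs])
gain = normalized_gain(antennas, target, grid)
gain_at_target = normalized_gain(antennas, target, target[None, :])[0]
```

Reshaping `gain` to a 97-by-97 array and plotting it as a surface or heat map should give a picture of the same kind as the figures: a normalized gain of one at the user and much smaller values elsewhere. Shrinking the wavelength relative to the square would call for a denser grid, as noted above.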
Reproduce the results: The code that was used to produce the plots can be downloaded from my GitHub.
]]>I think it is fair to say that no fundamentally new beamforming methods have been developed in the Massive MIMO literature; rather, we have taken known methods and generalized them to take imperfect channel state information and other practical aspects into account. We have then developed rigorous ways to quantify the rates that these beamforming methods achieve and studied their asymptotic behavior with many antennas. Closed-form expressions are available in some special cases, while Monte Carlo simulations can be used to compute these expressions in other cases.
As beamforming has evolved from an analog phased-array concept, where angular beams are studied, to a digital concept where the beamforming is represented in multi-dimensional vector spaces, it is easy to forget the basic properties of array processing. That is why we dedicated Section 7.4 in Massive MIMO Networks to describing how the physical beam width and spatial resolution depend on the array geometry.
In particular, I’ve observed a lot of confusion about the dimensionality of MIMO arrays, which is probably rooted in the confusion around the difference between an antenna (which is something connected to an RF chain) and a radiating element. I explained this in detail in a previous blog post and then exemplified it based on a recent press release. I have also recorded the following video to visually explain these basic properties:
A recent white paper from Ericsson also provides a good description of these concepts, particularly focused on how an array with a given geometry can be implemented with different numbers of RF chains (i.e., different numbers of antennas) depending on the deployment scenario. While having as many antennas as radiating elements is preferable from a performance perspective, the Ericsson researchers argue that one can get away with fewer antennas in the vertical direction in deployments where it is very hard to resolve users in the elevation dimension anyway.
]]>The tedious, time-consuming, and buggy nature of system-level simulations is exacerbated with massive MIMO. This post offers some relief in the form of analytical expressions for downlink conjugate beamforming [1]. These expressions enable the testing and calibration of simulators, say to determine how many cells are needed to represent an infinitely large network with some desired accuracy. The trick that makes the analysis feasible is to let the shadowing grow strong, yet the ensuing expressions capture very well the behaviors with practical shadowing values.
The setting is an infinitely large cellular network where each -antenna base station (BS) serves single-antenna users. The large-scale channel gains include pathloss with exponent and shadowing having log-scale standard deviation , with the gain between the th BS and the th user served by a BS of interest denoted by . With conjugate beamforming and receivers reliant on channel hardening, the signal-to-interference ratio (SIR) at such user is [2]
where is the gain from the serving BS and is the share of that BS’s power allocated to user . Two power allocations can be analyzed:
The analysis is conducted for asymptotically strong shadowing (the log-scale standard deviation growing without bound), which makes it valid for arbitrary BS locations.
For notational compactness, let . Define as the solution to where is the lower incomplete gamma function. For , in particular, . Under a uniform power allocation, the CDF of is available in an explicit form involving the Gauss hypergeometric function (available in MATLAB and Mathematica):
where “” indicates asymptotic () equality, is such that the CDF is continuous, and
Alternatively, the CDF can be obtained by solving (e.g., with Mathematica) a single integral involving the Kummer function :
This latter solution can be modified for the SIR-equalizing power allocation as
The spectral efficiency of user is with CDF readily characterizable from the expressions given earlier. From , the sum spectral efficiency at the BS of interest can be found as Expressions for the averages and are further available in the form of single integrals.
With a uniform power allocation,
(1)
and . For the special case of , the Kummer function simplifies giving
(2)
With an equal-SIR power allocation
(3)
and .
Let us now contrast the analytical expressions (computable instantaneously and exactly, and valid for arbitrary topologies, but asymptotic in the shadowing strength) with some Monte-Carlo simulations (lengthy, noisy, and bug-prone, but for precise shadowing strengths and topologies).
First, we simulate a 500-cell hexagonal lattice with , and . Figs. 1a-1b compare the simulations for – dB with the analysis. The behaviors with these typical outdoor values of are well represented by the analysis and, as it turns out, rigidly homogeneous networks such as this one are where the gap is largest.
For a more irregular deployment, let us next consider a network whose BSs are uniformly distributed. BSs (500 on average) are dropped around a central one of interest. For each network snapshot, users are then uniformly dropped until of them are served by the central BS. As before, , and . Figs. 2a-2b compare the simulations for dB with the analysis, and the agreement is now complete. The simulated average spectral efficiency with a uniform power allocation is b/s/Hz/user while (2) gives b/s/Hz/user.
The analysis presented in this post is not without limitations, chiefly the absence of noise and pilot contamination. However, as argued in [1], there is a broad operating range (– with very conservative premises) where these effects are rather minor, and the analysis is hence applicable.
[1] G. George, A. Lozano, M. Haenggi, “Massive MIMO forward link analysis for cellular networks,” arXiv:1811.00110, 2018.
[2] T. Marzetta, E. Larsson, H. Yang, and H. Ngo, Fundamentals of Massive MIMO. Cambridge University Press, 2016.
[3] H. Yang and T. L. Marzetta, “A macro cellular wireless network with uniformly high user throughputs,” IEEE Veh. Techn. Conf. (VTC’14), Sep. 2014.
Consider the communication link between a single-antenna user and an -antenna base station (BS). The channel vector varies over time and frequency in a way that is often modeled as random fading. In each channel coherence block, the BS selects a precoding vector and uses it for downlink transmission. The precoding reduces the multiantenna vector channel to an effective single-antenna scalar channel
The receiving user does not need to know the full -dimensional vectors and . However, to decode the downlink data in a successful way, it needs to learn the complex scalar channel . The difficulty in learning depends strongly on the mechanism of precoding selection. Two examples are considered below.
Codebook-based precoding
In this case, the BS tries out a set of different precoding vectors from a codebook (e.g., a grid of beams, as shown to the right) by sending one downlink pilot signal through each one of them. The user measures for each one of them and feeds back the index of the one that maximizes the channel gain . The BS will then transmit data using that precoding vector. During the data transmission, can have any phase, but the user already knows the phase and can compensate for it in the decoding algorithm.
If multiple users are spatially multiplexed in the downlink, the BS might use another precoding vector than the one selected by the user. For example, regularized zero-forcing might be used to suppress interference. In that case, the magnitude of the channel changes, but the phase remains the same. If phase-shift keying (PSK) is used for communication, such that no information is encoded in the signal amplitude, no estimation of is needed for decoding (but it can help to reduce the error probability). If quadrature amplitude modulation (QAM) is used instead, the user needs to learn also to decode the data. The unknown magnitude can be estimated blindly based on the received signals. Hence, no further pilot transmission is needed.
Reciprocity-based precoding
In this case, the user transmits a pilot signal in the uplink, which enables the BS to directly estimate the entire channel vector . For the sake of argument, suppose this estimation is perfect and that maximum ratio transmission with is used for downlink data transmission. The effective channel gain will then be
which is a positive scalar. Hence, the user only needs to learn the magnitude of because the phase is always zero. Estimation of can be implemented without downlink pilots, either by relying on channel hardening or by blind estimation based on the received signals. The former only works well in Massive MIMO with very many antennas, while the latter can be done in any system (including codebook-based precoding).
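The two properties used in this argument, a real-valued positive effective channel and channel hardening, can be checked numerically. The sketch below is only an illustration under assumptions that are not taken from the text: i.i.d. Rayleigh fading, perfect channel knowledge at the BS, and a unit-norm maximum ratio precoder (the conjugate of the channel divided by its norm). The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def rayleigh_channel(num_antennas):
    """One i.i.d. CN(0, 1) channel vector (assumed fading model)."""
    real = rng.standard_normal(num_antennas)
    imag = rng.standard_normal(num_antennas)
    return (real + 1j * imag) / np.sqrt(2)

def mrt_effective_channel(h):
    """Effective scalar channel h^T w with the unit-norm MRT precoder."""
    w = np.conj(h) / np.linalg.norm(h)
    return h @ w  # equals ||h||: real-valued and positive, so the phase is zero

def relative_spread(num_antennas, trials=500):
    """Standard deviation over mean of the effective channel gain."""
    gains = np.array([mrt_effective_channel(rayleigh_channel(num_antennas)).real
                      for _ in range(trials)])
    return gains.std() / gains.mean()

for num_antennas in (4, 64, 1024):
    print(num_antennas, relative_spread(num_antennas))
```

The printed relative spread shrinks as the number of antennas grows, which is the channel hardening that lets the user rely on the average gain instead of downlink pilots. With few antennas the spread is large, which is where blind estimation from the received signals becomes the more sensible option.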
Conclusion
We generally need to compensate for the channel’s phase-shift at some place in a wireless system. In codebook-based precoding, the compensation is done at the user-side, based on the received signals from the downlink pilots. This is the main approach in 4G systems, which is why downlink pilots are so commonly used. In contrast, when using reciprocity-based precoding, the phase-shifts are compensated for at the BS-side using the uplink pilots. In either case, explicit pilot signals are only needed in one direction: uplink or downlink. If the estimation is imperfect, there will be some remaining phase ambiguity, which can be estimated blindly since we know that it is small (i.e., of all possible phase-rotations that could have resulted in the received signal, the smallest one is most likely).
When we have access to TDD spectrum, we can choose between the two precoding methods mentioned above. The reciprocity-based approach is preferable because it requires less overhead signaling: one pilot per user instead of one per index in the codebook (the codebook size needs to grow with the number of antennas), and no feedback. That is why this approach is taken in the canonical form of Massive MIMO.
]]>Moreover, there is a series of research papers (e.g., Ref1, Ref2, Ref3) that treat the pilot and data powers as two separate optimization variables that can be optimized with respect to some performance metric, under a constraint on the total energy budget per transmission/coherence block. This gives the flexibility to “move” power from data to pilots for users at the cell edge, to improve the channel state information that the base station acquires and thereby the array gain that it obtains when decoding the data signals received over the antennas.
In some cases, it is theoretically preferable to assign, for example, 20 dB higher power to pilots than to data. But does that make practical sense, bearing in mind that non-linear amplifiers are used and the peak-to-average-power ratio (PAPR) is then a concern? The answer depends on how the pilots and data are allocated over the time-frequency grid. In OFDM systems, which have an inherently high PAPR, it is discouraged to have large power differences between OFDM symbols (i.e., consecutive symbols in the time domain) since this will further increase the PAPR. However, it is perfectly fine to assign the power in an unequal manner over the subcarriers.
In the OFDM literature, there are two elementary ways to allocate pilots: block and comb type arrangements. These are illustrated in the figure below and some early references on the topic are Ref4, Ref5, Ref6.
(a): In the block type arrangement, at a given OFDM symbol time, all subcarriers either contain pilots or data. It is then preferable for a user terminal to use the same transmit power for pilots and data, to not get a prohibitively high PAPR. This is consistent with the assumptions made in the book Massive MIMO Networks.
(b): In the comb type arrangement, some subcarriers always contain pilots and other subcarriers always contain data. It is then possible to assign different power to pilots and data at a user terminal. The power can be moved from pilot subcarriers to data subcarriers or vice versa, without a major impact on the PAPR. This approach enables the type of unequal pilot and data power allocations considered in Fundamentals of Massive MIMO or research papers that optimize the pilot and data powers under a total energy budget per coherence block.
The downlink in LTE uses a variation of the two elementary pilot arrangements, as illustrated in (c). It is most easily described as a comb type arrangement where some pilots are omitted and replaced with data. The number of omitted pilots depends on how many antenna ports are used; the more antennas, the more similar the pilot arrangement becomes to the comb type. Hence, unequal pilot and data power allocation is possible in LTE but maybe not as easy to implement as described above. 5G has a more flexible frame structure but supports the same arrangements as LTE.
In summary, uplink pilots and data can be transmitted at different power levels, and this flexibility can be utilized to improve the performance in Massive MIMO. It does, however, require that the pilots are arranged in practically suitable ways, such as the comb type arrangement.
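To experiment with how the two arrangements interact with the PAPR, one can generate toy OFDM signals and measure the ratio directly. The sketch below is only a starting point under assumptions that are not from the text: random QPSK is used for both pilots and data, there is no cyclic prefix or pulse shaping, and the subcarrier count, pilot spacing, and boost values are hypothetical. It shows how to measure the PAPR and makes no claim about specific numbers.

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of a complex baseband signal, in dB."""
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max() / power.mean())

def ofdm_symbol(freq_symbols):
    """Time-domain OFDM symbol (no cyclic prefix), unitary IFFT scaling."""
    return np.fft.ifft(freq_symbols) * np.sqrt(len(freq_symbols))

def random_qpsk(n, rng):
    return (rng.choice([-1, 1], n) + 1j * rng.choice([-1, 1], n)) / np.sqrt(2)

def comb_symbol(rng, n=256, pilot_spacing=8, pilot_boost_db=0.0):
    """Comb type: every pilot_spacing-th subcarrier carries a boosted pilot."""
    amplitude = np.ones(n)
    amplitude[::pilot_spacing] = 10 ** (pilot_boost_db / 20)
    return ofdm_symbol(amplitude * random_qpsk(n, rng))

def block_frame(rng, n=256, num_data_symbols=6, pilot_boost_db=0.0):
    """Block type: one all-pilot OFDM symbol followed by all-data symbols."""
    pilot = 10 ** (pilot_boost_db / 20) * ofdm_symbol(random_qpsk(n, rng))
    data = [ofdm_symbol(random_qpsk(n, rng)) for _ in range(num_data_symbols)]
    return np.concatenate([pilot] + data)

rng = np.random.default_rng(0)
for boost_db in (0.0, 6.0):
    comb = papr_db(comb_symbol(rng, pilot_boost_db=boost_db))
    block = papr_db(block_frame(rng, pilot_boost_db=boost_db))
    print(f"boost {boost_db:4.1f} dB: comb {comb:5.2f} dB, block {block:5.2f} dB")
```

In the block-type frame the ratio is measured across consecutive OFDM symbols, so boosting the pilot symbol raises the peak power relative to the frame average. In the comb-type symbol the boosted pilots are spread over subcarriers within the same symbol. Averaging over many random draws would be needed before drawing any conclusion from the printed values.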
]]>1. UAV cell selection and association
As depicted in Figure 1(a), most existing cellular BSs create a fixed beampattern towards the ground. Thanks to this, ground users tend to perceive a strong signal strength from nearby BSs, which they use for connecting to the network. Instead, aerial users such as the red drone in Figure 1(a) only receive weak sidelobe-generated signals from a nearby BS when flying above it. This results in a deployment planning issue as illustrated in Figure 1(b), where due to the radiation of a strong sidelobe, the tri-sector BSs located at the origin can be the preferred server for far-flung UAVs (red spots). Consequently, these UAVs might experience strong interference, since they perceive signals from a multiplicity of BSs with similar power.
On the other hand, thanks to their capability of beamforming the synchronization signals used for user association, massive MIMO systems ensure that aerial users generally connect to a nearby BS. This optimized association enhances the robustness of the mobility procedures, as well as the downlink and uplink data phases.
2. Downlink BS-to-UAV transmissions
During the downlink data phase, UAV users are very sensitive to the strong inter-cell interference generated from a plurality of BSs, which are likely to be in line-of-sight. This may result in performance degradation, preventing UAVs from receiving critical C&C information, which has an approximate rate requirement of 60-100 kbps. Indeed, Figure 2 shows how conventional cellular networks (‘SU’) can only guarantee 100 kbps to a mere 6% of the UAVs flying at 150 meters. A conventional massive MIMO system (‘mMIMO’) enhances the data rates, albeit only 33% of the UAVs reach 100 kbps when they fly at 300 meters. This is due to a well-known effect: pilot contamination. Such an effect is particularly severe in scenarios with UAV users, since they can create strong uplink interference to many line-of-sight BSs simultaneously. In contrast, the pilot contamination decays much faster with distance for ground UEs.
In a nutshell, Figure 2 tells us that complementing conventional massive MIMO with explicit inter-cell interference suppression (‘mMIMO w/ nulls’) is essential when supporting UAVs at high altitudes. In a ‘mMIMO w/ nulls’ system, BSs incorporate additional signal processing features that enable them to perform a twofold task. First, leveraging channel directionality, BSs can spatially separate non-orthogonal pilots transmitted by different UAVs. Second, by dedicating a certain number of spatial degrees of freedom to place radiation nulls, BSs can mitigate interference in the directions corresponding to users in other cells that are most vulnerable to the BS’s interference. Indeed, these additional capabilities dramatically increase the percentage of UAVs that meet the 100 kbps requirement when these are flying at 300 m, from 33% (‘mMIMO’) to a whopping 98% (‘mMIMO w/ nulls’).
3. Uplink UAV-to-BS transmissions
Unlike the downlink, where UAVs should be protected to prevent a significant performance degradation, it is the ground users who we should care about in the uplink. This is because line-of-sight UAVs can generate strong interference towards many BSs, therefore overwhelming the weaker signals transmitted by non-line-of-sight ground users. The consequences of such a phenomenon are illustrated in Figure 3, where the uplink rates of ground users plummet as the number of UAVs increases.
Again, ‘mMIMO w/nulls’ – incorporating additional space-domain inter-cell interference suppression capabilities – can solve the above issue and guarantee a better performance for legacy ground users.
Overall, the efforts towards realizing aerial wireless networks are just commencing, and massive MIMO will likely play a key role. In the exciting era of fly-and-connect, we must revisit our understanding of cellular networks and develop novel architectures and techniques, catering not only for roads and buildings, but also for the sky.
]]>When reading papers on pilot (de)contamination written by many different authors, I’ve noticed one recurrent issue: the mean-squared error (MSE) is used to measure the level of pilot contamination. A few papers only plot the MSE, while most papers contain multiple MSE plots and then one or two plots with bit-error-rates or achievable rates. As I will explain below, the MSE is a rather poor measure of pilot contamination since it cannot distinguish between noise and pilot contamination.
A simple example
Suppose the desired uplink signal is received with power and is disturbed by noise with power and interference from another user with power . By varying the variable between 0 and 1 in this simple example, we can study how the performance changes when moving power from the noise to the interference, and vice versa.
By following the standard approach for channel estimation based on uplink pilots (see Fundamentals of Massive MIMO), the MSE for i.i.d. Rayleigh fading channels is
which is independent of and, hence, does not care about whether the disturbance comes from noise or interference. This is rather intuitive since both the noise and interference are additive i.i.d. Gaussian random variables in this example. The important difference appears in the data transmission phase, where the noise takes a new independent realization and the interference is strongly correlated with the interference in the pilot phase, because it is the product of a new scalar signal and the same channel vector.
To demonstrate the important difference, suppose maximum ratio combining is used to detect the uplink data. The effective uplink signal-to-interference-and-noise-ratio (SINR) is
where is the number of antennas. For any given MSE value, it now matters how it was generated, because the SINR is a decreasing function of . The term is due to pilot contamination (it is often called coherent interference) and is proportional to the interference power . When the number of antennas is large, it is far better to have more noise during the pilot transmission than more interference!
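The contrast between the MSE and the SINR can be reproduced with a small Monte Carlo experiment. The sketch below is an illustration under explicit assumptions, since the formulas themselves are not reproduced here: the desired power is called `p`, the variable that moves power from noise to interference is called `a` (interference power `a`, noise power `1 - a`), the channels are i.i.d. Rayleigh fading with unit variance, the estimator is the linear MMSE estimator for that model, and the SINR is the standard use-and-then-forget bound with maximum ratio combining. All names and default values are hypothetical.

```python
import numpy as np

def pilot_experiment(a, p=1.0, M=100, trials=2000, seed=0):
    """Return (per-antenna MSE, SINR with MRC) for a noise/interference split a."""
    rng = np.random.default_rng(seed)

    def cn(variance):
        z = rng.standard_normal((trials, M)) + 1j * rng.standard_normal((trials, M))
        return np.sqrt(variance / 2) * z

    h, g = cn(1.0), cn(1.0)  # desired and interfering channels
    # Pilot phase: desired pilot + interfering pilot + noise of power 1 - a.
    y = np.sqrt(p) * h + np.sqrt(a) * g + cn(1.0 - a)
    h_hat = np.sqrt(p) / (p + 1.0) * y  # LMMSE estimate; total disturbance is 1
    mse = np.mean(np.abs(h - h_hat) ** 2)

    # Data phase with MRC (v = h_hat): same channels, fresh noise realization.
    vh = np.sum(np.conj(h_hat) * h, axis=1)
    vg = np.sum(np.conj(h_hat) * g, axis=1)
    vv = np.sum(np.abs(h_hat) ** 2, axis=1)
    signal = p * np.abs(np.mean(vh)) ** 2
    total = (p * np.mean(np.abs(vh) ** 2)
             + a * np.mean(np.abs(vg) ** 2)
             + (1.0 - a) * np.mean(vv))
    return mse, signal / (total - signal)

mse_noise, sinr_noise = pilot_experiment(a=0.0)  # disturbance is all noise
mse_intf, sinr_intf = pilot_experiment(a=1.0)    # disturbance is all interference
print(f"all noise:        MSE {mse_noise:.3f}, SINR {sinr_noise:.2f}")
print(f"all interference: MSE {mse_intf:.3f}, SINR {sinr_intf:.2f}")
```

Under these assumptions the two MSE values agree up to Monte Carlo noise, while the SINR is far lower when the disturbance is interference, because the interfering channel is the same in the pilot and data phases and therefore combines coherently.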
Implications
Since the MSE cannot separate noise from interference, we should not try to measure the effectiveness of a “pilot decontamination” algorithm by considering the MSE. An algorithm that achieves a low MSE can potentially be mitigating the noise, leaving the interference unaffected. If that is the case, the pilot contamination term will remain. The MSE has been used far too often when evaluating pilot decontamination algorithms, and a few papers (I found three while writing this post) considered only the MSE, which opens the door for questioning their conclusions.
The right methodology is to compute the SINR (or some other performance indicator in the data phase) with the proposed pilot decontamination algorithm and with competing algorithms. In that case, we can be sure that the full impact of the pilot contamination is taken into account.
]]>And in this video clip I talk more in general about our book, Fundamentals of Massive MIMO:
Clearly, we will see a larger focus on TDD in future networks, but there are some traditional disadvantages with TDD that we need to bear in mind when designing these networks. I describe the three main ones below.
Link budget
Even if we allocate the same amount of time-frequency resources to uplink and downlink in TDD and FDD operation, there is an important difference. We transmit over half the bandwidth all the time in FDD, while we transmit over the whole bandwidth half of the time in TDD. Since the power amplifier is only active half of the time, if the peak power is the same, the average radiated power is effectively cut in half. This means that the SNR is 3 dB lower in TDD than in FDD, when transmitting at maximum peak power.
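The 3 dB figure is simply the active fraction of time expressed in decibels. A minimal sketch, where the function name and the `active_fraction` parameter are hypothetical and the comparison assumes the same peak power in TDD and FDD:

```python
import math

def tdd_snr_loss_db(active_fraction=0.5):
    """SNR loss relative to continuous transmission at the same peak power."""
    return -10 * math.log10(active_fraction)

print(round(tdd_snr_loss_db(0.5), 2))  # half of the time active
```

An uneven split between uplink and downlink would change the number for each direction; for example, a direction that is active only a quarter of the time would lose about 6 dB under the same assumption.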
Massive MIMO systems are generally interference-limited and use power control to assign a reduced transmit power to most users, thus the impact of the 3 dB SNR loss at maximum peak power is immaterial in many cases. However, there will always be some unfortunate low-SNR users (e.g., at the cell edge) that would like to communicate at maximum peak power in both uplink and downlink, and therefore suffer from the 3 dB SNR loss. If these users are still able to connect to the base station, the beamforming gain provided by Massive MIMO will probably more than compensate for the loss in link budget compared with single-antenna systems. One can discuss if it should be the peak power or average radiated power that is constrained in practice.
Guard period
Everyone in the cell should operate in uplink and downlink mode at the same time in TDD. Since the users are at different distances from the base station and have different delay spreads, they will receive the end of the downlink transmission block at different time instances. If a cell center user starts to transmit in the uplink immediately after receiving the full downlink block, then users at the cell edge will receive a combination of the delayed downlink transmission and the cell center users’ uplink transmissions. To avoid such uplink-downlink interference, there is a guard period in TDD so that all users wait with uplink transmission until the outermost users are done with the downlink.
In fact, the base station gives every user a timing bias to make sure that when the uplink commences, the users’ uplink signals are received in a time-synchronized fashion at the base station. Therefore, the outermost users will start transmitting in the uplink before the cell center users. Thanks to this feature, the largest guard period is needed when switching from downlink to uplink, while the uplink to downlink switching period can be short. This is positive for Massive MIMO operation since we want to use uplink CSI in the next downlink block, but not the other way around.
The guard period in TDD must become larger when the cell size increases, meaning that a larger fraction of the transmission resources disappears. Since no guard periods are needed in FDD, the largest benefits of TDD will be seen in urban scenarios where the macro cells have a radius of a few hundred meters and the delay spread is short.
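A back-of-the-envelope calculation shows how the guard period scales with the cell size. The sketch below rests on an assumption that is not spelled out in the text: the downlink-to-uplink guard period must cover at least the round-trip propagation delay to the outermost user, plus the delay spread and any hardware switching time. The function and parameter names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def min_guard_period_s(cell_radius_m, delay_spread_s=0.0, switch_time_s=0.0):
    """Assumed lower bound: round trip to the cell edge + delay spread + switching."""
    round_trip_s = 2 * cell_radius_m / SPEED_OF_LIGHT
    return round_trip_s + delay_spread_s + switch_time_s

for radius_m in (300, 3_000, 30_000):
    guard_us = min_guard_period_s(radius_m) * 1e6
    print(f"radius {radius_m:6d} m: at least {guard_us:7.2f} microseconds")
```

Under this assumption the propagation part grows linearly with the radius, from about 2 microseconds at 300 m to about 200 microseconds at 30 km, which is why the overhead matters most in large cells.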
Inter-cell synchronization
We want to avoid interference between uplink and downlink within a cell and the same thing applies for the inter-cell interference. The base stations in different cells should be fairly time-synchronized so that the uplink and downlink take place at the same time; otherwise, it might happen that a cell-edge user receives a downlink signal from its own base station and suffers interference from the uplink transmission of a neighboring user that connects to another base station.
This can also be an issue between telecom operators that use neighboring frequency bands. There are strict regulations on the permitted out-of-band radiation, but the out-of-band interference can still be larger than the desired inband signal if the interferer is very close to the receiving inband user. Hence, it is preferable that the telecom operators also synchronize their switching between uplink and downlink.
Summary
Massive MIMO will bring great gains in spectral efficiency in future cellular networks, but we should not forget about the traditional disadvantages of TDD operation: 3 dB loss in SNR at peak power transmission, larger guard periods in larger cells, and time synchronization between neighboring base stations.
]]>Fortunately, the 1600 bit/sample that are effectively produced by 100 16-bit ADCs are much more than what is needed to communicate at practical SINRs. For this reason, there is plenty of research on Massive MIMO base stations equipped with lower-resolution ADCs. The use of 1-bit ADCs has received particular attention. Some good paper references are provided in a previous blog post: Are 1-bit ADCs sufficient? While many early works considered narrowband channels, recent papers (e.g., Quantized massive MU-MIMO-OFDM uplink) have demonstrated that 1-bit ADCs can also be used in practical frequency-selective wideband channels. I’m impressed by the analytical depth of these papers, but I don’t think it is practically meaningful to use 1-bit ADCs.
Do we really need 1-bit ADCs?
I think the answer is no in most situations. The reason is that ADCs with a resolution of around 6 bits strike a much better balance between communication performance and power consumption. The state-of-the-art 6-bit ADCs are already very energy-efficient. For example, the paper “A 5.5mW 6b 5GS/S 4×-lnterleaved 3b/cycle SAR ADC in 65nm CMOS” from ISSCC 2015 describes a 6-bit ADC that consumes 5.5 mW and has a huge sampling rate of 5 Gsample/s, which is sufficient even for extreme mmWave applications with 1 GHz of bandwidth. In a base station equipped with 100 of these 6-bit ADCs, less than 1 W is consumed by the ADCs. That will likely be a negligible factor in the total power consumption of any base station, so what is the point in using a lower resolution than that?
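As a quick sanity check of these numbers, here is a small Python sketch that tallies the raw ADC output and the total ADC power of a base station with 100 ADCs. The resolutions and the 5.5 mW figure are taken from the text; the function names are only illustrative.

```python
def total_bits_per_sample(num_adcs, resolution_bits):
    """Bits produced by the whole array at each sampling instant."""
    return num_adcs * resolution_bits


def total_adc_power_w(num_adcs, power_per_adc_w):
    """Total ADC power in watts, ignoring everything else in the receiver."""
    return num_adcs * power_per_adc_w


print(total_bits_per_sample(100, 16))            # 16-bit ADCs
print(total_bits_per_sample(100, 6))             # 6-bit ADCs
print(round(total_adc_power_w(100, 5.5e-3), 3))  # 6-bit ADCs at 5.5 mW each
```

The 6-bit array still produces several hundred bits per sampling instant while staying well below 1 W.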
The use of 1-bit ADCs comes with a substantial loss in communication rate. In contrast, there is a consensus that Massive MIMO with 3-5 bits per ADC performs very close to the unquantized case (see Paper 1, Paper 2, Paper 3, Paper 4, Paper 5). The same applies for 6-bit ADCs, which provide an additional margin that protects against strong interference. Note that there is nothing magical with 6-bit ADCs; maybe 5-bit or 7-bit ADCs will be even better, but I don’t think it is meaningful to use 1-bit ADCs.
Will 1-bit ADCs ever become useful?
To select a 1-bit ADC, instead of an ADC with higher resolution, the energy consumption of the receiving device must be extremely constrained. I don’t think that will ever be the case in base stations, because the power amplifiers are dominating their energy consumption. However, the case might be different for internet-of-things devices that are supposed to run for ten years on the same battery. To make 1-bit ADCs meaningful, we need to greatly simplify all the other hardware components as well. One potential approach is to make a dedicated spatial-temporal waveform design, as described in this paper.
]]>The derivation was based on a very simple third-order polynomial model. Questioning that model, or contesting the conclusions? Let’s run WebLab. WebLab is a web-server-based interface to a real power amplifier operating in the lab, developed and run by colleagues at Chalmers University of Technology in Sweden. Anyone can access the equipment in real time (though there might be a queue) by submitting a waveform and retrieving the amplified waveform using a special Matlab function, “weblab.m”, obtainable from their webpages. Since accurate characterization and modeling of amplifiers is a hard nonlinear identification problem, WebLab is a great tool to researchers who want to go beyond polynomial and truncated Volterra-type toy models.
A -spaced uniform linear array with 50 elements beamforms in free space line-of-sight to two terminals at (arbitrarily chosen) angles of -9 and +34 degrees, respectively. A sinusoid with frequency is sent to the first terminal, and a sinusoid with frequency is transmitted to the other terminal. (Frequencies are in discrete time, see the WebLab documentation for details.) The actual radiation diagram is computed numerically: line-of-sight in free space is fairly uncontroversial, since superposition for wave propagation applies. However, importantly, the actual amplification of all signals is run on actual hardware in the lab.
The computed radiation diagram is shown below. (Some lines overlap.) There are two large peaks at -9 and +34 degrees angle, corresponding to the two signals of interest with frequencies and . There are also secondary peaks, at angles approximately -44 and -64 degrees, at frequencies different from respectively . These peaks originate from intermodulation products, and represent the out-of-band radiation caused by the amplifier non-linearity. (Homework: read the paper and verify that these angles are equal to those predicted by the theory.)
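For readers who want a head start on the homework, here is a minimal Python sketch. It rests on assumptions that are not stated in the text above: the array spacing is taken to be half a wavelength, only the third-order products (of the type "two times one frequency minus the other") are considered, and such a product is assumed to be beamformed in the direction whose sine equals two times the sine of one beam angle minus the sine of the other, wrapped into the visible region. The function names are hypothetical.

```python
import math


def wrap_to_visible_region(u):
    """Wrap a sin-domain direction into [-1, 1), as a half-wavelength array would."""
    return (u + 1.0) % 2.0 - 1.0


def third_order_beam_angles(theta1_deg, theta2_deg):
    """Predicted directions (degrees) of the two third-order intermodulation beams."""
    s1 = math.sin(math.radians(theta1_deg))
    s2 = math.sin(math.radians(theta2_deg))
    products = (2 * s1 - s2, 2 * s2 - s1)
    return [math.degrees(math.asin(wrap_to_visible_region(u))) for u in products]


print([round(angle, 1) for angle in third_order_beam_angles(-9.0, 34.0)])
```

Under these assumptions, both predicted directions fall within a few degrees of the approximate secondary peak angles quoted above.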
The Matlab code for reproduction of this experiment can be downloaded here.
]]>Contemporary multiantenna base stations for cellular communications are equipped with 2-8 antennas, which are deployed along a horizontal line. One example is a uniform linear array (ULA), as illustrated in Figure 1 below, where the antenna spacing is uniform. All the antennas in the ULA have the same physical down-tilt, with respect to the ground, and a fixed radiation pattern and directivity.
By sending the same signal from all antennas, but with different phase-shifts, we can steer beams in different angular directions and thereby make the directivity of the radiated signal different from the directivity of the individual antennas. Since the antennas are deployed on a one-dimensional horizontal line in this example, the ULA can only steer beams in the two-dimensional (2D) azimuth plane as illustrated in Figure 1. The elevation angle is the same for all beams, which is why this is called 2D beamforming. The beamwidth in the azimuth domain shrinks the more antennas are deployed. If the array is used for multiuser MIMO, then multiple beams with different azimuth angles are created simultaneously, as illustrated by the colored beams in Figure 1.
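The phase-shift principle can be sketched in a few lines of Python with NumPy. This is only an illustration under my own assumptions: a half-wavelength-spaced ULA, free-space line-of-sight, isotropic elements, and a steering angle of 20 degrees; none of these values come from the figures.

```python
import numpy as np


def ula_response(num_antennas, angles_deg, spacing_wavelengths=0.5):
    """Array response vectors (one row per angle) of a uniform linear array."""
    angles = np.atleast_1d(np.radians(angles_deg))
    phase_step = 2 * np.pi * spacing_wavelengths * np.sin(angles)
    return np.exp(1j * np.outer(phase_step, np.arange(num_antennas)))


def steered_gain(num_antennas, steer_deg, scan_deg):
    """Power gain versus angle when the phase shifts point the beam at steer_deg."""
    weights = ula_response(num_antennas, steer_deg)[0].conj() / np.sqrt(num_antennas)
    return np.abs(ula_response(num_antennas, scan_deg) @ weights) ** 2


scan_deg = np.linspace(-90.0, 90.0, 721)  # 0.25 degree grid
for num_antennas in (4, 8, 16):
    gain = steered_gain(num_antennas, 20.0, scan_deg)
    peak_deg = scan_deg[np.argmax(gain)]
    half_power_width_deg = np.count_nonzero(gain >= gain.max() / 2) * 0.25
    print(num_antennas, peak_deg, half_power_width_deg)
```

The peak stays at the steering angle, the peak gain equals the number of antennas, and the half-power beamwidth shrinks as antennas are added.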
If we rotate the ULA so that the antennas are instead deployed at different heights above the ground, then the array can instead steer beams in different elevation angles. This is illustrated in Figure 2. Note that this is still a form of 2D beamforming since every beam will have the same directivity with respect to the azimuth plane. This antenna array can be used to steer beams towards users at different floors of a building. It is also useful to serve flying objects, such as UAVs, jointly with ground users. The beamwidth in the elevation domain shrinks the more antennas are deployed.
If we instead deploy multiple ULAs on top of each other, it is possible to control both the azimuth and elevation angle of a beam. This is called 3D beamforming (or full-dimensional MIMO) and is illustrated in Figure 3 using a planar array with a “massive” number of antennas. This gives the flexibility to not only steer beams towards different buildings but also towards different floors of these buildings, to provide a beamforming gain wherever the user is in the coverage area. It is not necessary to have many antennas to perform 3D beamforming – it is basically enough to have three antennas deployed in a triangle. However, as more antennas are added, the beams become narrower and easier to jointly steer in specific azimuth-elevation directions. This increases the array gain and reduces the interference between beams directed to different users, as illustrated by the colors in Figure 3.
The detailed answer to the question “3D Beamforming, is that Massive MIMO?” is as follows. Massive MIMO and 3D beamforming are two different concepts. 3D beamforming can be performed with few antennas and Massive MIMO can be deployed to only perform 2D beamforming. However, Massive MIMO and 3D beamforming are a great combination in many applications; for example, to spatially multiplex many users in a city with high-rise buildings. One should also bear in mind that, in general, only a fraction of the users are located in line-of-sight so the formation of angular beams (as shown above) might be of limited importance. The ability to control the array’s radiation pattern in 3D is nonetheless helpful to control the multipath environment such that the many signal components add constructively at the location of the intended receiver.
]]>It looks to me now that two of these speculations were wrong:
Have you found any more? Let me know. The knowledge in the field continues to evolve.
]]>Each antenna type has a predefined radiation pattern, which describes its inherent directivity; that is, how the gain of the emitted signal differs in different angular directions. An ideal isotropic antenna has no directivity, but a practical antenna always has a certain directivity, measured in dBi. For example, a half-wavelength dipole antenna has 2.15 dBi, which means that there is one angular direction in which the emitted signal is 2.15 dB stronger than it would be with a corresponding isotropic antenna. On the other hand, there are other angular directions in which the emitted signal is weaker. This is not a problem as long as there will not be any receivers in those directions.
In cellular communications, we are used to deploying large vertical antenna panels that cover a 120 degree horizontal sector and have a strong directivity of 15 dBi or more. Such a panel is made up of many small radiating elements, each having a directivity of a few dBi. By feeding them with the same input signal, a higher dBi is achieved for the panel. For example, if the panel consists of 8 patch antenna elements, each having 7 dBi, then you get a 7+10·log_{10}(8) ≈ 16 dBi antenna.
The picture above shows a real LTE site that I found in Nanjing, China, a couple of years ago. Looking at it from above, the site is structured as illustrated to the right. The site consists of three sectors, each containing a base station with four vertical panels. If you looked inside one of the panels, you would (probably) find 8 cross-polarized vertically stacked radiating elements, as illustrated in Figure 1. There are two RF input signals per panel, one per polarization, thus each panel acts as two antennas. This is how LTE with 8TX-sectors is deployed: 4 panels with dual polarization per base station.
At the exemplified LTE site, there is a total of 8·8·3 = 192 radiating elements, but only 8·3 = 24 antennas. This disparity can lead to a lot of confusion. The Massive MIMO version of the exemplified LTE site may have the same form factor, but instead of 24 antennas with 16 dBi, you would have 192 antennas with 7 dBi. More precisely, you would connect each of the existing radiating elements to a separate RF input signal to create a larger number of antennas. Therefore, I suggest using the following antenna definition from the book Massive MIMO Networks:
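The panel arithmetic is easy to reproduce. The Python sketch below uses an illustrative function name and assumes ideal, lossless combining of identical elements.

```python
import math


def panel_gain_dbi(element_gain_dbi, num_elements):
    """Gain of a panel whose identical elements are fed with the same signal."""
    return element_gain_dbi + 10 * math.log10(num_elements)


print(round(panel_gain_dbi(7.0, 8), 2))  # 8 patch elements with 7 dBi each
```

Every doubling of the number of elements adds about 3 dB, so 8 elements add about 9 dB on top of the 7 dBi per element.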
Definition: An antenna consists of one or more radiating elements (e.g., dipoles) which are fed by the same RF signal. An antenna array is composed of multiple antennas with individual RF chains.
Note that, with this definition, an array that uses analog beamforming (e.g., a phased array) only constitutes one antenna. It is usually called an adaptive antenna since the radiation pattern can be changed over time, but it is nevertheless a single antenna. Massive MIMO for sub-6 GHz frequencies is all about adding RF chains (also known as antenna ports), while not necessarily adding more radiating elements than in a contemporary system.
What is the purpose of having more RF chains?
With more RF chains, you have more degrees of freedom to modify the radiation pattern of the transmitted signal based on where the receiver is located. When transmitting a precoded signal to a single user, you adjust the phases of the RF input signals to make them all combine constructively at the intended receiver.
The maximum antenna/array gain is the same when using one 16 dBi antenna and when using 8 antennas with 7 dBi. In the first case, the radiation pattern is usually static and thus only a line-of-sight user located in the center of the cell sector will obtain this gain. However, if the antenna is adaptive (i.e., supports analog beamforming), the main lobe of the radiation pattern can be also steered towards line-of-sight users located in other angular directions. This feature might be sufficient for supporting the intended single-user use-cases of mm-wave technology (see Figure 4 in this paper).
In contrast, in the second case, we can adjust the radiation pattern by 8-antenna precoding to deliver the maximum gain to any user in the sector. This feature is particularly important for non-line-of-sight users (e.g., indoor use-cases), for which the signals from the different radiating elements will likely be received with “random” phase shifts and therefore add non-constructively, unless we compensate for the phases by digital precoding.
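To illustrate the difference between the two cases, here is a toy Monte Carlo sketch in Python with NumPy. It assumes (my simplification, not a channel model from the text) that a non-line-of-sight user sees each of the 8 elements with unit magnitude and an independent, uniformly distributed phase.

```python
import numpy as np

rng = np.random.default_rng(0)
num_antennas, num_trials = 8, 10_000

# Toy non-line-of-sight channels: unit magnitude, random phase per element.
phases = rng.uniform(0.0, 2 * np.pi, size=(num_trials, num_antennas))
channel = np.exp(1j * phases)

# Case 1: the same signal is fed to all elements (one fixed 8-element antenna).
same_signal = np.ones(num_antennas) / np.sqrt(num_antennas)
gain_fixed = np.mean(np.abs(channel @ same_signal) ** 2)

# Case 2: digital precoding compensates for the phase of every element.
matched = channel.conj() / np.sqrt(num_antennas)
gain_precoded = np.mean(np.abs(np.sum(channel * matched, axis=1)) ** 2)

print(round(gain_fixed, 2), round(gain_precoded, 2))
```

With phase compensation, the received power is 8 times that of a single element in every realization. Without it, the average is about the same as with a single element.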
Note that most papers on Massive MIMO keep the antenna gain constant when comparing systems with different numbers of antennas. There is nothing wrong with doing that, but one cannot interpret the single-antenna case in such a study as a contemporary system.
Another, perhaps more important, feature of having multiple RF chains is that we can spatially multiplex several users when having multiple antennas. For this you need at least as many RF inputs as there are users. Each of them can get the full array gain and the digital precoding can also be used to avoid inter-user interference.
]]>Can we utilize the channel hardening to estimate the channels less frequently?
Unfortunately, the answer is no. Whenever you move approximately half a wavelength, the multi-path propagation will change each element of the channel vector. The time it takes to move such a distance is called a coherence time. This time is the same irrespective of how many antennas the base station has and, therefore, you still need to estimate the channel once per coherence time. The same applies to the frequency domain, where the coherence bandwidth is determined by the propagation environment and not the number of antennas.
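The half-wavelength rule gives a simple back-of-the-envelope estimate of the coherence time. The Python sketch below uses a carrier frequency and user speeds that are my own assumptions, chosen only for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def coherence_time_s(carrier_hz, speed_m_per_s):
    """Time to move half a wavelength; the number of antennas plays no role."""
    wavelength_m = SPEED_OF_LIGHT_M_PER_S / carrier_hz
    return (wavelength_m / 2) / speed_m_per_s


for label, speed in (("pedestrian", 1.5), ("vehicle", 30.0)):
    print(label, round(1e3 * coherence_time_s(3e9, speed), 2), "ms")
```

Note that the function has no argument for the number of base station antennas, which is the whole point.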
The following flow-chart shows what needs to happen in every channel coherence time:
When you get a new realization (at the top of the flow-chart), you compute an estimate (e.g., based on uplink pilots), then you use the estimate to compute a new receive combining vector and transmit precoding vector. It is when you have applied these vectors to the channel that the hardening phenomenon appears; that is, the randomness averages out. If you use maximum ratio (MR) processing, then the random realization h_{1} of the channel vector turns into an almost deterministic scalar channel ||h_{1}||^{2}. You can communicate over the hardened channel with gain ||h_{1}||^{2} until the end of the coherence time. You then start over again by estimating the new channel realization h_{2}, applying MR precoding/combining again, and then you get ||h_{2}||^{2} ≈ ||h_{1}||^{2}.
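The hardening effect is easy to see numerically. The following Python sketch assumes i.i.d. Rayleigh fading with unit variance per antenna (an assumption made here for simplicity) and looks at the channel gain after MR processing, normalized by the number of antennas so that different array sizes can be compared.

```python
import numpy as np


def normalized_channel_gains(num_antennas, num_realizations, rng):
    """Samples of ||h||^2 / M for i.i.d. unit-variance Rayleigh fading channels."""
    shape = (num_realizations, num_antennas)
    h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    return np.sum(np.abs(h) ** 2, axis=1) / num_antennas


rng = np.random.default_rng(0)
for num_antennas in (1, 10, 100):
    gains = normalized_channel_gains(num_antennas, 2000, rng)
    print(num_antennas, round(gains.mean(), 2), round(gains.std(), 2))
```

The mean stays close to one while the spread between independent realizations shrinks as antennas are added. Each realization still has to be estimated before MR processing can be applied to it.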
In conclusion, channel hardening appears after coherent combining/precoding has been applied. To maintain a hardened channel over time (and frequency), you need to estimate and update the combining/precoding as often as you would do for a single-antenna channel. If you don’t do that, you will gradually lose the array gain until the point where the channel and the combining/precoding are practically uncorrelated, so there is no array gain left. Hence, there is more to lose from estimating channels too infrequently in Massive MIMO systems than in conventional systems. This is shown in Fig. 10 in a recent measurement paper from Lund University, where you see how the array gain vanishes with time. However, the Massive MIMO system will never be worse than the corresponding single-antenna system.
]]>Since the spectral efficiency (bit/s/Hz) and many other performance metrics of interest depend on the SNR, and not the individual values of the three parameters, it is a common practice to normalize one or two of the parameters to unity. This habit makes it easier to interpret performance expressions, to select reasonable SNR ranges, and to avoid mistakes in analytical derivations.
There are, however, situations when the absolute value of the transmitted/received signal power matters, and not the relative value with respect to the noise power, as measured by the SNR. In these situations, it is easy to make mistakes if you use normalized parameters. I see this type of error far too often, both as a reviewer and in published papers. I will give some specific examples below, but I won’t tell you who has made these mistakes, to not point the finger at anyone specifically.
Wireless energy transfer
Electromagnetic radiation can be used to transfer energy to wireless receivers. In such wireless energy transfer, it is the received signal energy that is harvested by the receiver, not the SNR. Since the noise power is extremely small, the numerical value of the SNR is (at least) a billion times larger than the received signal power measured in watts. Hence, a normalization error can lead to crazy conclusions, such as being able to transfer energy at a rate of 1 W instead of 1 nW. The former is enough to keep a wireless transceiver on continuously, while the latter requires you to harvest energy for a long time period before you can turn the transceiver on for a brief moment.
Energy efficiency
The energy efficiency (EE) of a wireless transmission is measured in bit/Joule. The EE is computed as the ratio between the data rate (bit/s) and the power consumption (Watt=Joule/s). While the data rate depends on the SNR, the power consumption does not. The same SNR value can be achieved over a long propagation distance by using high transmit power or over a short distance by using a low transmit power. The EE will be widely different in these cases. If a “normalized transmit power” is used instead of the actual transmit power when computing the EE, one can get EEs that are one million times smaller than they should be. As a rule-of-thumb, if you compute things correctly, you will get EE numbers in the range of 10 kbit/Joule to 10 Mbit/Joule.
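A tiny Python sketch makes the point concrete. The data rate and the two power consumption values are hypothetical numbers that I picked for illustration; the only thing that matters is that both links have the same SNR, and thus the same rate, but different power consumption.

```python
def energy_efficiency_bit_per_joule(rate_bit_per_s, power_consumption_w):
    """EE as the ratio between the data rate and the actual power consumption."""
    return rate_bit_per_s / power_consumption_w


rate = 20e6  # same SNR and bandwidth on both links, hence the same rate
ee_long_link = energy_efficiency_bit_per_joule(rate, 20.0)  # high transmit power
ee_short_link = energy_efficiency_bit_per_joule(rate, 2.0)  # low transmit power
print(ee_long_link / 1e6, ee_short_link / 1e6, "Mbit/Joule")
```

Both values land inside the rule-of-thumb range, but they differ by a factor of ten even though the SNR is identical.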
Noise power depends on the bandwidth
The noise power is proportional to the communication bandwidth. When working with a normalized noise power, it is easy to forget that a given SNR value only applies for one particular value of the bandwidth.
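Here is a minimal Python sketch of this dependence. It assumes thermal noise at 290 K and ignores the receiver noise figure; the received power of 1 nW and the two bandwidths are arbitrary example values.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23


def thermal_noise_power_w(bandwidth_hz, temperature_k=290.0):
    """Thermal noise power kTB in watts (no receiver noise figure included)."""
    return BOLTZMANN_J_PER_K * temperature_k * bandwidth_hz


def snr_db(received_power_w, bandwidth_hz):
    """SNR in dB for a given received power and bandwidth."""
    return 10 * math.log10(received_power_w / thermal_noise_power_w(bandwidth_hz))


for bandwidth_hz in (1e6, 100e6):
    print(bandwidth_hz, thermal_noise_power_w(bandwidth_hz),
          round(snr_db(1e-9, bandwidth_hz), 1))
```

The same received power gives an SNR that is 20 dB lower when the bandwidth is 100 times larger, so an SNR value cannot be carried over from one bandwidth to another.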
Some papers normalize the noise variance and channel gain, but then make the SNR equal to the unnormalized transmit power (measured in W). This may greatly overestimate the SNR, but the achievable rates might still be in the reasonable range if you operate the system in an interference-limited regime.
Some papers contain an alternative EE definition where the spectral efficiency (bit/s/Hz) is divided by the power consumption (Joule/s). This leads to the alternative EE unit bit/Joule/Hz. This definition is not formally wrong, but gives the misleading impression that one can multiply the EE value with any choice of bandwidth to get the desired number of bit/Joule. That is not the case since the SNR only holds for one particular value of the bandwidth.
Knowing when to normalize
In summary, even if it is convenient to normalize system parameters in wireless communications, you should only do it if you understand when normalization is possible and when it is not. Otherwise, you can make embarrassing mistakes, such as submitting a paper where the results are six orders of magnitude wrong. And, unfortunately, there are several such papers that have been published and these create a vicious circle by tricking others into making the same mistakes.
]]>Massive MIMO in Sub-6 GHz and mmWave: Physical, Practical, and Use-Case Differences
]]>Christopher Mollén recently defended his doctoral thesis entitled High-End Performance with Low-End Hardware: Analysis of Massive MIMO Base Station Transceivers. In the following video, he explains the basics of how the non-linear distortion from Massive MIMO transceivers is radiated in space.
]]>While there is no theoretical upper limit on how spectrally efficient Massive MIMO can become when adding more antennas, we need to set some reasonable first goals. Currently, many companies are trying to implement analog beamforming in a cost-efficient manner. That will allow for narrow beamforming, but not spatial multiplexing.
By following the methodology in Section 3.3.3 in Fundamentals of Massive MIMO, a simple formula for the downlink spectral efficiency is:
(1)
where is the number of base-station antennas, is the number of spatially multiplexed users, is the quality of the channel estimates, and is the number of channel uses per channel coherence block. For simplicity, I have assumed the same pathloss for all the users. The variable is the nominal signal-to-noise ratio (SNR) of a user, achieved when . Eq. (1) is a rigorous lower bound on the sum capacity, achieved under the assumptions of maximum ratio precoding, i.i.d. Rayleigh fading channels, and equal power allocation. With better processing schemes, one can achieve substantially higher performance.
To get an even simpler formula, let us approximate (1) as
(2)
by assuming a large channel coherence and negligible noise.
What does the formula tell us?
If we increase while is fixed , we will observe a logarithmic improvement in spectral efficiency. This is what analog beamforming can achieve for and, hence, I am a bit concerned that the industry will be disappointed with the gains that they will obtain from such beamforming in 5G.
If we instead increase and jointly, so that stays constant, then the spectral efficiency will grow linearly with the number of users. Note that the same transmit power is divided between the users, but the power-reduction per user is compensated by increasing the array gain so that the performance per user remains the same.
The largest gains come from spatial multiplexing
To give some quantitative numbers, consider a baseline system with and that achieves 2 bit/s/Hz. If we increase the number of antennas to , the spectral efficiency will become 5.6 bit/s/Hz. This is the gain from beamforming. If we also increase the number of users to users, we will get 32 bit/s/Hz. This is the gain from spatial multiplexing. Clearly, the largest gains come from spatial multiplexing and adding many antennas is a necessary way to facilitate such multiplexing.
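The symbols and parameter values did not survive in this text, so the following Python sketch is a reconstruction under explicit assumptions. I assume that the approximation takes the form (number of users) · log2(1 + (channel estimation quality) · (number of antennas) / (number of users)). I also assume a channel estimation quality of 0.75, a baseline with 4 antennas and 1 user, and an upgrade to 64 antennas and 16 users. These values are chosen only because they reproduce the three spectral efficiencies quoted above.

```python
import math


def approx_sum_se(num_antennas, num_users, csi_quality):
    """Assumed large-coherence, noise-free approximation of the sum SE (bit/s/Hz)."""
    return num_users * math.log2(1 + csi_quality * num_antennas / num_users)


print(round(approx_sum_se(4, 1, 0.75), 1))    # assumed baseline
print(round(approx_sum_se(64, 1, 0.75), 1))   # more antennas: beamforming gain
print(round(approx_sum_se(64, 16, 0.75), 1))  # more users too: multiplexing gain
```

In the last step the ratio between antennas and users is the same as in the baseline, so the spectral efficiency per user is unchanged and the sum grows linearly with the number of users.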
This analysis has implicitly assumed full digital beamforming. An analog or hybrid beamforming approach may achieve most of the array gain for . However, although hybrid beamforming allows for spatial multiplexing, I believe that the gains will be substantially smaller than with full digital beamforming.
]]>The recipe is to compute the capacity bound, and depending on the code blocklength, add a dB or a few, to the required SNR. That gives the link performance prediction. The coding literature is full of empirical results, showing how far from capacity a code of a given block length is for the AWGN channel, and this gap is usually not extremely different for other channel models – although, one should always check this.
But there are three main caveats with this:
How far are the bounds from the actual capacity typically? Nobody knows, but there are good reasons to believe they are extremely close. Here (Figure 1) is a nice example that compares a decoder that uses the measured channel likelihood, instead of assuming a Gaussian (which is implied by the typical bounding techniques). From correspondence with one of the authors: “The dashed and solid lines are the lower bound obtained by Gaussianizing the interference, while the circles are the rate achievable by a decoder exploiting the non-Gaussianity of the interference, painfully computed through days-long Monte-Carlo. (This is not exactly the capacity, because the transmit signals here are Gaussian, so one could deviate from Gaussian signaling and possibly do slightly better — but the difference is imperceptible in all the experiments we’ve done.)”
Concerning Massive MIMO and its capacity bounds, I have long been met with arguments that these capacity formulas aren’t useful estimates of actual performance. But in fact, they are: In one simulation study we were less than one dB from the capacity bound by using QPSK and a standard LDPC code (albeit with fairly long blocks). This bound accounts for noise and channel estimation errors. Such examples are in Chapter 1 of Fundamentals of Massive MIMO, and also in the ten-myth paper:
(I wrote the simulation code, and can share it, in case anyone would want to reproduce the graphs.)
In summary, capacity bounds are sometimes done wrong, but when done right they give pretty good estimates of actual link performance with modern coding.
(With thanks to Angel Lozano for discussions.)
]]>Interestingly, some radio resource allocation problems that appear to have exponential complexity can be relaxed to a form that is much easier to solve – this is what I call “relax and conquer”. In optimization theory, relaxation means that you widen the set of permissible solutions to the problem, which in this context means that the discrete optimization variables are replaced with continuous optimization variables. In many cases, it is easier to solve optimization problems with variables that take values in continuous sets than problems with a mix of continuous and discrete variables.
A basic example of this principle arises when communicating over a single-user MIMO channel. To maximize the achievable rate, you first need to select how many data streams to spatially multiplex and then determine the precoding and power allocation for these data streams. This appears to be a mixed-integer optimization problem, but Telatar showed in his seminal paper that it can be solved by the water-filling algorithm. More precisely, you relax the problem by assuming that the maximum number of data streams are transmitted and then you let the solution to a convex optimization problem determine how many of the data streams are assigned non-zero power; this is the optimal number of data streams. Despite the relaxation, the global optimum to the original problem is obtained.
There are other, less known examples of the “relax and conquer” method. Some years ago, I came across the paper “Jointly optimal downlink beamforming and base station assignment”, which has received much less attention than it deserves. The UE-BS association problem, considered in this paper, is non-trivial since some BSs might have many more UEs in their vicinity than other BSs. Nevertheless, the paper shows that one can solve the problem by first relaxing it so that all BSs transmit to all the UEs. The author formulates a relaxed optimization problem where the beamforming vectors (including power allocation) are selected to satisfy each UE’s SINR constraint, while minimizing the total transmit power. This problem is solved by convex optimization and, importantly, the optimal solution is always such that each UE only receives a non-zero signal power from one of the BSs. Hence, the seemingly difficult combinatorial UE-BS association problem is relaxed to a convex optimization problem, which provides the optimal solution to the original problem!
I have reused this idea in several papers. The first example is “Massive MIMO and Small Cells: Improving Energy Efficiency by Optimal Soft-cell Coordination”, which considers a similar setup but with a maximum transmit power per BS. The consequence of including this practical constraint is that it might happen that some UEs are served by multiple BSs at the optimal solution. These BSs send different messages to the UE, which decodes them by successive interference cancelation, thus the solution is still practically achievable.
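To make the water-filling step concrete, here is a minimal Python sketch with NumPy. It assumes that the MIMO channel has already been decomposed into parallel streams, each described by an effective gain (squared singular value divided by the noise power), and that the gains and the power budget are strictly positive. The example gains are made up.

```python
import numpy as np


def water_filling(stream_gains, total_power):
    """Allocate total_power over parallel streams to maximize the sum rate."""
    gains = np.asarray(stream_gains, dtype=float)
    order = np.argsort(gains)[::-1]            # strongest stream first
    inverse_gains = 1.0 / gains[order]
    powers_sorted = np.zeros_like(inverse_gains)
    # Start from the relaxed problem with all streams active and drop the
    # weakest stream until every remaining stream gets positive power.
    for num_active in range(len(gains), 0, -1):
        water_level = (total_power + inverse_gains[:num_active].sum()) / num_active
        if water_level > inverse_gains[num_active - 1]:
            powers_sorted[:num_active] = water_level - inverse_gains[:num_active]
            break
    powers = np.zeros_like(gains)
    powers[order] = powers_sorted              # back to the original stream order
    return powers


powers = water_filling([2.0, 1.0, 0.1], total_power=1.0)
print(powers, "active streams:", np.count_nonzero(powers))
```

The number of data streams is never chosen explicitly. It falls out of the solution as the number of streams that receive non-zero power, here two out of three.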
One practical weakness with the two aforementioned papers is that they take small-scale fading realizations into account in the optimization, thus the problem must be solved once per coherence interval, requiring extremely high computational power. More recently, in the paper “Joint Power Allocation and User Association Optimization for Massive MIMO Systems“, we applied the same “relax and conquer” method to Massive MIMO, but targeting lower bounds on the downlink ergodic capacity. Since the capacity bounds are valid as long as the channel statistics are fixed (and the same UEs are active), our optimized BS-UE association can be utilized for a relatively long time period. This makes the proposed algorithm practically relevant, in contrast to the prior works that are more of academic interest.
Another example of the “relax and conquer” method is found in the paper “Joint Pilot Design and Uplink Power Allocation in Multi-Cell Massive MIMO Systems”. We consider the assignment of orthogonal pilot sequences to users, which appears to be a combinatorial problem. Instead of assigning a pilot sequence to each UE and then allocating power, we relax the problem by allowing each user to design its own pilot sequence, which is a linear combination of the original orthogonal sequences. Hence, a pair of UEs might have partially overlapping sequences, instead of either identical or orthogonal sequences (as in the original problem). The relaxed problem even allows for pilot contamination within a cell. The sequences are then optimized to maximize the max-min performance. The resulting problem is non-convex, but the combinatorial structure has been relaxed so that there are only optimization variables from continuous sets. A local optimum to the joint pilot assignment and power control problem is found with polynomial complexity, using standard methods from the optimization literature. The optimization might not lead to a set of orthogonal pilot sequences, but the solution is practically implementable and gives better performance.
]]>In most cases, the receiver only has imperfect CSI and then it is harder to measure the performance. In fact, it took me years to understand this properly. To explain the complications, consider the uplink of a single-cell Massive MIMO system with single-antenna users and antennas at the base station. The received -dimensional signal is
where is the unit-power information signal from user , is the fading channel from this user, and is unit-power additive Gaussian noise. In general, the base station will only have access to an imperfect estimate of , for
Suppose the base station uses to select a receive combining vector for user . The base station then multiplies it with to form a scalar that is supposed to resemble the information signal :
From this expression, a common mistake is to directly say that the SINR is
which is obtained by computing the power of each of the terms (averaged over the signal and noise), and then claim that is an achievable rate (where the expectation is with respect to the random channels). You can find this type of arguments in many papers, without proof of the information-theoretic achievability of this rate value. Clearly, is an SINR, in the sense that the numerator contains the total signal power and the denominator contains the interference power plus noise power. However, this doesn’t mean that you can plug into “Shannon’s capacity formula” and get something sensible. This will only yield a correct result when the receiver has perfect CSI.
A basic (but non-conclusive) test of the correctness of a rate expression is to check that the receiver can compute the expression based on its available information (i.e., estimates of random variables and deterministic quantities). Any expression containing fails this basic test since you need to know the exact channel realizations to compute it, although the receiver only has access to the estimates.
What is the right approach?
Remember that the SINR is not important by itself, but we should start from the performance metric of interest and then we might eventually interpret a part of the expression as an effective SINR. In Massive MIMO, we are usually interested in the ergodic capacity. Since the exact capacity is unknown, we look for rigorous lower bounds on the capacity. There are several bounding techniques to choose between, whereof I will describe the two most common ones.
The first lower bound on the uplink capacity can be applied when the channels are Gaussian distributed and are the MMSE estimates with the corresponding estimation error covariance matrices . The ergodic capacity of user is then lower bounded by
Note that this expression can be computed at the receiver using only the available channel estimates (and deterministic quantities). The ratio inside the logarithm can be interpreted as an effective SINR, in the sense that the rate is equivalent to that of a fading channel where the receiver has perfect CSI and an SNR equal to this effective SINR. A key difference from is that only the part of the desired signal that is received along the estimated channel appears in the numerator of the SINR, while the rest of the desired signal appears as in the denominator. This is the price to pay for having imperfect CSI at the receiver, according to this capacity bound, which has been used by Hoydis et al. and Ngo et al., among others.
The second lower bound on the uplink capacity is
which can be applied for any channel fading distribution. This bound provides a value close to when there is substantial channel hardening in the system, while will greatly underestimate the capacity when varies a lot between channel realizations. The reason is that to obtain this bound, the receiver detects the signal as if it is received over a non-fading channel with gain (which is deterministic and thus known in theory, and easy to measure in practice), but there are no approximations involved so is always a valid bound.
Since all the terms in are deterministic, the receiver can clearly compute it using its available information. The main merit of is that the expectations in the numerator and denominator can sometimes be computed in closed form; for example, when using maximum-ratio and zero-forcing combining with i.i.d. Rayleigh fading channels or maximum-ratio combining with correlated Rayleigh fading. Two early works that used this bound are by Marzetta and by Jose et al..
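As a rough numerical illustration of the second bound, here is a minimal Monte Carlo sketch in plain Python. It rests on several assumptions that are not taken from the expressions in this post: a single cell, i.i.d. Rayleigh fading with unit variance, orthogonal pilots, maximum-ratio combining based on the channel estimate, and a per-antenna estimate variance `gamma` (so the estimation error has variance `1 - gamma`). The function names, the parameter values, and the exact SINR form (squared mean effective gain in the numerator, everything else treated as noise) are illustrative choices.

```python
import math
import random


def cn(var, rng):
    """Draw one sample from a circularly symmetric complex Gaussian CN(0, var)."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def simulated_sinr(M, K, p, sigma2, gamma, trials=2000, seed=1):
    """Estimate the effective SINR of user 0 by averaging over channel realizations."""
    rng = random.Random(seed)
    mean_gain = 0j      # estimate of E{v^H h_0}
    sum_power = 0.0     # estimate of sum_i E{|v^H h_i|^2}
    mean_norm = 0.0     # estimate of E{||v||^2}
    for _ in range(trials):
        # Channel estimates and true channels (estimate + independent error).
        h_hat = [[cn(gamma, rng) for _ in range(M)] for _ in range(K)]
        h = [[est + cn(1.0 - gamma, rng) for est in h_hat[i]] for i in range(K)]
        v = h_hat[0]    # maximum-ratio combining vector for user 0
        gains = [sum(vm.conjugate() * hm for vm, hm in zip(v, h[i]))
                 for i in range(K)]
        mean_gain += gains[0] / trials
        sum_power += sum(abs(g) ** 2 for g in gains) / trials
        mean_norm += sum(abs(vm) ** 2 for vm in v) / trials
    signal = p * abs(mean_gain) ** 2
    interference_plus_noise = p * sum_power - signal + sigma2 * mean_norm
    return signal / interference_plus_noise


def closed_form_sinr(M, K, p, sigma2, gamma):
    """Closed form under the same assumptions: p*M*gamma / (p*K + sigma2)."""
    return p * M * gamma / (p * K + sigma2)


# Hypothetical example values.
mc_sinr = simulated_sinr(M=16, K=4, p=1.0, sigma2=1.0, gamma=0.8)
cf_sinr = closed_form_sinr(M=16, K=4, p=1.0, sigma2=1.0, gamma=0.8)
print("simulated SINR:", mc_sinr, "rate:", math.log2(1.0 + mc_sinr))
print("closed-form SINR:", cf_sinr, "rate:", math.log2(1.0 + cf_sinr))
```

Every quantity in the SINR is an average over channel realizations, so nothing in it depends on the exact channel realization, which is what the basic test described earlier requires.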
The two uplink rate expressions can be proved using capacity bounding techniques that have been floating around in the literature for more than a decade; the main principle for computing capacity bounds for the case when the receiver has imperfect CSI is found in a paper by Medard from 2000. The first concise description of both bounds (including all the necessary conditions for using them) is found in Fundamentals of Massive MIMO. The expressions that are presented above can be found in Section 4 of the new book Massive MIMO Networks. In these two books, you can also find the right ways to compute rigorous lower bounds on the downlink capacity in Massive MIMO.
In conclusion, to avoid mistakes, always start with rigorously computing the performance metric of interest. If you are interested in the ergodic capacity, then you start from one of the canonical capacity bounds in the above-mentioned books and verify that all the required conditions are satisfied. Then you may interpret part of the expression as an SINR.
]]>These arguments, it turned out, all proved to be wrong. In 2017, Massive MIMO was the main physical-layer technology under standardization for 5G, and it is unlikely that any serious future cellular wireless communications system would not have Massive MIMO as a main technology component.
But Massive MIMO is more than a groundbreaking technology for wireless communications: it is also an elegant and mathematically rigorous approach to teaching wireless communications. In the moderately-large number-of-antennas regime, our closed-form capacity bounds become convenient proxies for the link performance achievable with practical coding and modulation.
These expressions take into account the effects of all significant physical phenomena: small-scale and large-scale fading, intra- and inter-cell interference, channel estimation errors, pilot reuse (also known as pilot contamination) and power control. A comprehensive analytical understanding of these phenomena simply has not been possible before, as the corresponding information theory has been too complicated for any practical use.
The intended audiences of Fundamentals of Massive MIMO are engineers and students. I anticipate that as graduate courses on the topic become commonplace, our extensive problem set (with solutions) available online will serve as a useful resource to instructors. While other books and monographs will likely appear down the road, focusing on trendier and more recent research, Fundamentals of Massive MIMO distills the theory and facts that will prevail for the foreseeable future. This, I hope, will become its most lasting impact.
To read the preface of Fundamentals of Massive MIMO, click here. You can also purchase the book here.
]]>The full title of my webinar is Massive MIMO for 5G below 6 GHz: Achieving Spectral Efficiency, Link Reliability, and Low-Power Operation. I will cover the basics of Massive MIMO and explain how the technology is not only great for enhancing the broadband access, but also for delivering the link reliability and low-power operation required by the internet of things. I have made sure that the overlap with my previous webinar is small.
If you watch the webinar live, you will have the chance to ask questions. Otherwise, you can view the recording of the webinar afterward. All the webinars in the IEEE 5G Webinar Series are available for anyone to view.
As a final note, I wrote a guest blog post at IEEE ComSoc Technology News in late December. It follows up on my previous blog post about GLOBECOM and is called: The Birth of 5G: What to do next?
]]>
There are many foreseen applications that involve a large number of drones in a limited area such as disaster management, traffic monitoring, crowd management, and crop monitoring. The major communication requirements of most of the drone networks are: several tens of Mbps throughput for streaming high-resolution video, low latency for command and control, highly reliable connectivity in a three-dimensional coverage area, high-mobility support, and simultaneous support for a large number of drones.
The existing wireless systems are unsuitable for communicating with a large number of drones in long-range, high throughput, and high-altitude applications for the following reasons:
For the above-mentioned reasons, instead of borrowing from existing wireless technologies, it would be better to develop a new technology, considering the specific drone networks’ requirements and propagation characteristics. As of now, spectrum allocation and standardization efforts for drone communication networks are in the initial stage of development. This is where Massive MIMO can play a key role. The attractive features of Massive MIMO, such as spatial multiplexing and range extension, can be exploited to design flexible and efficient drone communication systems. 5G is based on the concept of network slicing, where the network can be configured differently depending on the use case. Therefore, it is possible to deploy a variation of 5G for drone communications along with appropriately tilted antenna arrays to provide connectivity to the drones flying at high altitudes.
In our recent papers (1 and 2), we illustrated the use of Massive MIMO for drone communications. From these papers, we make the following observations:
Below are some examples of use cases of Massive MIMO enabled drone communication systems. The technical details of Massive MIMO based system design can be found in this paper. The Massive MIMO design parameters for some of the use cases can be found in this paper.
Drone racing: In recent years, drone racing, also called “the sport of the future”, has become popular around the world. In drone racing, low latency is important for drone control, because even a delay of a few tens of milliseconds might crash the drone when it moves at a speed of 40-50 m/s. Interestingly, in our digital world, analog transmission is used for sending videos from racing drones to the pilots. The reason is that, unlike digital transmission, analog transmission does not incur any processing delay and the overall latency is only about 15 ms. Currently, the 5.8 GHz band (5650 MHz to 5925 MHz) is used for drone racing. The transmitter and receiver use frequency modulation, which requires 40 MHz frequency separation to avoid crosstalk between neighboring channels. As a result, the number of simultaneous drones in a contest is limited to eight. The video quality is also poor. By using Massive MIMO, several tens of drones can simultaneously participate in a contest and the pilots can enjoy low-latency, high-quality video transmission.
Sports streaming: Utilizing drones for sports streaming will change the way we view sports events. High-resolution 4K 360-degree videos taken by multiple drones at different angles can be broadcast to give the viewers an entirely new experience. If there are 20 drones covering a sports event, the required sum throughput will be on the order of 10 Gbps. Massive MIMO in the mm-wave frequency band can be used to achieve this high throughput. This can become reality, as there are already signs of drones being used for covering sports events. For instance, during the 2018 Winter Olympics, drones will be extensively used.
Surveillance/Search and Rescue/Disaster management: During natural disasters, a network of drones can be quickly deployed to enable the rescue teams to assess the situation in real time via high-resolution video streaming. Depending on the area to be covered and the desired video quality, the sum throughput requirement will be on the order of Gbps. A Massive MIMO array deployed on a ground vehicle or a large aerial vehicle can be used for serving a swarm of drones.
Aerial survey: A swarm of drones can be used for high-resolution aerial imagery of several kilometers of landscape. There are many uses of aerial survey, including state governance, city planning, 3D cartography, and crop monitoring. Massive MIMO can be an enabler for such high throughput and long-range applications.
Backhaul for flying base stations: During emergency situations and heavy traffic conditions, UAVs could be used as flying base stations to provide wireless connectivity to the cellular users. A Massive MIMO base station can act as a high-capacity backhaul to a large number of flying base stations.
Space exploration: Currently, it takes several hours to receive a photo taken by the Curiosity Mars rover. It is possible to use Massive MIMO to reduce the overall transmission delay. For example, by using a massive antenna array deployed in an orbiter (see the above figure), a swarm of drones and rovers roaming on the surface of another planet can send videos and images to Earth. The array can be used to spatially multiplex the uplink transmission from the drones (and possibly the rovers) to the orbiter. Note that the distance between the Mars surface and the orbiter is about 400 km. If the drones fly at an altitude of a few hundred meters and spread out over a region with a radius of a few hundred kilometers, the angular resolution of the array is sufficient for spatial multiplexing. The array can then be used to transmit the collected images and videos to Earth by exploiting the array gain. This might sound like science fiction, but NASA is already developing a 256-element antenna array for future Mars rovers to enable direct communication with Earth.
]]>I attended GLOBECOM in Singapore earlier this week. Since more and more preprints are posted online before conferences, one of the unique features of conferences is to meet other researchers and attend the invited talks and interactive panel discussions. This year I attended the panel “Massive MIMO – Challenges on the Path to Deployment”, which was organized by Ian Wong (National Instruments). The panelists were Amitava Ghosh (Nokia), Erik G. Larsson (Linköping University), Ali Yazdan (Facebook), Raghu Rao (Xilinx), and Shugong Xu (Shanghai University).
No common definition
The first discussion item was the definition of Massive MIMO. While everyone agreed that the main characteristic is that the number of controllable antenna elements is much larger than the number of spatially multiplexed users, the panelists put forward different additional requirements. The industry prefers to call everything with at least 32 antennas Massive MIMO, irrespective of whether the beamforming is constructed from codebook-based feedback, grid-of-beams, or by exploiting uplink pilots and TDD reciprocity. This demonstrates that Massive MIMO is becoming a marketing term, rather than a well-defined technology. In contrast, academic researchers often have more restrictive definitions; Larsson suggested specifically including the TDD reciprocity approach in the definition. This is because it is the robust and overhead-efficient way to acquire channel state information (CSI), particularly for non-line-of-sight users; see Myth 3 in our magazine paper. This narrow definition clearly rules out FDD operation, as pointed out by a member of the audience. Personally, I think that any multi-user MIMO implementation that provides performance similar to the TDD-reciprocity-based approach deserves the Massive MIMO branding, but we should not let marketing people use the name for any implementation just because it has many antennas.
Important use cases
The primary use cases for Massive MIMO in sub-6 GHz bands are to improve coverage and spectral efficiency, according to the panel. Great improvements in spectral efficiency have been demonstrated by prototyping, but the panelists agreed that these should be seen as upper bounds. We should not expect to see more than 4x improvements over LTE in the first deployments, according to Ghosh. Larger gains are expected in later releases, but there will continue to be a substantial gap between the average spectral efficiency observed in real cellular networks and the peak spectral efficiency demonstrated by prototypes. Since Massive MIMO achieves its main spectral efficiency gains by multiplexing of users, we might not need a full-blown Massive MIMO implementation today, when there are only one or two simultaneously active users in most cells. However, the networks need to evolve over time as the number of active users per cell grows.
In mmWave bands, the panel agreed that Massive MIMO is mainly for extending coverage. The first large-scale deployments of Massive MIMO will likely aim at delivering fixed wireless broadband access and this must be done in the mmWave bands; there is too little bandwidth in sub-6 GHz bands to deliver data rates that can compete with wired DSL technology.
Initial cost considerations
The deployment cost is a key factor that will limit the first generations of Massive MIMO networks. Despite all the theoretical research that has demonstrated that each antenna branch can be built using low-resolution hardware when there are many antennas, one should not forget the higher out-of-band radiation that it can lead to. We need to comply with the spectral emission masks – spectrum is incredibly expensive so a licensee cannot accept interference from adjacent bands. For this reason, several panelists from the industry expressed the view that we need to use similar hardware components in Massive MIMO as in contemporary base stations and, therefore, the hardware cost grows linearly with the number of antennas. On the other hand, Larsson pointed out that the futuristic devices that you could see in James Bond movies 10 years ago can now be bought for $100 in any electronics store; hence, when the technology evolves and the economy of scale kicks in, the cost per antenna should not be more than in a smartphone.
A related debate is the one between analog and digital beamforming. Several panelists said that analog and hybrid approaches will be used to cut cost in the first deployments. To rely on analog technology is somewhat weird in an age when everything is becoming digital, but Yazdan pointed out that it is only a temporary solution. The long-term vision is to do fully digital beamforming, even in mmWave bands.
Another implementation challenge that was discussed is the acquisition of CSI for mobile users. This is often brought up as a showstopper since hybrid beamforming methods have such difficulties – it is like looking at a running person through binoculars and trying to follow the movement. This is a challenging issue for any radio technology, but if you rely on uplink pilots for CSI acquisition, it will not be harder than in a system of today. This has also been demonstrated by measurements.
Open problems
The panel was asked to describe the most important open problems in the Massive MIMO area, from a deployment perspective. One obvious issue, which we called the “grand question” in a previous paper, is to provide better support for Massive MIMO in FDD.
The control plane and MAC layer deserve more attention, according to Larsson. Basic functionalities such as ACK/NACK feedback are often ignored by academia, but are incredibly important in practice.
The design of “cell-free” densely distributed Massive MIMO systems also deserves further attention. Connecting all existing antennas together to perform joint transmission seems to be the ultimate approach to wireless networks. Although there is no practical implementation yet, Yazdan stressed that deploying such networks might actually be more practical than it seems, given the growing interest in C-RAN technology.
10 years from now
I asked the panel what the status of Massive MIMO will be 10 years from now. Rao predicted that we will have Massive MIMO everywhere, just as all access points support small-scale MIMO today. Yazdan believed that the different radio technologies (e.g., WiFi, LTE, NR) will converge into one interconnected system, which also allows operators to share hardware. Larsson thinks that over the next decade many more people will have understood the fundamental benefits of utilizing TDD and channel reciprocity, which will have a profound impact on the regulations and spectrum allocation.
]]>Unfortunately, there was not enough time for me to answer all the questions that I received, so I had to answer many of them afterwards. I have gathered ten questions and my answers below. I can also announce that I will give another Massive MIMO webinar in January 2018 and it will also be followed by a Q/A session.
1. What are the differences between 4G and 5G that will affect how Massive MIMO can be implemented?
The channel estimation must be implemented in the right way (i.e., exploiting uplink pilots and channel reciprocity) to obtain sufficiently accurate channel state information (CSI) to perform spatial multiplexing of many users; otherwise, the inter-user interference will eliminate most of the gains. Accurate CSI is hard to achieve within the 4G standard, although there are several Massive MIMO field trials for TDD LTE that show promising results. However, if 5G is designed properly, it will support Massive MIMO from scratch, while in 4G it will always be an add-on that must adhere to the existing air interface.
2. How easy is it to deploy MIMO antennas on the current infrastructure?
Generally speaking, we can reuse the current infrastructure when deploying Massive MIMO, which is why operators show much interest in the technology. You upgrade the radio base stations but keep the same backhaul infrastructure and core network. However, since Massive MIMO supports much higher data rates, some of the backhaul connections might also need to be upgraded to deliver these rates.
3. What are the most suitable channel models for Massive MIMO?
I recommend the channel model that was developed in the MAMMOET project. It is a refinement of the COST 2100 model that takes particular phenomena of having large antenna arrays into account. Check out Deliverable D1.2 from that project.
4. For planar arrays, what is the height to width ratio that gives the highest performance?
You typically need more antennas in the horizontal direction (width) than in the vertical direction (height), because the angular variations between users are larger in the horizontal domain. For example, the array might cover a horizontal sector of 120-180 degrees, while the users’ elevation angles might only differ by a few tens of degrees. This is the reason that 8-antenna LTE base stations use linear arrays in the horizontal direction.
There is no optimal answer to the question. It depends on the deployment scenario. If you have high-rise buildings, users at different floors can have rather different elevation angles (it can differ up to 90 degrees) and you can benefit more from having many antennas in the vertical direction. If all users have almost the same elevation angle, it is preferable to have many antennas in the horizontal direction. These things are further discussed in Sections 7.3 and 7.4 in my new book.
5. What are the difficulties we face in deploying Massive MIMO in FDD systems?
The difficulty is to acquire channel state information at the base station for the frequency band used in the downlink, since it is very resource-demanding to send downlink pilots from a large array; particularly, if you want to spatially multiplex many users. This is an important but challenging problem that researchers have been working on since the 1990s. You can read more about it in Myth 3 and the grand question in the paper Massive MIMO: ten myths and one grand question.
6. Do you believe that there is a value in coordinated resource allocation schemes for Massive MIMO?
Yes, but the resource allocation in Massive MIMO is different from conventional systems. Scheduling might not be so important, since you can multiplex many users spatially, but pilot assignment and power allocation are important aspects that must be addressed. I refer to these things as spatial resource allocation. You can read more about this in Sections 7.1 and 7.2 in my new book, but as you can see from those sections, there are many open problems to be solved.
7. What is channel hardening and what implications does it have on the frequency allocation (in OFDMA networks, for example)?
Channel hardening means that the effective channel after beamforming is almost constant so that the communication link behaves as if there is no small-scale fading. A consequence is that all frequency subcarriers provide almost the same channel quality to a user. Regarding channel assignment, since you can multiplex many tens of users spatially in Massive MIMO, you can assign the entire bandwidth (all subcarriers) to every user; there is no need to use OFDMA to allocate orthogonal frequency resources to the users.
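Channel hardening can be illustrated with a small simulation. The sketch below is only a toy model under assumed conditions (i.i.d. Rayleigh fading with unit variance per antenna, and the normalized channel gain ||h||^2 / M used as a stand-in for the effective channel after beamforming); the function name and the values of M are hypothetical. Under these assumptions the normalized gain has mean 1 and variance 1/M, so its fluctuations shrink as antennas are added.

```python
import math
import random
import statistics


def normalized_gains(M, trials, rng):
    """Samples of ||h||^2 / M for h with i.i.d. CN(0, 1) entries (assumed model)."""
    s = math.sqrt(0.5)  # standard deviation of the real and imaginary parts
    samples = []
    for _ in range(trials):
        power = sum(rng.gauss(0.0, s) ** 2 + rng.gauss(0.0, s) ** 2
                    for _ in range(M))
        samples.append(power / M)
    return samples


rng = random.Random(0)
var_single = statistics.pvariance(normalized_gains(1, 2000, rng))
var_massive = statistics.pvariance(normalized_gains(100, 2000, rng))
print("variance with M = 1:  ", var_single)
print("variance with M = 100:", var_massive)
```

With M = 100 the gain stays within a few percent of its mean in this model, which is why the link looks almost non-fading.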
8. Is it practical to estimate the channel for each subcarrier in an OFDM system?
To limit the pilot overhead, you typically place pilots only on a small subset of the subcarriers. The distance between the pilots in the frequency domain can be selected based on how frequency-selective the channels are; if a user has L strong channel taps, it is sufficient to send pilots on L subcarriers, even if you have many more subcarriers than that. Based on the received pilot signals, one can either estimate the channels on every subcarrier or estimate the channels on some of them and interpolate to get estimates on the remaining subcarriers.
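The claim that L pilot subcarriers suffice for L channel taps can be checked with a short noise-free sketch. The setup is an assumption made for illustration: N = 64 subcarriers, L = 4 taps, pilots on every (N/L)-th subcarrier, and no noise, so the recovery is exact. The helper names are hypothetical. With equally spaced pilots, the pilot observations form an L-point DFT of the taps, so an inverse L-point DFT recovers the taps, and a zero-padded N-point DFT then gives the channel on every subcarrier.

```python
import cmath
import random


def dft(x, n_out):
    """n_out-point DFT of the sequence x (implicitly zero-padded)."""
    return [sum(x[l] * cmath.exp(-2j * cmath.pi * k * l / n_out)
                for l in range(len(x)))
            for k in range(n_out)]


def taps_from_pilots(pilot_values):
    """Inverse L-point DFT: recovers L taps from L equally spaced pilots."""
    L = len(pilot_values)
    return [sum(pilot_values[m] * cmath.exp(2j * cmath.pi * m * l / L)
                for m in range(L)) / L
            for l in range(L)]


N, L = 64, 4                       # assumed numbers of subcarriers and taps
rng = random.Random(0)
h_true = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(L)]
H_true = dft(h_true, N)            # true channel on all N subcarriers

pilot_indices = range(0, N, N // L)          # subcarriers 0, 16, 32, 48
pilot_values = [H_true[k] for k in pilot_indices]

h_est = taps_from_pilots(pilot_values)
H_est = dft(h_est, N)              # interpolated channel on all subcarriers
max_error = max(abs(a - b) for a, b in zip(H_true, H_est))
print("largest interpolation error:", max_error)
```

With noise, the same structure still applies, but the estimates become approximate and more pilots or averaging help.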
9. How sensitive are the Massive MIMO spectral efficiency gains to TDD frame synchronization?
If you consider an OFDM system, then timing synchronization mismatches that are smaller than the cyclic prefix can basically be ignored. This is the case in TDD LTE systems and will not change when considering Massive MIMO systems that are implemented using OFDM. However, the synchronization across cells will not be perfect. The implications are investigated in a recent paper.
10. How does the higher computational complexity and delay in Massive MIMO processing affect the system performance?
I used to think that the computational complexity would be a bottleneck, but it turns out that it is not a big deal since all of the operations are standard (i.e., matrix multiplications and matrix inversions). For example, the circuit that was developed at Lund University shows that MIMO detection and precoding for a 20 MHz channel can be implemented very efficiently and only consumes a few mW.
]]>Other Massive MIMO videos can be found on our Youtube channel.
]]>In November, the upcoming Massive MIMO webinars are:
Massive MIMO for 5G: How Big Can it Get? by Emil Björnson (Linköping University), Thursday, 9 November 2017, 3:00 PM EST, 12:00 PM PST, 20:00 GMT.
Real-time Prototyping of Massive MIMO: From Theory to Reality by Douglas Kim (NI) and Fredrik Tufvesson (Lund University), Wednesday, 15 November 2017, 12:00 PM EST, 9:00 AM PST, 17:00 GMT.
]]>Until recently, a more rigorous analysis was unavailable. Some weeks ago, the authors of this paper argued that, all things considered, the use of superimposed pilots does not offer any appreciable gains for practically interesting use cases. The analysis was based on a capacity-bounding approach for finite numbers of antennas and finite channel coherence, but it assumed the most basic form of signal processing for detection and decoding.
There still remains some hope of seeing improvements, by implementing more advanced signal processing, like zero-forcing, multicell MMSE decoding, or iterative decoding algorithms, perhaps involving “turbo” information exchange between the demodulator, channel estimation, and detector. It will be interesting to follow future work by these two groups of authors to understand how large improvements (if any) superimposed pilots eventually can give.
There are at least two general lessons to learn here. First, performance predictions based on asymptotics can be misleading in practically relevant cases. (I have discussed this issue before.) The best way to perform analysis is to use rigorous capacity lower bounds, or possibly, in isolated cases of interest, link-level simulations with channel coding (for which, as it turns out, capacity bounds are a very good proxy). Second, more concretely, while it may be tempting to squeeze multiple superimposed symbols into the same time-frequency-space resource, once all sources of impairments (channel estimation errors, interference) are accurately accounted for, the gains tend to evaporate. (It is for the same reason that NOMA offers no substantial gains in MIMO systems – a topic that I may return to at a later time.)
]]>I sometimes get the question “Isn’t Massive MIMO just MU-MIMO with more antennas?” My answer is no, because the key benefit of Massive MIMO over conventional MU-MIMO is not only about the number of antennas. Marzetta’s Massive MIMO concept is the way to deliver the theoretical gains of MU-MIMO under practical circumstances. To achieve this goal, we need to acquire accurate channel state information, which in general can only be done by exploiting uplink pilots and channel reciprocity in TDD mode. Thanks to the channel hardening and favorable propagation phenomena, one can also simplify the system operation in Massive MIMO.
Six key differences between conventional MU-MIMO and Massive MIMO are provided below.
Aspect | Conventional MU-MIMO | Massive MIMO |
Relation between number of BS antennas (M) and users (K) | M ≈ K and both are small (e.g., below 10) | M ≫ K and both can be large (e.g., M=100 and K=20). |
Duplexing mode | Designed to work with both TDD and FDD operation | Designed for TDD operation to exploit channel reciprocity |
Channel acquisition | Mainly based on codebooks with a set of predefined angular beams | Based on sending uplink pilots and exploiting channel reciprocity |
Link quality after precoding/combining | Varies over time and frequency, due to frequency-selective and small-scale fading | Almost no variations over time and frequency, thanks to channel hardening |
Resource allocation | The allocation must change rapidly to account for channel quality variations | The allocation can be planned in advance since the channel quality varies slowly |
Cell-edge performance | Only good if the BSs cooperate | Cell-edge SNR increases proportionally to the number of antennas, without causing more inter-cell interference |
Footnote: TDD stands for time-division duplex and FDD stands for frequency-division duplex.
]]>One answer is that beamforming and precoding are two words for exactly the same thing, namely to use an antenna array to transmit one or multiple spatially directive signals.
Another answer is that beamforming can be divided into two categories: analog and digital beamforming. In the former category, the same signal is fed to each antenna and then analog phase-shifters are used to steer the signal emitted by the array. This is what a phased array would do. In the latter category, different signals are designed for each antenna in the digital domain. This allows for greater flexibility since one can assign different powers and phases to different antennas and also to different parts of the frequency band (e.g., subcarriers). This makes digital beamforming particularly desirable for spatial multiplexing, where we want to transmit a superposition of signals, each with a separate directivity. It is also beneficial when having a wide bandwidth because with fixed phases the signal will get a different directivity in different parts of the band. The second answer to the question is that precoding is equivalent to digital beamforming. Some people only mean analog beamforming when they say beamforming, while others use the terminology for both categories.
A third answer is that beamforming refers to a single-user transmission with one data stream, such that the transmitted signal consists of one main-lobe and some undesired side-lobes. In contrast, precoding refers to the superposition of multiple beams for spatial multiplexing of several data streams.
A fourth answer is that beamforming refers to the formation of a beam in a particular angular direction, while precoding refers to any type of transmission from an antenna array. This definition essentially limits the use of beamforming to line-of-sight (LoS) communications, because when transmitting to a non-line-of-sight (NLoS) user, the transmitted signal might not have a clear angular directivity. The emitted signal is instead matched to the multipath propagation so that the multipath components that reach the user add constructively.
A fifth answer is that precoding consists of two parts: choosing the directivity (beamforming) and choosing the transmit power (power allocation).
I used to use the word beamforming in its widest meaning (i.e., the first answer), as can be seen in my first book on the topic. However, I have since noticed that some people have a narrower or more specific interpretation of beamforming. Therefore, I nowadays prefer only talking about precoding. In Massive MIMO, I think that precoding is the right word to use since what I advocate is a fully digital implementation, where the phases and powers can be jointly designed to achieve high capacity through spatial multiplexing of many users, in both NLoS and LoS scenarios.
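The remark in the second answer, that fixed phases give a different directivity in different parts of a wide band, can be illustrated with a small sketch. Everything specific here is an assumption chosen for illustration: a uniform linear array with M = 64 antennas, half-wavelength spacing at the carrier frequency, a line-of-sight user at 30 degrees, and a subcarrier 5% above the carrier. The function names are hypothetical. The phase shifters are set once for the carrier frequency, as in analog beamforming, and the gain toward the user is then evaluated at both frequencies.

```python
import cmath
import math


def array_response(M, angle_rad, freq_ratio=1.0):
    """Response of a uniform linear array with half-wavelength spacing at the
    carrier; freq_ratio is the actual frequency divided by the carrier."""
    return [cmath.exp(-1j * math.pi * freq_ratio * n * math.sin(angle_rad))
            for n in range(M)]


def beam_gain(weights, angle_rad, freq_ratio=1.0):
    """Power gain |w^H a|^2 toward angle_rad at the given frequency ratio."""
    a = array_response(len(weights), angle_rad, freq_ratio)
    return abs(sum(w.conjugate() * x for w, x in zip(weights, a))) ** 2


M = 64
user_angle = math.radians(30.0)

# Fixed phase shifts matched to the user direction at the carrier frequency,
# normalized so that the total transmit power is one.
weights = [x / math.sqrt(M) for x in array_response(M, user_angle)]

gain_at_carrier = beam_gain(weights, user_angle, freq_ratio=1.0)
gain_off_carrier = beam_gain(weights, user_angle, freq_ratio=1.05)
print("gain at the carrier frequency:", gain_at_carrier)
print("gain 5% above the carrier:    ", gain_off_carrier)
```

At the carrier frequency the gain equals M, the full array gain. Away from the carrier, the same phases point the beam slightly off target, so the gain toward the user drops. A digital implementation can choose separate weights per subcarrier and avoid this loss.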
]]>The sub-6 GHz spectrum is particularly useful to provide network coverage, since the pathloss and channel coherence time are relatively favorable at such frequencies (recall that the coherence time is inversely proportional to the carrier frequency). Massive MIMO at sub-6 GHz spectrum can increase the efficiency of highly loaded cells, by upgrading the technology at existing base stations. In contrast, the huge available bandwidths in mmWave bands can be utilized for high-capacity services, but only over short distances due to the severe pathloss and high noise power (which is proportional to the bandwidth). Massive MIMO in mmWave bands can thus be used to improve the link budget.
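To put a number on the statement that the noise power is proportional to the bandwidth, here is a minimal sketch. It uses the standard thermal noise density of about -174 dBm/Hz at room temperature. The two bandwidths (20 MHz and 1 GHz) are assumed examples, and receiver noise figures are ignored unless passed in explicitly.

```python
import math


def thermal_noise_dbm(bandwidth_hz, noise_figure_db=0.0):
    """Noise power in dBm: -174 dBm/Hz plus 10*log10(bandwidth) plus noise figure."""
    return -174.0 + 10.0 * math.log10(bandwidth_hz) + noise_figure_db


noise_sub6 = thermal_noise_dbm(20e6)     # assumed 20 MHz sub-6 GHz carrier
noise_mmwave = thermal_noise_dbm(1e9)    # assumed 1 GHz mmWave carrier
print("noise power over 20 MHz (dBm):", noise_sub6)
print("noise power over 1 GHz (dBm): ", noise_mmwave)
print("difference (dB):", noise_mmwave - noise_sub6)
```

A 50 times wider band means 50 times more noise power, which is roughly 17 dB that the link budget has to recover, for example through the array gain.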
Six key differences between sub-6 GHz and mmWave operation are provided below:
Aspect | Sub-6 GHz | mmWave |
Deployment scenario | Macro cells with support for high user mobility | Small cells with low user mobility |
Number of simultaneous users per cell | Up to tens of users, due to the large coverage area | One or a few users, due to the small coverage area |
Main benefit from having many antennas | Spatial multiplexing of tens of users, since the array gain and ability to separate users spatially lead to great spectral efficiency | Beamforming to a single user, which greatly improves the link budget and thereby extends coverage |
Channel characteristics | Rich multipath propagation | Only a few propagation paths |
Spectral efficiency and bandwidth | High spectral efficiency due to the spatial multiplexing, but small bandwidth | Low spectral efficiency due to few users, large pathloss, and large noise power, but large bandwidth |
Transceiver hardware | Fully digital transceiver implementations are feasible and have been prototyped | Hybrid analog-digital transceiver implementations are needed, at least in the first products |
Since Massive MIMO was initially proposed by Tom Marzetta for sub-6 GHz applications, I personally recommend using the “Massive MIMO” name only for that use case. One can instead say “mmWave Massive MIMO” or just “mmWave” when referring to multi-antenna technologies for mmWave bands.
]]>Prof. Erik G. Larsson gave a 2.5-hour tutorial on the fundamentals of Massive MIMO, which is highly recommended for anyone learning this topic. You can then follow up by reading his book on the same topic.
When you have viewed Erik’s introduction, you can learn more about the state-of-the-art signal processing schemes for Massive MIMO from another talk at the summer school. Dr. Emil Björnson gave a 3-hour tutorial on this topic:
]]>One option is to let the signal power become M times larger than in a single-antenna reference scenario, where M is the number of antennas. The increase in SNR will then lead to higher data rates for the users. The gain can be anything from log2(M) bit/s/Hz to almost negligible, depending on how interference-limited the system is. Another option is to utilize the array gain to reduce the transmit power, to maintain the same SNR as in the reference scenario. The corresponding power saving can be very helpful to improve the energy efficiency of the system.
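The two options can be compared with a small numerical sketch. The SINR model below is a toy assumption that is not taken from this post: the array gain multiplies the desired signal, and a hypothetical `coupling` parameter sets how much interference scales along with it (zero means a noise-limited link).

```python
import math


def sinr(array_gain, snr_ref, coupling):
    """Toy SINR: signal scaled by the array gain, plus interference that scales with it.

    coupling = 0 gives a noise-limited link; a larger coupling (a hypothetical
    parameter) gives a more interference-limited one.
    """
    signal = array_gain * snr_ref
    return signal / (1.0 + coupling * signal)


def rate_gain(M, snr_ref, coupling):
    """Option 1: keep the transmit power and let the signal power grow M times."""
    with_array = math.log2(1.0 + sinr(M, snr_ref, coupling))
    reference = math.log2(1.0 + sinr(1, snr_ref, coupling))
    return with_array - reference


def power_saving_db(M):
    """Option 2: keep the reference SNR and cut the transmit power by a factor M."""
    return 10.0 * math.log10(M)


M = 100
for coupling in (0.0, 0.1, 1.0):
    gain = rate_gain(M, 1e3, coupling)
    print(f"coupling={coupling}: rate gain = {gain:.2f} bit/s/Hz")
print(f"power saving = {power_saving_db(M):.1f} dB")
```

In this toy model, the noise-limited case at high SNR gives a gain close to log2(M), while the gain shrinks towards zero as the coupling grows.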
In the uplink, with single-antenna user terminals, we can choose between these options. However, in the downlink, we might not have a choice. There are strict regulations on the permitted level of out-of-band radiation in practical systems. Since Massive MIMO uses downlink precoding, the transmitted signals from the base station have a stronger directivity than in the single-antenna reference scenario. The signal components that leak into the bands adjacent to the intended frequency band will then also be more directive.
For example, consider a line-of-sight scenario where the precoding creates an angular beam towards the intended user (as illustrated in the figure below). The out-of-band radiation will then get a similar angular directivity and lead to larger interference to systems operating in adjacent bands, if their receivers are close to the user (as the victim in the figure below). To counteract this effect, our only choice might be to reduce the downlink transmit power to keep the worst-case out-of-band radiation constant.
Another alternative is that the regulations are made more flexible with respect to precoded transmissions. The probability that a receiver in an adjacent band is hit by an interfering out-of-band beam, such that the interference becomes M times larger than in the reference scenario, reduces with an increasing number of antennas since the beams are narrower. Hence, if one can allow for beamformed out-of-band interference when it occurs with sufficiently low probability, the array gain in Massive MIMO can still be utilized to increase the SNRs. A third option is then to (partially) reduce the transmit power to also allow for relaxed linearity requirements of the hardware.
These considerations are nicely discussed in an overview article that appeared on arXiv earlier this year. There are also two papers that analyze the impact of out-of-band radiation in Massive MIMO: Paper 1 and Paper 2.
]]>Asymptotic analysis is a popular tool within statistical signal processing (infinite SNR or number of samples), information theory (infinitely long blocks) and more recently, [massive] MIMO wireless communications (infinitely many antennas).
Some caution is strongly advisable with respect to the latter. In fact, there are compelling reasons to avoid asymptotics in the number of antennas altogether:
Finally, and perhaps most importantly, careless use of asymptotic arguments may yield erroneous conclusions. For example, in the effective SINRs of multi-cell Massive MIMO, the coherent interference scales with M (number of antennas), which yields the commonly held misconception that coherent interference is the main impairment caused by pilot contamination. But in fact, in many relevant circumstances it is not (see case studies here): the main impairment for “reasonable” values of M is the reduction in coherent beamforming gain due to reduced estimation quality, which in turn is independent of M.
In addition, the number of antennas beyond which the far-field assumption is violated is actually smaller than what one might first think (problem 3.14).
]]>Many researchers have analyzed pilot contamination over the six years that have passed since Marzetta uncovered its importance in Massive MIMO systems. We now have a quite good understanding of how to mitigate pilot contamination. There is a plethora of different approaches, many of which have complementary benefits. If pilot contamination is not mitigated, it will both reduce the array gain and create coherent interference. Some approaches mitigate the pilot interference in the channel estimation phase, while others combat the coherent interference caused by pilot contamination. In this post, I will try to categorize the approaches and point to some key references.
Interference-rejecting precoding and combining
Pilot contamination makes the estimate of a desired channel correlated with the channel from pilot-sharing users in other cells. When these channel estimates are used for receive combining or transmit precoding, coherent interference typically arises. This is particularly the case if maximum ratio processing is used, because it ignores the interference. If multi-cell MMSE processing is used instead, the coherent interference is rejected in the spatial domain. In particular, recent work from Björnson et al. (see also this related paper) has shown that there is no asymptotic rate limit when using this approach, if there is just a tiny amount of spatial correlation in the channels.
Data-aided channel estimation
Another approach is to “decontaminate” the channel estimates from pilot contamination, by using the pilot sequence and the uplink data for joint channel estimation. This has the potential of both improving the estimation quality (leading to a stronger desired signal) and reducing the coherent interference. Ideally, if the data is known, data-aided channel estimation increases the effective length of the pilot sequences to the length of the uplink transmission block. Since the data is unknown to the receiver, semi-blind estimation techniques are needed to obtain the channel estimates. Ngo et al. and Müller et al. did early work on pilot decontamination for Massive MIMO. Recent work has proved that one can fully decontaminate the estimates as the length of the uplink block grows large, but it remains to find the most efficient semi-blind decontamination approach for practical block lengths.
Pilot assignment and dimensioning
Which subset of users shares a pilot sequence makes a large difference, since users with large pathloss differences and different spatial channel correlation cause less contamination to each other. Recall that higher estimation quality both increases the gain of the desired signal and reduces the coherent interference. Increasing the number of orthogonal pilot sequences is a straightforward way to decrease the contamination, since each pilot can be assigned to fewer users in the network. The price to pay is a larger pilot overhead, but it seems that a reuse factor of 3 or 4 is often suitable from a sum rate perspective in cellular networks. Joint spatial division and multiplexing (JSDM) provides a basic methodology to take spatial correlation into account in the pilot reuse patterns.
Alternatively, pilot sequences can be superimposed on the data sequences, which gives as many orthogonal pilot sequences as the length of the uplink block and thereby reduces the pilot contamination. This approach also removes the pilot overhead, but it comes at the cost of causing interference between pilot and data transmissions. It is therefore important to assign the right fraction of power to pilots and data. A hybrid pilot solution, where some users have superimposed pilots and some have conventional pilots, may bring the best of both worlds.
If two cells use the same subset of pilots, the exact pilot-user assignment can make a large difference. Cell-center users are generally less sensitive to pilot contamination than cell-edge users, but finding the best assignment is a hard combinatorial problem. There are heuristic algorithms that can be used and also an optimization framework that can be used to evaluate such algorithms.
Multi-cell cooperation
A combination of network MIMO and macro diversity can be utilized to turn the coherent interference into desired signals. This approach is called pilot contamination precoding by Ashikhmin et al. and can be applied in both uplink and downlink. In the uplink, the base stations receive different linear combinations of the user signals. After maximum ratio combining, the coefficients in the linear combinations approach deterministic numbers as the number of antennas grows large. These numbers are only non-zero for the pilot-sharing users. Since the macro diversity naturally creates different linear combinations, the base stations can jointly solve a linear system of equations to obtain the transmitted signals. In the downlink, all signals are sent from all base stations and are precoded in such a way that the coherent interference sent from different base stations cancels out. While this is a beautiful approach for mitigating the coherent interference, it relies heavily on channel hardening, favorable propagation, and i.i.d. Rayleigh fading. It remains to be shown if the approach can provide performance gains under more practical conditions.
]]>The important fact is that ergodic capacity can be lower-bounded by a formula of the form log2(1+SINR), where SINR is an “effective SINR” (that includes, among others, the effects of the terminal’s lack of channel knowledge).
This effective SINR scales proportionally to M (number of antennas), for fixed total radiated power. Compared to a single-antenna system, reciprocity always offers M times better “beamforming gain” regardless of the system’s operating point. (In fact, one of the paradoxes of Massive MIMO is that performance always increases with M, despite the fact that there are “more unknowns to estimate”!) And yes, at very low SNR, the effective SINR is proportional to SNR^2, so reciprocity-based beamforming does “break down”; however, it is still M times better than a single-antenna link (with the same total radiated power). One will also, eventually, reach a point where the capacity bound for omnidirectional transmission (e.g., using a space-time code with appropriate dimension reduction in order to host the required downlink pilots) exceeds that of reciprocity-based beamforming; importantly, however, in this regime the bounds may be loose.
These matters, along with numerous case studies involving actual link budget calculations, are of course rigorously explained in our recent textbook.
]]>Have you reflected on what the purpose of asymptotic analysis is? The goal is not that we should design and deploy wireless networks with a nearly infinite number of antennas. Firstly, it is physically impossible to do that in a finite-sized world, irrespective of whether you let the array aperture grow or pack the antennas more densely. Secondly, the conventional channel models break down, since you will eventually receive more power than you transmitted. Thirdly, the technology will be neither cost nor energy efficient, since the cost/energy grows linearly with M (the number of antennas), while the delivered system performance either approaches a finite limit or grows logarithmically with M.
It is important not to overemphasize the implications of asymptotic results. Consider the popular power-scaling law which says that one can use the array gain of Massive MIMO to reduce the transmit power as 1/√M and still approach a non-zero asymptotic rate limit. This type of scaling law has been derived for many different scenarios in different papers. The practical implication is that you can reduce the transmit power as you add more antennas, but the asymptotic scaling law does not prescribe how much you should reduce the power when going from, say, 40 to 400 antennas. It all depends on which rates you want to deliver to your users.
The figure below shows the transmit power in a scenario where we start with 1 W for a single-antenna transmitter and then follow the asymptotic power-scaling law as the number of antennas increases. With M = 100 antennas, the transmit power per antenna is just 1 mW, which is unnecessarily low given the fact that the circuits in the corresponding transceiver chain will consume much more power. By using higher transmit power than 1 mW per antenna, we can deliver higher rates to the users, while barely affecting the total power of the base station.
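The numbers in this example can be checked with a few lines of code. The sketch assumes that the scaling law reduces the total transmit power as 1/sqrt(M), which is the assumption that reproduces the 1 W and 1 mW figures quoted above; the function name is only illustrative.

```python
import math


def scaled_power(M, p_single_w=1.0):
    """Total and per-antenna transmit power (in W) when the total is scaled as 1/sqrt(M)."""
    total = p_single_w / math.sqrt(M)
    return total, total / M


for M in (1, 10, 100, 400):
    total, per_antenna = scaled_power(M)
    print(f"M={M:3d}: total = {total * 1e3:7.1f} mW, per antenna = {per_antenna * 1e3:8.3f} mW")
```

Under this assumption, the per-antenna power falls as M to the power of -3/2, which is why it quickly drops below what the transceiver circuits themselves consume.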
Similarly, there is a hardware-scaling law which says that one can increase the error vector magnitude (EVM) proportionally to M^{1/4} and approach a non-zero asymptotic rate limit. The practical implication is that Massive MIMO systems can use simpler hardware components (that cause more distortion) than conventional systems, since there is a lower sensitivity to distortion. This is the foundation on which the recent works on low-bit ADC resolutions build (see this paper and references therein).
Even the importance of the coherent interference, caused by pilot contamination, is easily overemphasized if one only considers the asymptotic behavior. For example, the finite rate limit that appears when communicating over i.i.d. Rayleigh fading channels with maximum ratio or zero-forcing processing is only approached in practice if one has around one million antennas.
In my opinion, the purpose of asymptotic analysis is not to understand the asymptotic behaviors themselves, but what the asymptotics can tell us about the performance with a practical number of antennas. Here are some usages that I think are particularly sound:
Some form of Massive MIMO will appear in 5G, but to get a well-designed system we need to focus more on demonstrating and optimizing the performance in practical scenarios (e.g., the key 5G use cases) and less on pure asymptotic analysis.
]]>With i.i.d. Rayleigh fading, the channel gain has an Erlang distribution (this is a scaled chi-squared distribution) and the channel direction is uniformly distributed over the unit sphere in C^M, where M is the number of antennas. The channel gain and the channel direction are also independent random variables, which is why this is a spatially uncorrelated channel model.
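These properties can be checked numerically. The sketch below assumes unit-variance complex Gaussian entries and uses hypothetical function names; with that normalization, the channel gain has mean M and variance M, and the gain is uncorrelated with the power of any single component of the direction (a sanity check that is consistent with, but weaker than, independence).

```python
import random
import statistics


def iid_rayleigh_channel(M, rng):
    """Draw a channel vector with M i.i.d. CN(0, 1) entries."""
    s = 0.5 ** 0.5  # standard deviation of the real and imaginary parts
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(M)]


def gain_and_direction(h):
    """Split h into its gain ||h||^2 and its unit-norm direction h / ||h||."""
    gain = sum(abs(x) ** 2 for x in h)
    norm = gain ** 0.5
    return gain, [x / norm for x in h]


def correlation(x, y):
    """Sample correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)
M, trials = 100, 5000
gains, first_component_power = [], []
for _ in range(trials):
    gain, direction = gain_and_direction(iid_rayleigh_channel(M, rng))
    gains.append(gain)
    first_component_power.append(abs(direction[0]) ** 2)

mean_gain = statistics.fmean(gains)
var_gain = statistics.pvariance(gains)
corr = correlation(gains, first_component_power)
print(f"mean gain = {mean_gain:.1f}, variance = {var_gain:.1f}, correlation = {corr:.3f}")
```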
One of the key benefits of i.i.d. Rayleigh fading is that one can compute closed-form rate expressions, at least when using maximum ratio or zero-forcing processing; see Fundamentals of Massive MIMO for details. These expressions have an intuitive interpretation, but should be treated with care because practical channels are not spatially uncorrelated. Firstly, due to the propagation environment, the channel vector is more probable to point in some directions than in others. Secondly, the antennas have spatially dependent antenna patterns. Both factors contribute to the fact that spatial channel correlation always appears in practice.
One of the basic properties of spatial channel correlation is that the base station array receives different average signal power from different spatial directions. This is illustrated in Figure 1 below for a uniform linear array with 100 antennas, where the angle of arrival is measured from the boresight of the array.
As seen from Figure 1, with i.i.d. Rayleigh fading the average received power is equally large from all directions, while with spatially correlated fading it varies depending on the direction in which the base station applies its receive beamforming. Note that this is a numerical example that was generated by letting the signal come from four scattering clusters located in different angular directions. Channel measurements from Lund University (see Figure 4 in this paper) show how the spatial correlation behaves in practical scenarios.
Correlated Rayleigh fading is a tractable way to model a spatially correlated channel vector: h ~ CN(0, R), where the covariance matrix R is also the correlation matrix. It is only when R is a scaled identity matrix that we have spatially uncorrelated fading. The eigenvalue distribution determines how strongly spatially correlated the channel is. If all eigenvalues are identical, then R is a scaled identity matrix and there is no spatial correlation. If there are a few strong eigenvalues that contain most of the power, then there is very strong spatial correlation and the channel vector is very likely to be (approximately) spanned by the corresponding eigenvectors. This is illustrated in Figure 2 below, for the same scenario as in the previous figure. In the considered correlated fading case, there are 20 eigenvalues that are larger than in the i.i.d. fading case. These eigenvalues contain 94% of the power, while the next 20 eigenvalues contain 5% and the smallest 60 eigenvalues only contain 1%. Hence, most of the power is concentrated in a subspace of dimension at most 40, since the 40 strongest eigenvalues contain 99% of the power. The fraction of strong eigenvalues is related to the fraction of the angular interval from which strong signals are received. This relation can be made explicit in special cases.
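The eigenvalue concentration can be explored with a short NumPy sketch. Everything specific in it is an assumption made for illustration: a half-wavelength uniform linear array, four clusters at hypothetical angles with a 5-degree half-width, and equal-power paths. It does not reproduce the exact percentages quoted above.

```python
import numpy as np


def cluster_steering_matrix(M, centers_deg, half_width_deg=5.0, paths=200, spacing=0.5):
    """Array responses (columns) of a uniform linear array for paths in a few clusters.

    Angles are measured from boresight; spacing is in wavelengths.
    """
    angles = np.concatenate([
        np.linspace(c - half_width_deg, c + half_width_deg, paths) for c in centers_deg
    ])
    phase = 2j * np.pi * spacing * np.outer(np.arange(M), np.sin(np.deg2rad(angles)))
    return np.exp(phase)


M = 100
A = cluster_steering_matrix(M, centers_deg=[-50, -10, 20, 60])  # hypothetical clusters
num_paths = A.shape[1]

# Covariance matrix as an average over equal-power paths; trace(R) equals M.
R = A @ A.conj().T / num_paths

# Eigenvalues in decreasing order and the cumulative fraction of power they hold.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
power_fraction = np.cumsum(eigenvalues) / np.sum(eigenvalues)

# One realization h ~ CN(0, R), built from i.i.d. CN(0, 1) path coefficients.
rng = np.random.default_rng(1)
w = (rng.standard_normal(num_paths) + 1j * rng.standard_normal(num_paths)) / np.sqrt(2)
h = A @ w / np.sqrt(num_paths)

print("eigenvalues above the i.i.d. level of 1:", int(np.sum(eigenvalues > 1.0)))
print(f"power in the 40 strongest eigenvalues: {power_fraction[39]:.3f}")
```

With i.i.d. fading, the 40 strongest eigenvalues would hold exactly 40% of the power, so any value well above 0.4 indicates spatial correlation.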
One example of spatially correlated fading is when the correlation matrix has equal diagonal elements and non-zero off-diagonal elements, which describe the correlation between the channel coefficients of different antennas. This is a reasonable model when deploying a compact base station array in a tower. Another example is a diagonal correlation matrix with different diagonal elements. This is a reasonable model when deploying distributed antennas, as in the case of cell-free Massive MIMO.
Finally, a more general channel model is correlated Rician fading: h ~ CN(h̄, R), where the mean value h̄ represents the deterministic line-of-sight channel and the covariance matrix R determines the properties of the fading. The correlation matrix can still be used to determine the spatial correlation of the received signal power. However, from a system performance perspective, the ratio between the power of the line-of-sight path and the scattered paths can have a large impact on the performance as well. A nearly deterministic channel with a large K-factor provides more reliable communication, in particular since under correlated fading it is only the large eigenvalues of R that contribute to the channel hardening (which otherwise provides reliability in Massive MIMO).
]]>Looking back, I am always wondering where the term “Massive MIMO” actually comes from. When we wrote our paper, the terms “large-scale antenna systems (LSAS)” or simply “large-scale MIMO” were commonly used to refer to base stations with very large antenna arrays, and I do not recall what made us choose our title.
The Google Trends Chart for “Massive MIMO” above clearly shows that interest in this topic started roughly at the time Tom Marzetta’s seminal paper was published, although the term itself does not appear in it at all. If anyone has an idea or reference where the term “Massive MIMO” was first used, please feel free to write this in the comment field.
In case you have not read our paper, let me first explain the key question it tries to answer. Marzetta showed in his paper that the simplest form of linear receive combining and transmit precoding, namely maximum ratio combining (MRC) and transmission (MRT), respectively, achieves an asymptotic spectral efficiency (when the number of antennas goes to infinity) that is only limited by coherent interference caused by user equipments (UEs) using the same pilot sequences for channel training (see the previous blog post on pilot contamination). All non-coherent interference such as noise, channel gain uncertainty due to estimation errors, and interference magically vanishes thanks to the strong law of large numbers and favorable propagation. Intrigued by this beautiful result, we wanted to know what happens for a large but finite number of antennas M. Clearly, MRC/MRT are not optimal in this regime, and we wanted to quantify how much can be gained by using more advanced combining/precoding schemes. In other words, our goal was to figure out how many antennas could be “saved” by computing a matrix inverse, which is the key ingredient of the more sophisticated schemes, such as MMSE combining or regularized zero-forcing (RZF) precoding. Moreover, we wanted to compute how much of the asymptotic spectral efficiency can be achieved with M antennas. Please read our paper if you are interested in our findings.
What is interesting to notice is that we (and many other researchers) had always taken the following facts about Massive MIMO for granted and repeated them in numerous papers without further questioning:
We have recently uploaded a new paper on Arxiv which proves that all of these “facts” are incorrect and essentially artifacts from using simplistic channel models and suboptimal precoding/combining schemes. What I find particularly amusing is that we have come to this result by carefully analyzing the asymptotic performance of the multicell MMSE receive combiner that I mentioned but rejected in the 2011 Allerton paper. To understand the difference between the widely used single-cell MMSE (S-MMSE) combining and the (not widely used) multicell MMSE (M-MMSE) combining, let us look at their respective definitions for a base station located in cell :
where and denote the number of cells and UEs per cell, is the estimated channel matrix from the UEs in cell , and and are the covariance matrices of the channel and the channel estimation errors of UE in cell , respectively. While M-MMSE combining uses estimates of the channels from all UEs in all cells, the simpler S-MMSE combining uses only channel estimates from the UEs in the own cell. Importantly, we show that Massive MIMO with M-MMSE combining has unlimited capacity while Massive MIMO with S-MMSE combining has not! This behavior is shown in the following figure:
In the light of this new result, I wish that we had not made the following remark in our 2011 Allerton paper:
“Note that a BS could theoretically estimate all channel matrices (…) to further improve the performance. Nevertheless, high path loss to neighboring cells is likely to render these channel estimates unreliable and the potential performance gains are expected to be marginal.”
We could not have been more wrong about it!
In summary, although we did not understand the importance of M-MMSE combining in 2011, I believe that we were asking the right questions. In particular, the consideration of individual channel covariance matrices for each UE has been an important step for the analysis of Massive MIMO systems. A key lesson that I have learned from this story for my own research is that one should always question fundamental assumptions and wisdom.
]]>I also touched on the (for sub-5 GHz bands somewhat controversial) topic of hybrid beamforming, and whether that would reduce the required amount of hardware.
A question from the audience was whether the use of antennas with larger physical aperture (i.e., intrinsic directivity) would change the conclusions. The answer is no: the use of directional antennas is more or less equivalent to sectorization. The problem is that to exploit the intrinsic gain, the antennas must a priori point “in the right direction”. Hence, in the array, only a subset of the antennas will be useful when serving a particular terminal. This impacts both the channel gain (reduced effective aperture) and orthogonality (see, e.g., Figure 7.5 in this book).
There was also a stimulating panel discussion afterwards. One question discussed in the panel concerned the necessity, or desirability, of using multiple terminal antennas at mmWave. Looking only at the link budget, base station antennas could be traded against terminal antennas. However, that argument neglects the inevitably lost orthogonality, and furthermore it is not obvious how beam-finding/tracking algorithms will perform (millisecond coherence time at pedestrian speeds!). Also, obviously, the comparison I presented is extremely simplistic: to begin with, the line-of-sight scenario is extremely favorable for mmWaves (blocking problems), but also, I entirely neglected polarization losses. Any attempt to compensate for these problems alone is likely to require multiple terminal antennas.
Another topic touched on in the panel was the viability of Massive MIMO implementations. Perhaps the most important comment made in this context was by Ian Wong of National Instruments: “In the past year, we’ve actually shown that [massive MIMO] works in reality … To me, the biggest development is that the skeptics are being quiet.” (Read more about that here.)
]]>While it is known that grid-of-beams solutions perform poorly in isotropic scattering, no prior experimental results are known. This new paper:
Massive MIMO Performance—TDD Versus FDD: What Do Measurements Say?
answers this performance question through the analysis of real Massive MIMO channel measurement data obtained at the 2.6 GHz band. Except for in certain line-of-sight (LOS) environments, the original reciprocity-based TDD Massive MIMO represents the only effective implementation of Massive MIMO at the frequency bands under consideration.
]]>The basic presumption of TDD/reciprocity-based Massive MIMO is that all activity, comprising the transmission of uplink pilots, uplink data and downlink data, takes place inside of a coherence interval:
At fixed mobility, in meters/second, the dimensionality of the coherence interval is proportional to the wavelength, because the Doppler spread is proportional to the carrier frequency.
In a single cell, with max-min fairness power control (for uniform quality-of-service provision), the sum-throughput of Massive MIMO can be computed analytically and is given by the following formula:
In this formula,
This formula assumes independent Rayleigh fading, but the general conclusions remain under other models.
The factor that pre-multiplies the logarithm depends on .
The pre-log factor is maximized when . The maximal value is , which is proportional to , and therefore proportional to the wavelength. Due to the multiplication $B T_c$, one can get the same pre-log factor using a smaller bandwidth by instead increasing the wavelength, i.e., reducing the carrier frequency. At the same time, assuming appropriate scaling of the number of antennas, , with the number of terminals, , the quantity inside of the logarithm is a constant.
Concluding, the sum spectral efficiency (in b/s/Hz) can easily double for every doubling of the wavelength: a megahertz of bandwidth at a 100 MHz carrier is worth ten times more than a megahertz of bandwidth at a 1 GHz carrier. So while there is more bandwidth available at higher carriers, the potential multiplexing gains are correspondingly smaller.
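The factor of ten follows directly from the proportionality between the coherence interval and the wavelength. The sketch below makes this concrete under a rule of thumb that is an assumption rather than something stated in this post: the coherence time is taken as the time it takes to move a fixed fraction of a wavelength (here one half). The ratio between two carriers does not depend on that fraction.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def coherence_interval_samples(carrier_hz, bandwidth_hz, speed_mps, fraction=0.5):
    """Samples per coherence interval, B * T_c, with T_c = fraction * wavelength / speed.

    The value of fraction is an assumed rule of thumb, not taken from the text.
    """
    wavelength = SPEED_OF_LIGHT / carrier_hz
    coherence_time = fraction * wavelength / speed_mps
    return bandwidth_hz * coherence_time


bandwidth = 1e6  # one megahertz, as in the comparison above
speed = 30.0     # hypothetical terminal speed in m/s
low = coherence_interval_samples(100e6, bandwidth, speed)
high = coherence_interval_samples(1e9, bandwidth, speed)
print(f"100 MHz carrier: {low:.0f} samples, 1 GHz carrier: {high:.0f} samples")
print(f"ratio: {low / high:.1f}")
```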
In this example,
all three setups give the same sum-throughput; however, the throughput per terminal is vastly different.
It is the physics that makes it difficult to provide good coverage. The transmitted signals spread out and only a tiny fraction of the transmitted power reaches the receive antenna (e.g., one part in a billion). In cellular networks, the received signal power decays roughly as the fourth power of the propagation distance. This results in the following data rate coverage behavior:
This figure considers an area covered by nine base stations, which are located at the middle of the nine peaks. Users that are close to one of the base stations receive the maximum downlink data rate, which in this case is 60 Mbit/s (e.g., spectral efficiency 6 bit/s/Hz over a 10 MHz channel). As a user moves away from a base station, the data rate drops rapidly. At the cell edge, where the user is equally distant from multiple base stations, the rate is nearly zero in this simulation. This is because the received signal power is low as compared to the receiver noise.
What can be done to improve the coverage?
One possibility is to increase the transmit power. This is mathematically equivalent to densifying the network, so that the area covered by each base station is smaller. The figure below shows what happens if we use 100 times more transmit power:
There are some visible differences as compared to Figure 1. First, the region around the base station that gives 60 Mbit/s is larger. Second, the data rates at the cell edge are slightly improved, but there are still large variations within the area. However, it is no longer the noise that limits the cell-edge rates—it is the interference from other base stations.
The inter-cell interference remains even if we further increase the transmit power. The reason is that the desired signal power as well as the interfering signal power grow in the same manner at the cell edge. Similar things happen if we densify the network by adding more base stations, as nicely explained in a recent paper by Andrews et al.
Ideally, we would like to increase only the power of the desired signals, while keeping the interference power fixed. This is what transmit precoding from a multi-antenna array can achieve; the transmitted signals from the multiple antennas at the base station add constructively only at the spatial location of the desired user. More precisely, the signal power is proportional to M (the number of antennas), while the interference power caused to other users is independent of M. The following figure shows the data rates when we go from 1 to 100 antennas:
Figure 3 shows that the data rates are increased for all users, but particularly for those at the cell edge. In this simulation, everyone is now guaranteed a minimum data rate of 30 Mbit/s, while 60 Mbit/s is delivered in a large fraction of the coverage area.
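The qualitative difference between more transmit power and more antennas can be seen in a toy calculation. The model below is an illustrative assumption, not the simulation behind the figures: a cell-edge user sees one interfering base station that is as strong as the serving one, and coherent precoding scales only the desired signal by M. The -10 dB cell-edge SNR is a hypothetical value.

```python
import math


def cell_edge_sinr(snr, M=1, interferers=1):
    """Toy cell-edge SINR with equally strong desired and interfering links.

    snr is the single-antenna received signal-to-noise ratio of one link.
    Coherent precoding scales only the desired signal by M.
    """
    return M * snr / (interferers * snr + 1.0)


def rate(sinr):
    """Spectral efficiency in bit/s/Hz for a given SINR."""
    return math.log2(1.0 + sinr)


edge_snr = 0.1  # hypothetical: -10 dB at the cell edge
cases = {
    "baseline": cell_edge_sinr(edge_snr),
    "100x transmit power": cell_edge_sinr(100 * edge_snr),
    "100 antennas": cell_edge_sinr(edge_snr, M=100),
}
for name, value in cases.items():
    print(f"{name}: SINR = {value:.2f}, rate = {rate(value):.2f} bit/s/Hz")
```

In this model, the single-antenna SINR can never exceed 1 no matter how much power is used, whereas the SINR with precoding keeps growing with M.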
In practice, the propagation losses are not only distance-dependent, but also affected by other large-scale effects, such as shadowing. The properties described above remain nevertheless. Coherent precoding from a base station with many antennas can greatly improve the data rates for the cell-edge users, since only the desired signal power (and not the interference power) is increased. Higher transmit power or smaller cells will only lead to an interference-limited regime where the cell-edge performance remains poor. A practical challenge with coherent precoding is that the base station needs to learn the user channels, but reciprocity-based Massive MIMO provides a scalable solution to that. That is why Massive MIMO is the key technology for delivering ubiquitous connectivity in 5G.
]]>To look into this, consider a communication system operating over a bandwidth of $B$ Hz. By assuming an additive white Gaussian noise channel, the capacity becomes
$$C = B \log_2 \left( 1 + \frac{P \beta}{B N_0} \right)$$
where $P$ W is the transmit power, $\beta$ is the channel gain, and $N_0$ W/Hz is the power spectral density of the noise. The term $P \beta / (B N_0)$ inside the logarithm is referred to as the signal-to-noise ratio (SNR).
Since the bandwidth appears in front of the logarithm, it might seem that the capacity grows linearly with the bandwidth. This is not the case since the noise term in the SNR also grows linearly with the bandwidth. This fact is illustrated by Figure 1 below, where we consider a system that achieves an SNR of 0 dB at a reference bandwidth of 20 MHz. As we increase the bandwidth towards 2 GHz, the capacity grows only modestly. Despite the 100 times more bandwidth, the capacity only improves by a factor of about 1.44, which is far from the factor of 100 that a linear increase would give.
The reason for this modest capacity growth is the fact that the SNR decreases in inverse proportion to the bandwidth. One can show that
$$C \to \frac{P \beta}{N_0} \log_2(e) \quad \text{as } B \to \infty.$$
The convergence to this limit is seen in Figure 1 and is relatively fast since $\log_2(1+x) \approx x \log_2(e)$ for $0 \leq x \leq 1$.
To achieve a linear capacity growth, we need to keep the SNR fixed as the bandwidth increases. This can be achieved by increasing the transmit power proportionally to the bandwidth, which entails using more power when operating over a wider bandwidth. This might not be desirable in practice, at least not for battery-powered devices.
An alternative is to use beamforming to improve the channel gain. In a Massive MIMO system, the effective channel gain is β = β_{1}M, where M is the number of antennas and β_{1} is the gain of a single-antenna channel. Hence, we can increase the number of antennas proportionally to the bandwidth to keep the SNR fixed.
Figure 2 considers the same setup as in Figure 1, but now we also let either the transmit power or the number of antennas grow proportionally to the bandwidth. In both cases, we achieve a capacity that grows proportionally to the bandwidth, as we initially hoped for.
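The comparison in Figures 1 and 2 can be reproduced numerically with a few lines of Python. This is only a sketch under assumed normalizations: the noise density is set to 1, the transmit power is chosen to give 0 dB SNR at 20 MHz, and the function names are hypothetical rather than taken from the original simulation code.

```python
import math

def awgn_capacity(bandwidth_hz, power_w, channel_gain, noise_psd):
    """Capacity in bit/s of an AWGN channel: B * log2(1 + P*beta / (B*N0))."""
    snr = power_w * channel_gain / (bandwidth_hz * noise_psd)
    return bandwidth_hz * math.log2(1.0 + snr)

# Assumed normalization: N0 = 1 and P chosen so that SNR = 0 dB at 20 MHz.
B_REF = 20e6
N0 = 1.0
P_REF = B_REF * N0
BETA_1 = 1.0  # single-antenna channel gain (assumed)

def capacity_fixed(bandwidth_hz):
    """Fixed transmit power and a single antenna."""
    return awgn_capacity(bandwidth_hz, P_REF, BETA_1, N0)

def capacity_scaled_power(bandwidth_hz):
    """Transmit power grows proportionally to the bandwidth."""
    return awgn_capacity(bandwidth_hz, P_REF * bandwidth_hz / B_REF, BETA_1, N0)

def capacity_scaled_antennas(bandwidth_hz):
    """Number of antennas M grows proportionally to the bandwidth (gain = beta_1 * M)."""
    num_antennas = bandwidth_hz / B_REF
    return awgn_capacity(bandwidth_hz, P_REF, BETA_1 * num_antennas, N0)

if __name__ == "__main__":
    for b in (20e6, 200e6, 2e9):
        print(f"B = {b / 1e6:6.0f} MHz: "
              f"fixed {capacity_fixed(b) / 1e6:8.2f} Mbit/s, "
              f"scaled power {capacity_scaled_power(b) / 1e6:8.2f} Mbit/s, "
              f"scaled antennas {capacity_scaled_antennas(b) / 1e6:8.2f} Mbit/s")
    # Wideband limit for fixed power: (P*beta/N0) * log2(e)
    print(f"limit: {P_REF * BETA_1 / N0 * math.log2(math.e) / 1e6:.2f} Mbit/s")
```

With a fixed power budget the capacity flattens out near the wideband limit, while both scaling strategies keep the SNR at 0 dB and therefore give a capacity proportional to the bandwidth.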
In conclusion, to make efficient use of more bandwidth we require more transmit power or more antennas at the transmitter and/or receiver. It is worth noting that these requirements are purely due to the increase in bandwidth. In addition, for any given bandwidth, the operation at millimeter-wave frequencies requires much more transmit power and/or more antennas (e.g., additional constant-gain antennas or one constant-aperture antenna) just to achieve the same SNR as in a system operating at conventional frequencies below 5 GHz.
]]>The diversity achieved by sending a signal over multiple channels with independent realizations is key to combating small-scale fading. Spatial diversity is particularly attractive, since it can be obtained by simply having multiple antennas at the transmitter or the receiver. Suppose the probability of a bad channel gain realization is p. If we have M antennas with independent channel gains, then the risk that all of them are bad is p^{M}. For example, with p=0.1, there is a 10% risk of getting a bad channel in a single-antenna system and a 0.000001% risk in an 8-antenna system. This shows that just a few antennas can be sufficient to greatly improve reliability.
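The diversity arithmetic is easy to check. The snippet below is a small sketch that assumes fully independent antenna channels, as in the text; the function name is hypothetical.

```python
def prob_all_bad(p_bad, num_antennas):
    """Probability that all independent antenna channels are bad at the same time."""
    return p_bad ** num_antennas

if __name__ == "__main__":
    for m in (1, 2, 4, 8):
        print(f"M = {m}: risk = {100 * prob_all_bad(0.1, m):.6f} %")
```

Correlated antenna channels would reduce the diversity gain, so these numbers should be read as a best case.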
In Massive MIMO systems, with a “massive” number of antennas at the base station, the spatial diversity also leads to something called “channel hardening”. This terminology was used already in a paper from 2004:
B. M. Hochwald, T. L. Marzetta, and V. Tarokh, “Multiple-antenna channel hardening and its implications for rate feedback and scheduling,” IEEE Transactions on Information Theory, vol. 50, no. 9, pp. 1893–1909, 2004.
In short, channel hardening means that a fading channel behaves as if it were a non-fading channel. The randomness is still there but its impact on the communication is negligible. In the 2004 paper, the hardening is measured by dividing the instantaneous supported data rate by the fading-averaged data rate. If the relative fluctuations are small, then the channel has hardened.
Since Massive MIMO systems contain random interference, it is usually the hardening of the channel that the desired signal propagates over that is studied. If the channel is described by a random M-dimensional vector h, then the ratio ||h||^{2}/E{||h||^{2}} between the instantaneous channel gain and its average is considered. If the fluctuations of the ratio are small, then there is channel hardening. With an independent Rayleigh fading channel, the variance of the ratio reduces with the number of antennas as 1/M. The intuition is that the channel fluctuations average out over the antennas. A detailed analysis is available in a recent paper.
The figure above shows how the variance of ||h||^{2}/E{||h||^{2}} decays with the number of antennas. The convergence towards zero is gradual and so is the channel hardening effect. I personally think that you need at least M=50 to truly benefit from channel hardening.
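The 1/M decay can be checked with a short Monte Carlo experiment. The sketch below assumes i.i.d. Rayleigh fading with unit variance per antenna, so that each |h_m|^{2} is exponentially distributed with unit mean and E{||h||^{2}} = M. The function name, the number of trials, and the seed are arbitrary choices made for illustration.

```python
import random

def hardening_variance(num_antennas, num_trials=4000, seed=0):
    """Sample variance of ||h||^2 / E{||h||^2} for i.i.d. Rayleigh fading."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(num_trials):
        # |h_m|^2 is exponential with unit mean when h_m ~ CN(0, 1)
        gain = sum(rng.expovariate(1.0) for _ in range(num_antennas))
        ratios.append(gain / num_antennas)
    mean = sum(ratios) / num_trials
    return sum((r - mean) ** 2 for r in ratios) / num_trials

if __name__ == "__main__":
    for m in (1, 10, 50, 100):
        print(f"M = {m:3d}: simulated {hardening_variance(m):.4f}, theory {1 / m:.4f}")
```

The simulated values should land close to 1/M, up to Monte Carlo noise. Spatially correlated channels would generally harden more slowly than this i.i.d. case.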
Channel hardening has several practical implications. One is the improved reliability of having a nearly deterministic channel, which results in lower latency. Another is the lack of scheduling diversity; that is, one cannot schedule users when their ||h||^{2} are unusually large, since the fluctuations are small. There is also little to gain from estimating the current realization of ||h||^{2}, since it is relatively close to its average value. This can alleviate the need for downlink pilots in Massive MIMO.
]]>A good system design definitely must not ignore pilot interference. While it is easily removed “on the average” through greater-than-one reuse, the randomness present in wireless communications – especially the shadow fading – will occasionally cause a few terminals to be severely hit by pilot contamination and bring down their performance. This is problematic whenever we are concerned about the provision of uniformly great service in the cell – and that is one of the principal selling arguments for Massive MIMO. Notwithstanding, the impact of pilot contamination can be reduced significantly in practice by appropriate pilot reuse and judicious power control. (Chapters 5-6 in Fundamentals of Massive MIMO give many details.)
A more fundamental question is whether pilot contamination could be entirely overcome: Does there exist an upper bound on capacity that saturates as the number of antennas, M, is increased indefinitely? Some have speculated that it cannot be overcome, much in line with known capacity upper bounds for cellular base station cooperation. While this question may be of more academic than practical interest, it has long been open except in some trivial special cases: if the channels of two terminals lie in non-overlapping subspaces and Bayesian channel estimation is used, the channel estimates will not be contaminated, and capacity grows as log(M) when M increases without bound.
A much deeper result is established in this recent paper: the subspaces of the channel covariances may overlap, yet capacity grows as log(M). Technically, Rayleigh fading with spatial correlation is assumed, and the correlation matrices for the contaminating terminals need only be linearly independent as M goes to infinity (exact conditions in the paper). In retrospect, this is not unreasonable given the substantial a priori knowledge exploited by the Bayesian channel estimator, but I found it amazing how weak the required conditions on the correlation matrices are. It remains unclear whether the result generalizes to the case of a growing number of interferers: letting the number of antennas go to infinity and then growing the network is not the same thing as taking an “infinite” (scalable) network and increasing the number of antennas. But this paper elegantly and rigorously answers a long-standing question that has been the subject of much debate in the community – and is a recommended read for anyone interested in the fundamental limits of Massive MIMO.
]]>The base station wants to know the channel responses of its user terminals and these are estimated in the uplink by sending pilot signals. Each pilot signal is corrupted by inter-cell interference and noise when received at the base station. For example, consider the scenario illustrated below where two terminals are transmitting simultaneously, so that the base station receives a superposition of their signals—that is, the desired pilot signal is contaminated.
When estimating the channel from the desired terminal, the base station cannot easily separate the signals from the two terminals. This has two key implications:
First, the interfering signal acts as colored noise that reduces the channel estimation accuracy.
Second, the base station unintentionally estimates a superposition of the channel from the desired terminal and from the interferer. Later, the desired terminal sends payload data and the base station wishes to coherently combine the received signal, using the channel estimate. It will then unintentionally and coherently combine part of the interfering signal as well. This is particularly poisonous when the base station has M antennas, since the array gain from the receive combining increases both the signal power and the interference power proportionally to M. Similarly, when the base station transmits a beamformed downlink signal towards its terminal, it will unintentionally direct some of the signal towards the interferer. This is illustrated below.
In the academic literature, pilot contamination is often studied under the assumption that the interfering terminal sends the same pilot signal as the desired terminal, but in practice any non-orthogonal interfering signal will cause the two effects described above.
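The second effect can be illustrated with a small Monte Carlo experiment. This sketch rests on strong simplifying assumptions that are not in the text: both terminals have i.i.d. Rayleigh fading channels with the same average gain, they send the same pilot, the noise is ignored, and the base station combines with the channel estimate itself (maximum-ratio combining). All function names are hypothetical.

```python
import random

def rayleigh_vector(rng, n):
    """Vector of n i.i.d. CN(0, 1) channel coefficients."""
    s = 0.5 ** 0.5
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]

def inner(a, b):
    """Inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def sir_after_combining(num_antennas, contaminated, num_trials=500, seed=1):
    """Ratio of average desired power to average interference power after combining."""
    rng = random.Random(seed)
    signal_power = 0.0
    interference_power = 0.0
    for _ in range(num_trials):
        h = rayleigh_vector(rng, num_antennas)  # desired terminal
        g = rayleigh_vector(rng, num_antennas)  # interfering terminal
        # A contaminated estimate is the superposition h + g; otherwise h is known.
        estimate = [a + b for a, b in zip(h, g)] if contaminated else h
        signal_power += abs(inner(estimate, h)) ** 2
        interference_power += abs(inner(estimate, g)) ** 2
    return signal_power / interference_power

if __name__ == "__main__":
    for m in (10, 100):
        print(f"M = {m:3d}: clean estimate {sir_after_combining(m, False):7.1f}, "
              f"contaminated estimate {sir_after_combining(m, True):5.2f}")
```

With a clean estimate, the ratio grows roughly in proportion to M. With a contaminated estimate, it stays near one in this symmetric setup, because the array gain amplifies the desired signal and the interference alike.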
]]>Future wireless networks have to manage billions of devices at the same time, and each device needs a high throughput to support applications such as voice, real-time video, and high-quality movies. Cellular networks cannot handle such a huge number of connections since user terminals at the cell boundary suffer from very high interference and hence perform badly. Furthermore, conventional cellular systems are designed mainly for human users. In future wireless networks, machine-type communications such as the Internet of Things, Internet of Everything, Smart X, etc. are expected to play an important role. The main challenge of machine-type communications is scalable and efficient connectivity for billions of devices. Centralized technology with cellular topologies does not seem to work for such scenarios since each cell can cover only a limited number of user terminals. So why not cell-free structures with decentralized technology? Of course, to serve many user terminals and to simplify the signal processing in a distributed manner, Massive MIMO technology should be included. The combination of a cell-free structure and Massive MIMO technology yields a new concept: Cell-Free Massive MIMO.
What is Cell-Free Massive MIMO? Cell-Free Massive MIMO is a system where a massive number of access points distributed over a large area coherently serve a massive number of user terminals in the same time/frequency band. Cell-Free Massive MIMO focuses on cellular frequencies. However, millimeter wave bands can be used in combination with the cellular frequency bands. There are no cells or cell boundaries here. Of course, specific signal processing is used; see [1] for more details. Cell-Free Massive MIMO is a new concept. It is a new practical, useful, and scalable version of network MIMO (or cooperative multipoint joint processing) [2, 3]. To some extent, Massive MIMO technology based on the favorable propagation and channel hardening properties is used in Cell-Free Massive MIMO.
Cell-Free Massive MIMO is different from distributed Massive MIMO [4]. Both systems use many service antennas in a distributed way to serve many user terminals, but they are not entirely the same. With distributed Massive MIMO, the base station antennas are distributed within each cell, and these antennas only serve user terminals within that cell. By contrast, in Cell-Free Massive MIMO there are no cells. All service antennas coherently serve all user terminals. The figure below compares the structures of Cell-Free Massive MIMO and distributed Massive MIMO.
Figure: Distributed Massive MIMO (left) and Cell-Free Massive MIMO (right).
[1] H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, “Cell-Free Massive MIMO versus Small Cells,” IEEE Trans. Wireless Commun., 2016 submitted for publication. Available: https://arxiv.org/abs/1602.08232
[2] G. Foschini, K. Karakayali, and R. A. Valenzuela, “Coordinating multiple antenna cellular networks to achieve enormous spectral efficiency,” IEE Proc. Commun., vol. 153, pp. 548–555, Aug. 2006.
[3] E. Björnson, R. Zakhour, D. Gesbert, B. Ottersten, “Cooperative Multicell Precoding: Rate Region Characterization and Distributed Strategies with Instantaneous and Statistical CSI,” IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4298-4310, Aug. 2010.
[4] K. T. Truong and R.W. Heath Jr., “The viability of distributed antennas for massive MIMO systems,” in Proc. Asilomar CSSC, 2013, pp. 1318–1323.
]]>“Massive MIMO is a useful and scalable version of Multiuser MIMO. There are three fundamental distinctions between Massive MIMO and conventional Multiuser MIMO. First, only the base station learns G. Second, M is typically much larger than K, although this does not have to be the case. Third, simple linear signal processing is used both on the uplink and on the downlink. These features render Massive MIMO scalable with respect to the number of base station antennas, M.”
(Note: M is the number of antennas, K is the number of users, and G denotes the channel matrix).
In [2], we find another definition:
“Massive MIMO is a multi-user MIMO system with M antennas and K users per BS. The system is characterized by M ≫ K and operates in TDD mode using linear uplink and downlink processing.”
Both are nice general definitions that cover most systems that are commonly called “Massive MIMO”. However, their generality also makes them vague, and they fail to pinpoint the essence of Massive MIMO. Here is my take on a slightly more precise definition:
“Massive MIMO is a multi-user MIMO system that (1) serves multiple users through spatial multiplexing over a channel with favorable propagation in time-division duplex and (2) relies on channel reciprocity and uplink pilots to obtain channel state information.”
Now, you might ask: So what is then “favorable propagation”? We need a second definition:
“The propagation is said to be favorable when users are mutually orthogonal in some practical sense.”
Again you ask: in what practical sense? If h∈ℂᴹ is the channel vector to one user and g∈ℂᴹ the channel vector to another, the users are said to be orthogonal if hᴴg = 0. Unfortunately, this is never true in a real system. It can be practically true, however, if we say that users are practically orthogonal when hᴴg/(‖h‖‖g‖) has mean zero and a variance that is much smaller than one.
There we go: a more-or-less rigorous definition of Massive MIMO. Note that this definition does not require the number of users to be small in any sense. So, to the big question: How many antennas does a base station need to be “massive”? The answer is given for the i.i.d. Rayleigh fading channel in the following curve that shows how the users’ channels become practically orthogonal as the number of antennas is increased.
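A rough numerical check of this notion of practical orthogonality is sketched below for i.i.d. Rayleigh fading. Since hᴴg/(‖h‖‖g‖) has zero mean in this model, its variance equals the average of its squared magnitude, which is 1/M for independent isotropic channel directions. The function names, trial count, and seed are illustrative assumptions.

```python
import random

def rayleigh_vector(rng, n):
    """Vector of n i.i.d. CN(0, 1) channel coefficients."""
    s = 0.5 ** 0.5
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]

def orthogonality_variance(num_antennas, num_trials=500, seed=0):
    """Average of |h^H g|^2 / (||h||^2 ||g||^2) over random channel pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_trials):
        h = rayleigh_vector(rng, num_antennas)
        g = rayleigh_vector(rng, num_antennas)
        inner = sum(a.conjugate() * b for a, b in zip(h, g))
        norm_h_sq = sum(abs(a) ** 2 for a in h)
        norm_g_sq = sum(abs(b) ** 2 for b in g)
        total += abs(inner) ** 2 / (norm_h_sq * norm_g_sq)
    return total / num_trials

if __name__ == "__main__":
    for m in (1, 10, 100):
        print(f"M = {m:3d}: simulated {orthogonality_variance(m):.4f}, theory {1 / m:.4f}")
```

How small the variance must be to count as "much smaller than one" remains a judgment call; the simulation only shows how fast it shrinks with M in this idealized channel model.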
In that application, I don’t think so. Here is why.
What ultimately limits Massive MIMO is mobility: no more than half of the coherence time-bandwidth product should be occupied by pilot transmission activities. (This is the “half and half rule”.) In macro-cellular at 3 GHz, with highway mobility we may have on the order of 200 kHz x 1 millisecond coherence; that is 200 samples. With pilot reuse of 3 (that practically does away with pilot contamination), we could then ultimately learn the channel to some 30 simultaneously served terminals – assuming mutually orthogonal pilots. Once the number of base station antennas M reaches beyond twice this number, with some margin – say M=100 – the spectral efficiency grows logarithmically with M. That means that even doubling M yields only a 3 dB effective SINR increase, that is, a single extra bit per second/Hz per terminal. Beyond M=100 or M=200, it may not be worth it. Multiple antennas are only truly useful if they are used to multiplex, and mobility limits the amount of multiplexing we can perform.
So why not increase the number of antennas tenfold for additional coverage? It may not be worth it either. Going from M=200 to M=2000 gives 10 dB – that pays for a 75% range extension, or, alternatively, a tenth of the losses incurred by an energy-saving coated window glass.
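The back-of-the-envelope numbers in this argument can be reproduced as follows. The sketch uses hypothetical function names, and the range calculation relies on an assumed pathloss exponent of 4, which is not stated in the text but gives a result close to the quoted 75%.

```python
import math

def max_terminals(coherence_samples, pilot_reuse, pilot_fraction=0.5):
    """Terminals that get orthogonal pilots when a fraction of the samples carry pilots."""
    pilot_samples = int(coherence_samples * pilot_fraction)
    return pilot_samples // pilot_reuse

def array_gain_db(m_old, m_new):
    """SINR gain in dB from increasing the number of antennas."""
    return 10 * math.log10(m_new / m_old)

def extra_bits_per_terminal(m_old, m_new):
    """High-SINR rate increase in bit/s/Hz per terminal."""
    return math.log2(m_new / m_old)

def range_extension(gain_db, pathloss_exponent):
    """Fractional range increase that a given dB gain buys (assumed power-law pathloss)."""
    return 10 ** (gain_db / (10 * pathloss_exponent)) - 1

if __name__ == "__main__":
    print("terminals:", max_terminals(200, 3))
    print(f"doubling M: {array_gain_db(100, 200):.2f} dB, "
          f"{extra_bits_per_terminal(100, 200):.1f} extra bit/s/Hz")
    gain = array_gain_db(200, 2000)
    print(f"M=200 -> M=2000: {gain:.1f} dB, "
          f"range +{100 * range_extension(gain, 4.0):.0f} % (exponent 4 assumed)")
```

A smaller pathloss exponent would make the same 10 dB go further, so the range figure depends strongly on the propagation environment.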
In stationary environments, the story is different – a topic that we will be returning to.
]]>By spectral efficiency, we usually mean the sum spectral efficiency of the transmissions in a cell of a cellular network. It is measured in bit/s/Hz. If you multiply it by the bandwidth, you will get the cell throughput measured in bit/s. Since the bandwidth is a scarce resource, particularly at the frequencies below 5 GHz that are suitable for network coverage, it is highly desirable to improve the cell throughput by increasing the spectral efficiency rather than increasing the bandwidth.
A great way to improve the spectral efficiency is to simultaneously serve many user terminals in the cell, over the same bandwidth, by means of space division multiple access. This is where Massive MIMO is king. There is no doubt that this technology can improve the spectral efficiency. The question is rather “how much?”
Earlier this year, the joint experimental effort by the universities in Bristol and Lund demonstrated an impressive spectral efficiency of 145.6 bit/s/Hz, over a 20 MHz bandwidth in the 3.5 GHz band. The experiment was carried out in a single-cell indoor environment. Their huge spectral efficiency can be compared with 3 bit/s/Hz, which is the IMT Advanced requirement for 4G. The remarkable Massive MIMO gain was achieved by spatial multiplexing of data signals to 22 users using 256-QAM. The raw spectral efficiency is 176 bit/s/Hz, but 17% was lost for practical reasons. You can read more about this measurement campaign here:
http://www.bristol.ac.uk/news/2016/may/5g-wireless-spectrum-efficiency.html
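The reported numbers fit together as a quick calculation shows. The sketch below only uses the figures quoted above; the function name is hypothetical.

```python
import math

def raw_spectral_efficiency(num_users, qam_order):
    """Sum spectral efficiency in bit/s/Hz with one uncoded QAM stream per user."""
    return num_users * math.log2(qam_order)

if __name__ == "__main__":
    raw = raw_spectral_efficiency(22, 256)   # 22 users, 8 bits per 256-QAM symbol
    achieved = 145.6                         # bit/s/Hz, as reported
    bandwidth_hz = 20e6
    print(f"raw: {raw:.0f} bit/s/Hz")
    print(f"lost: {100 * (1 - achieved / raw):.1f} %")
    print(f"throughput: {achieved * bandwidth_hz / 1e9:.3f} Gbit/s")
    print(f"gain over IMT Advanced (3 bit/s/Hz): {achieved / 3:.1f}x")
```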
256-QAM is generally not an option in cellular networks, due to the inter-cell interference and unfavorable cell edge conditions. Numerical simulations can, however, predict the practically achievable spectral efficiency. The figure below shows the uplink spectral efficiency for a base station with 200 antennas that serves a varying number of users. Interference from many tiers of neighboring cells is considered. Zero-forcing detection, pilot-based channel estimation, and power control that gives every user 0 dB SNR are assumed. Different curves are shown for different values of τ_{c}, which is the number of symbols per channel coherence interval. The curves have several peaks, since the results are optimized over different pilot reuse factors.
From this simulation figure we observe that the spectral efficiency grows linearly with the number of users for the first 30-40 users. For larger user numbers, the spectral efficiency saturates due to interference and limited channel coherence. The top value of each curve is in the range from 60 to 110 bit/s/Hz, which is a remarkable improvement over the 3 bit/s/Hz of IMT Advanced.
In conclusion, 20x-40x improvements in spectral efficiency over IMT Advanced are what to expect from Massive MIMO.
]]>suggest the use of 1-bit ADCs in Massive MIMO base station receivers. These are important studies of a concept that offers great potential for cost savings and simplification of transceiver hardware.
Granted, much lower resolution will be sufficient in Massive MIMO than in conventional MIMO, but will one bit be sufficient? These papers indicate that the price to pay is not insignificant: the number of antennas may have to be doubled in some cases. Also, while the use of symbol-sampled models as in these studies may give correct order-of-magnitude estimates of capacity, much future work appears to remain to understand the effects of digital channelization/prefiltering and sampling rate conversion if 1-bit frontends are going to be used.
]]>