Our 2014 massive MIMO tutorial paper won the IEEE ComSoc best tutorial paper award this year. The idea when writing that paper was to summarize the state of the technology and to point out research directions that were relevant (at that time). It is, of course, reassuring to see that many of those research directions evolved into entire sub-fields themselves in our community. Naturally, in envisioning these directions I also made some speculations.
It looks to me now that two of these speculations were wrong:
- First, “Massive MIMO increases the robustness against both unintended man-made interference and intentional jamming.” This is only true with some qualifiers, or possibly not true at all. (Actually I don’t really know, and I don’t think it is known for sure. It seems that this question remains a rather pertinent research direction for anyone interested in physical layer security and MIMO.) Subsequent research by others showed that Massive MIMO can be extraordinarily susceptible to attacks on the pilot channels, revealing an important, fundamental vulnerability at least if standard pilot-based channel estimation is used and no excess dimensions are “wasted” on interference suppression or detection. Basically this pilot channel attack exploits the so-called pilot contamination phenomenon, “hijacking” the reciprocity-based beamforming mechanism.
- Second, “In a way, massive MIMO relies on the law of large numbers to make sure that noise, fading, and hardware imperfections average out when signals from a large number of antennas are combined in the air.” This is not generally true, except for in-band distortion and with many simultaneously multiplexed users and frequency selective Rayleigh fading. In general the distortion that results from hardware imperfections is correlated among the antennas. In the special case of line-of-sight with a single terminal, an important basic reference case, the distortion is identical (up to a phase shift) at all antennas, hence resulting in a rank-one transmission: the distortion is beamformed in the same direction as the signal of interest and hardware imperfections do not “average out” at all.
This is particularly serious for out-band effects. Readers interested in a thorough mathematical treatment may consult my student’s recent Ph.D. dissertation.
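The line-of-sight special case in the second point can be checked numerically. The sketch below is a toy model, not the treatment in the dissertation: it assumes a memoryless third-order nonlinearity with a hypothetical coefficient `a3` at every antenna, unit-modulus line-of-sight channel coefficients, and conjugate beamforming to a single terminal. Under those assumptions the distortion matrix across antennas has rank one, and the signal-to-distortion ratio at the terminal equals the one at each antenna, so nothing averages out.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 64, 500  # number of antennas and time samples (hypothetical values)

# Line-of-sight channel to a single terminal: pure phase shifts.
h = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, M))

# Complex Gaussian data symbols with unit average power.
s = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2.0)

# Conjugate beamforming: antenna m transmits conj(h[m]) * s / sqrt(M).
x = np.outer(np.conj(h), s) / np.sqrt(M)          # shape (M, N)

# Assumed hardware model: memoryless third-order nonlinearity per antenna.
a3 = -0.1
distortion = a3 * x * np.abs(x) ** 2              # shape (M, N)

# The distortion is the same waveform at every antenna, up to a phase shift,
# so the M x N distortion matrix has a single dominant singular value.
singular_values = np.linalg.svd(distortion, compute_uv=False)
rank_ratio = singular_values[1] / singular_values[0]

# Signal-to-distortion ratio at one antenna and at the terminal.
sdr_antenna = np.mean(np.abs(x[0]) ** 2) / np.mean(np.abs(distortion[0]) ** 2)
sdr_terminal = np.mean(np.abs(h @ x) ** 2) / np.mean(np.abs(h @ distortion) ** 2)

print("second/first singular value:", rank_ratio)
print("SDR at antenna  (dB):", 10 * np.log10(sdr_antenna))
print("SDR at terminal (dB):", 10 * np.log10(sdr_terminal))
```

With many multiplexed users and rich fading, the per-antenna waveforms differ and the same experiment would give a distortion matrix of higher rank; that case is not modeled here.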
Have you found any more? Let me know. The knowledge in the field continues to evolve.
One reason why capacity lower bounds are so useful is that they are accurate proxies for link-level performance with modern coding. But this fact, well known to information and coding theorists, is often contested by practitioners. I will discuss some possible reasons for that here.
The recipe is to compute the capacity bound and, depending on the code blocklength, add one or a few dB to the required SNR. That gives the link performance prediction. The coding literature is full of empirical results showing how far from capacity a code of a given block length is for the AWGN channel, and this gap is usually not extremely different for other channel models – although one should always check this.
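As a minimal sketch of this recipe for the plain AWGN channel: invert the capacity formula to get the SNR that the target rate requires, then add the coding gap. The function names and the gap values below are hypothetical; the actual gap depends on the code and blocklength and has to be taken from the coding literature or from simulation.

```python
import math

def required_snr_db(spectral_efficiency, gap_db):
    """Predicted SNR (dB) needed to support a rate on an AWGN channel.

    spectral_efficiency: target rate in bit/s/Hz.
    gap_db: assumed gap to capacity of the code in dB (hypothetical input).
    """
    shannon_snr = 2.0 ** spectral_efficiency - 1.0   # inverts log2(1 + SNR)
    return 10.0 * math.log10(shannon_snr) + gap_db

def predicted_rate(snr_db, gap_db):
    """Predicted rate (bit/s/Hz) at a given SNR, after backing off by the gap."""
    effective_snr = 10.0 ** ((snr_db - gap_db) / 10.0)
    return math.log2(1.0 + effective_snr)

# Example: rate 1 bit/s/Hz with an assumed 1.5 dB gap.
print(required_snr_db(1.0, 1.5))
print(predicted_rate(5.0, 1.5))
```

For a Massive MIMO capacity bound, the same back-off would be applied to the effective SINR inside the bound rather than to the AWGN SNR.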
But there are three main caveats with this:
- First, the capacity bound, or the “SINR” that it often contains, must be information-theoretically correct. A great many papers get this wrong. Emil explained some common errors in his blog post last week. The recommended approach is to map the channel onto one of the canonical cases in Figure 2.9 in Fundamentals of Massive MIMO, verify that the technical conditions are satisfied, and use the corresponding formula.
- When computing expressions of the type E[log(1+“SINR”)], the average should be taken over all quantities that are random within the duration of a codeword. Typically, this means averaging over the randomness incurred by the noise, channel estimation errors, and in many cases the small-scale fading. All other parameters must be kept fixed. Typically, user positions, path losses, shadow fading, scheduling and pilot assignments are fixed, so the expectation is conditional on those. (Yet, the interference statistics may vary substantially if other users are dropping in and out of the system.) This in turn means that many “drops” have to be generated, where these parameters are drawn at random, and then CDF curves with respect to that second level of randomness need to be computed (numerically). Think of the expectation E[log(1+“SINR”)] as a “link simulation”. Every codeword sees many independent noise realizations, and typically small-scale fading realizations, but the same realization of the user positions. Also, often, neat (and tight) closed-form bounds on E[log(1+“SINR”)] are available.
- Care is advised when working with relatively short blocks (less than a few hundred bits) and at rates close to the constrained capacity with the foreseen modulation format. In this case, many of the “standard” capacity bounds become overoptimistic. As a rule of thumb, compare the capacity of an AWGN channel with the constrained capacity of the chosen modulation at the spectral efficiency of interest, and if the gap is small, the capacity bounds will be useful. If not, then reconsider the choice of modulation format! (See also homework problem 1.4.)
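The two levels of randomness in the second caveat can be sketched as follows. This is a toy stand-in, not one of the canonical cases from the book: it assumes a single-antenna Rayleigh-fading link with perfect channel knowledge, log-normal shadowing with hypothetical parameters, and no interference. The inner average plays the role of E[log(1+“SINR”)] for one drop; the outer loop generates the drops from which the CDF is computed.

```python
import numpy as np

def ergodic_rate_per_drop(rng, num_drops=200, num_fading=2000,
                          mean_snr_db=5.0, shadow_std_db=8.0):
    """Return one conditional ergodic rate (bit/s/Hz) per drop."""
    # Outer level: large-scale parameters, fixed for the whole codeword.
    snr_db = mean_snr_db + shadow_std_db * rng.standard_normal(num_drops)
    snr = 10.0 ** (snr_db / 10.0)
    # Inner level: small-scale Rayleigh fading, |h|^2 is exponential with mean 1.
    fading_gain = rng.exponential(1.0, size=(num_drops, num_fading))
    # Average over the fading only, conditional on each drop's large-scale SNR.
    return np.mean(np.log2(1.0 + snr[:, None] * fading_gain), axis=1)

rng = np.random.default_rng(0)
rates = ergodic_rate_per_drop(rng)

# Empirical CDF over drops (the second level of randomness).
sorted_rates = np.sort(rates)
cdf = np.arange(1, rates.size + 1) / rates.size
five_percent_rate = np.quantile(rates, 0.05)
print("5% outage rate over drops (bit/s/Hz):", five_percent_rate)
```

Averaging over the drops as well would collapse the CDF into a single number and hide the spread between well-placed and badly-placed users, which is exactly what the CDF is meant to show.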
How far are the bounds from the actual capacity typically? Nobody knows, but there are good reasons to believe they are extremely close. Here (Figure 1) is a nice example that compares a decoder that uses the measured channel likelihood, instead of assuming a Gaussian (which is implied by the typical bounding techniques). From correspondence with one of the authors: “The dashed and solid lines are the lower bound obtained by Gaussianizing the interference, while the circles are the rate achievable by a decoder exploiting the non-Gaussianity of the interference, painfully computed through days-long Monte-Carlo. (This is not exactly the capacity, because the transmit signals here are Gaussian, so one could deviate from Gaussian signaling and possibly do slightly better — but the difference is imperceptible in all the experiments we’ve done.)”
Concerning Massive MIMO and its capacity bounds, I have long encountered arguments that these capacity formulas aren’t useful estimates of actual performance. But in fact, they are: in one simulation study we were less than one dB from the capacity bound by using QPSK and a standard LDPC code (albeit with fairly long blocks). This bound accounts for noise and channel estimation errors. Such examples are in Chapter 1 of Fundamentals of Massive MIMO, and also in the ten-myth paper:
(I wrote the simulation code, and can share it, in case anyone would want to reproduce the graphs.)
So in summary, while capacity bounds are sometimes done wrong, when done right they give pretty good estimates of actual link performance with modern coding.
(With thanks to Angel Lozano for discussions.)
I never thought it would happen so fast. When I started to work on Massive MIMO in 2009, the general view was that fully digital, phase-coherent operation of so many antennas would be infeasible, and that power consumption of digital and analog circuitry would prohibit implementations for the foreseeable future. More seriously, reservations were voiced that reciprocity-based beamforming would not work, or that operation in mobile conditions would be impossible.
These arguments, it turned out, all proved to be wrong. In 2017, Massive MIMO was the main physical-layer technology under standardization for 5G, and it is unlikely that any serious future cellular wireless communications system would not have Massive MIMO as a main technology component.
But Massive MIMO is more than a groundbreaking technology for wireless communications: it is also an elegant and mathematically rigorous approach to teaching wireless communications. In the moderately-large number-of-antennas regime, our closed-form capacity bounds become convenient proxies for the link performance achievable with practical coding and modulation.
These expressions take into account the effects of all significant physical phenomena: small-scale and large-scale fading, intra- and inter-cell interference, channel estimation errors, pilot reuse (also known as pilot contamination) and power control. A comprehensive analytical understanding of these phenomena simply has not been possible before, as the corresponding information theory has been too complicated for any practical use.
The intended audiences of Fundamentals of Massive MIMO are engineers and students. I anticipate that as graduate courses on the topic become commonplace, our extensive problem set (with solutions) available online will serve as a useful resource to instructors. While other books and monographs will likely appear down the road, focusing on trendier and more recent research, Fundamentals of Massive MIMO distills the theory and facts that will prevail for the foreseeable future. This, I hope, will become its most lasting impact.
The concept of superimposed pilots is (at least 15 years) old, but clever and intriguing. The idea is to add pilot and data samples together, instead of separating them in time and/or frequency, before modulating with waveforms. More recently, the authors of this paper argued, based on certain simulations supported by asymptotic analysis, that in massive MIMO superimposed pilots provide superior performance and that there are strong reasons for superimposed pilots to make their way to practical use.
Until recently, a more rigorous analysis was unavailable. Some weeks ago the authors of this paper argued that, all things considered, the use of superimposed pilots does not offer any appreciable gains for practically interesting use cases. The analysis was based on a capacity-bounding approach for finite numbers of antennas and finite channel coherence, but it assumed the most basic form of signal processing for detection and decoding.
There still remains some hope of seeing improvements, by implementing more advanced signal processing, like zero-forcing, multicell MMSE decoding, or iterative decoding algorithms, perhaps involving “turbo” information exchange between the demodulator, channel estimation, and detector. It will be interesting to follow future work by these two groups of authors to understand how large improvements (if any) superimposed pilots eventually can give.
There are at least two general lessons to learn here. First, that performance predictions based on asymptotics can be misleading in practically relevant cases. (I have discussed this issue before.) The best way to perform analysis is to use rigorous capacity lower bounds, or possibly, in isolated cases of interest, link-level simulations with channel coding (for which, as it turns out, capacity bounds are a very good proxy). Second, more concretely, that while it may be tempting to superimpose-squeeze multiple symbols into the same time-frequency-space resource, once all sources of impairments (channel estimation errors, interference) are accurately accounted for, the gains tend to evaporate. (It is for the same reason that NOMA offers no substantial gains in MIMO systems – a topic that I may return to at a later time.)
Yes, my group had its share of rejected papers as well. Here are some that I especially remember:
- Massive MIMO: 10 myths and one critical question. The first version was rejected by the IEEE Signal Processing Magazine. The main comment was that nobody would think that the points that we had phrased as myths were true. But in reality, each one of the myths was based on an actual misconception heard in public discussions! The paper was eventually published in the IEEE Communications Magazine instead in 2016, and has been cited more than 180 times.
- Massive MIMO with 1-bit ADCs. This paper was rejected by the IEEE Transactions on Wireless Communications. By no means a perfect paper… but the review comments were mostly nonsensical. The editor stated: “The concept as such is straightforward and the conceptual novelty of the manuscript is in that sense limited.” The other authors left my group shortly after the paper was written. I did not predict the hype on 1-bit ADCs for MIMO that would ensue (and this happened despite the fact that yes, the concept as such is straightforward and its conceptual novelty is rather limited!). Hence I didn’t prioritize a rewrite and resubmission. The paper was never published, but we put the rejected manuscript on arXiv in 2014, and it has been cited 80 times.
- Finally, a paper that was almost rejected upon its initial submission: Energy and Spectral Efficiency of Very Large Multiuser MIMO Systems, eventually published in the IEEE Transactions on Communications in 2013. The review comments included obvious nonsense, such as “Overall, there is not much difference in theory compared to what was studied in the area of MIMO for the last ten years.” The paper subsequently won the IEEE ComSoc Stephen O. Rice Prize, and has more than 1300 citations.
There are several lessons to learn here. First, that peer review may be the best system we know, but it isn’t perfect: disturbingly, it is often affected by incompetence and bias. Second, notwithstanding the first, that many paper rejections are probably also grounded in genuine misunderstandings: writing well takes a lot of experience, and a lot of hard, dedicated work. Finally, and perhaps most significantly, that persistence is really an essential component of success.
I am borrowing the title from a column written by my advisor two decades ago, in the array signal processing gold rush era.
Asymptotic analysis is a popular tool within statistical signal processing (infinite SNR or number of samples), information theory (infinitely long blocks) and more recently, [massive] MIMO wireless communications (infinitely many antennas).
Some caution is strongly advisable with respect to the latter. In fact, there are compelling reasons to avoid asymptotics in the number of antennas altogether:
- First, elegant, rigorous and intuitively comprehensible capacity bound formulas are available in closed form.
The proofs of these expressions use basic random matrix theory, but no asymptotics at all.
- Second, the notion of “asymptotic limit” or “asymptotic behavior” helps propagate the myth that Massive MIMO somehow relies on asymptotics or “infinite” numbers (or even exorbitantly large numbers) of antennas.
- Third, many approximate performance results for Massive MIMO (particularly “deterministic equivalents”) based on asymptotic analysis are complicated, require numerical evaluation, and offer little intuitive insight. (And, the verification of their accuracy is a formidable task.)
Finally, and perhaps most importantly, careless use of asymptotic arguments may yield erroneous conclusions. For example, in the effective SINRs in multi-cell Massive MIMO, the coherent interference scales with M (number of antennas) – which yields the commonly held misconception that coherent interference is the main impairment caused by pilot contamination. But in fact, in many relevant circumstances it is not (see case studies here): the main impairment for “reasonable” values of M is the reduction in coherent beamforming gain due to reduced estimation quality, which in turn is independent of M.
In addition, the number of antennas beyond which the far-field assumption is violated is actually smaller than what one might first think (problem 3.14).
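The two effects of pilot contamination discussed above can be separated in a small numerical sketch. It assumes a simplified two-cell uplink with one user per cell, both users sharing the same pilot, maximum-ratio processing, full power, and the same normalized SNR `rho` for pilots and data. The effective-SINR expression is the standard closed form for this setting as I recall it, so it should be checked against the reference before being relied on. All parameter values and names (`beta_own`, `beta_cross`, `tau_p`) are hypothetical.

```python
import math

def uplink_mr_terms(M, beta_own, beta_cross, rho=1.0, tau_p=10):
    """Split the effective uplink SINR (maximum ratio) into its components.

    beta_own:   large-scale gain from the home-cell user to the home base station.
    beta_cross: large-scale gain from the pilot-sharing user in the other cell.
    """
    est_denominator = 1.0 + tau_p * rho * (beta_own + beta_cross)
    # Mean-square channel estimates with pilot contamination.
    gamma_own = tau_p * rho * beta_own ** 2 / est_denominator
    gamma_cross = tau_p * rho * beta_cross ** 2 / est_denominator
    # Estimate quality if the pilot were not reused.
    gamma_clean = tau_p * rho * beta_own ** 2 / (1.0 + tau_p * rho * beta_own)

    coherent_gain = M * rho * gamma_own
    coherent_interference = M * rho * gamma_cross      # scales with M
    noise_and_noncoherent = 1.0 + rho * (beta_own + beta_cross)
    sinr = coherent_gain / (noise_and_noncoherent + coherent_interference)
    return {
        "sinr_db": 10.0 * math.log10(sinr),
        # Loss in coherent beamforming gain; does not depend on M.
        "gain_loss_db": 10.0 * math.log10(gamma_own / gamma_clean),
        # Coherent interference relative to noise plus non-coherent interference.
        "coherent_to_other_db": 10.0 * math.log10(
            coherent_interference / noise_and_noncoherent),
    }

for M in (32, 64, 128, 256):
    terms = uplink_mr_terms(M, beta_own=1.0, beta_cross=0.1)
    print(M, {name: round(value, 2) for name, value in terms.items()})
```

Which of the two effects dominates depends on the SNR, the pilot length and the ratio of the large-scale gains, so the printed numbers are only meaningful for the assumed parameters; the point of the sketch is that the gain loss stays fixed while the coherent interference term grows with M.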