All posts by Emil Björnson

Approaches to Mitigate Pilot Contamination

Many researchers have analyzed pilot contamination over the six years that have passed since Marzetta uncovered its importance in Massive MIMO systems. We now have a quite good understanding of how to mitigate pilot contamination. There is a plethora of different approaches, whereof many have complementary benefits. If pilot contamination is not mitigated, it will both reduce the array gain and create coherent interference. Some approaches mitigate the pilot interference in the channel estimation phase, while some approaches combat the coherent interference caused by pilot contamination. In this post, I will try to categorize the approaches and point to some key references.

Interference-rejecting precoding and combining

Pilot contamination makes the estimate of a desired channel correlated with the channel from pilot-sharing users in other cells. When these channel estimates are used for receive combining or transmit precoding, coherent interference typically arise. This is particularly the case if maximum ratio processing is used, because it ignores the interference. If multi-cell MMSE processing is used instead, the coherent interference is rejected in the spatial domain. In particular, recent work from Björnson et al. (see also this related paper) have shown that there is no asymptotic rate limit when using this approach, if there is just a tiny amount of spatial correlation in the channels.

Data-aided channel estimation

Another approach is to “decontaminate” the channel estimates from pilot contamination, by using the pilot sequence and the uplink data for joint channel estimation. This have the potential of both improving the estimation quality (leading to a stronger desired signal) and reducing the coherent interference. Ideally, if the data is known, data-aided channel estimation increase the length of the pilot sequences to the length of the uplink transmission block. Since the data is unknown to the receiver, semi-blind estimation techniques are needed to obtain the channel estimates. Ngo et al. and Müller et al. did early works on pilot decontamination for Massive MIMO. Recent work has proved that one can fully decontaminate the estimates, as the length of the uplink block grows large, but it remains to find the most efficient semi-blind decontamination approach for practical block lengths.

Pilot assignment and dimensioning

Which subset of users that share a pilot sequence makes a large difference, since users with large pathloss differences and different spatial channel correlation cause less contamination to each other. Recall that higher estimation quality both increases the gain of the desired signal and reduces the coherent interference. Increasing the number of orthogonal pilot sequences is a straightforward way to decrease the contamination, since each pilot can be assigned to fewer users in the network. The price to pay is a larger pilot overhead, but it seems that a reuse factor of 3 or 4 is often suitable from a sum rate perspective in cellular networks. The joint spatial division and multiplexing (JSDM) provides a basic methodology to take spatial correlation into account in the pilot reuse patterns.

A cellular network with different pilot reuse factors: Reuse 1 (left), Reuse 3 (middle), Reuse 4 (right). The cells with the same color uses the same subset of pilots.

Alternatively, pilot sequences can be superimposed on the data sequences, which gives as many orthogonal pilot sequences as the length of the uplink block and thereby reduces the pilot contamination. This approach also removes the pilot overhead, but it comes at the cost of causing interference between pilot and data transmissions. It is therefore important to assign the right fraction of power to pilots and data. A hybrid pilot solution, where some users have superimposed pilots and some have conventional pilots, may bring the best of both worlds.

If two cells use the same subset of pilots, the exact pilot-user assignment can make a large difference. Cell-center users are generally less sensitive to pilot contamination than cell-edge users, but finding the best assignment is a hard combinatorial problem. There are heuristic algorithms that can be used and also an optimization framework that can be used to evaluate such algorithms.

Multi-cell cooperation

A combination of network MIMO and macro diversity can be utilized to turn the coherent interference into desired signals. This approach is called pilot contamination precoding by Ashikhmin et al. and can be applied in both uplink and downlink. In the uplink, the base stations receive different linear combinations of the user signals. After maximum ratio combining, the coefficients in the linear combinations approach deterministic numbers as the number of antennas grow large. These numbers are only non-zero for the pilot-sharing users. Since the macro diversity naturally creates different linear combinations, the base stations can jointly solve a linear system of equations to obtain the transmitted signals. In the downlink, all signals are sent from all base stations and are precoded in such a way that the coherent interference sent from different base stations cancel out. While this is a beautiful approach for mitigating the coherent interference, it relies heavily on channel hardening, favorable propagation, and i.i.d. Rayleigh fading. It remains to be shown if the approach can provide performance gains under more practical conditions.

What is the Purpose of Asymptotic Analysis?

Since its inception, Massive MIMO has been strongly connected with asymptotic analysis. Marzetta’s seminal paper featured an unlimited number of base station antennas. Many of the succeeding papers considered a finite number of antennas, M, and then analyzed the performance in the limit where M\to\infty. Massive MIMO is so tightly connected with asymptotic analysis that reviewers question whether a paper is actually about Massive MIMO if it does not contain an asymptotic part – this has happened to me repeatedly.

Have you reflected over what the purpose of asymptotic analysis is? The goal is not that we should design and deploy wireless networks with a nearly infinite number of antennas. Firstly, it is physically impossible to do that in a finite-sized world, irrespective of whether you let the array aperture grow or pack the antennas more densely. Secondly, the conventional channel models break down, since you will eventually receive more power than you transmitted. Thirdly, the technology will neither be cost nor energy efficient, since the cost/energy grows linearly with M, while the delivered system performance either approaches a finite limit or grows logarithmically with M.

It is important not to overemphasize the implications of asymptotic results. Consider the popular power-scaling law which says that one can use the array gain of Massive MIMO to reduce the transmit power as 1/\sqrt{M} and still approach a non-zero asymptotic rate limit. This type of scaling law has been derived for many different scenarios in different papers. The practical implication is that you can reduce the transmit power as you add more antennas, but the asymptotic scaling law does not prescribe how much you should reduce the power when going from, say, 40 to 400 antennas. It all depends on which rates you want to deliver to your users.

The figure below shows the transmit power in a scenario where we start with 1 W for a single-antenna transmitter and then follow the asymptotic power-scaling law as the number of antennas increases. With M=100 antennas, the transmit power per antenna is just 1 mW, which is unnecessarily low given the fact that the circuits in the corresponding transceiver chain will consume much more power. By using higher transmit power than 1 mW per antenna, we can deliver higher rates to the users, while barely effecting the total power of the base station.

Reducing the transmit power per antenna to 1 mW, or smaller, makes little practical sense, since the transceiver chain will consume much more power irrespective of the transmit power.

Similarly, there is a hardware-scaling law which says that one can increase the error vector magnitude (EVM) proportionally to M^{1/4} and approach a non-zero asymptotic rate limit. The practical implication is that Massive MIMO systems can use simpler hardware components (that cause more distortion) than conventional systems, since there is a lower sensitivity to distortion. This is the foundation on which the recent works on low-bit ADC resolutions builds (see this paper and references therein).

Even the importance of the coherent interference, caused by pilot contamination, is easily overemphasized if one only considers the asymptotic behavior.  For example, the finite rate limit that appears when communicating over i.i.d. Rayleigh fading channels with maximum ratio or zero-forcing processing is only approached in practice if one has around one million antennas.

In my opinion, the purpose of asymptotic analysis is not to understand the asymptotic behaviors themselves, but what the asymptotics can tell us about the performance at practical number of antennas. Here are some usages that I think are particularly sound:

  • Determine what is the asymptotically optimal transmission scheme and then evaluate how it performs in a practical system.
  • Derive large-scale approximations of the rates that are reasonable tight also at practical number of antennas. One can use these approximations to determine which factors that have a dominant impact on the rate or to get a tractable way to optimize system performance (e.g., by transmit power allocation).
  • Determine how far from the asymptotically achievable performance a practical system is.
  • Determine if we can deliver any given user rates by simply deploying enough antennas, or if the system is fundamentally interference limited.
  • Simplify the signal processing by utilizing properties such as channel hardening and favorable propagation. These phenomena can be observed already at 100 antennas, although you will never get a fully deterministic channel or zero inter-user interference in practice.

Some form of Massive MIMO will appear in 5G, but to get a well-designed system we need to focus more on demonstrating and optimizing the performance in practical scenarios (e.g., the key 5G use cases) and less on pure asymptotic analysis.

What is Spatial Channel Correlation?

The channel between a single-antenna user and an M-antenna base station can be represented by an M-dimensional channel vector. The canonical channel model in the Massive MIMO literature is independent and identically distributed (i.i.d.) Rayleigh fading, in which the vector is a circularly symmetric complex Gaussian random variable with a scaled identity matrix as correlation/covariance matrix: \mathbf{h} \sim CN(\mathbf{0},\beta \mathbf{I}_M), where \beta is the variance.

With i.i.d. Rayleigh fading, the channel gain \|\mathbf{h}\|^2 has an Erlang(M,1/\beta)-distribution (this is a scaled \chi^2 distribution) and the channel direction \mathbf{h} / \|\mathbf{h}\| is uniformly distributed over the unit sphere in \mathbb{C}^M. The channel gain and the channel direction are also independent random variables, which is why this is a spatially uncorrelated channel model.

One of the key benefits of i.i.d. Rayleigh fading is that one can compute closed-form rate expressions, at least when using maximum ratio or zero-forcing processing; see Fundamentals of Massive MIMO for details. These expressions have an intuitive interpretation, but should be treated with care because practical channels are not spatially uncorrelated. Firstly, due to the propagation environment, the channel vector is more probable to point in some directions than in others. Secondly, the antennas have spatially dependent antenna patterns. Both factors contribute to the fact that spatial channel correlation always appears in practice.

One of the basic properties of spatial channel correlation is that the base station array receives different average signal power from different spatial directions. This is illustrated in Figure 1 below for a uniform linear array with 100 antennas, where the angle of arrival is measured from the boresight of the array.

Figure 1: The average signal power received at a Massive MIMO base station from different angular directions, as seen from the array. Spatially correlated fading implies that this average power is angle-dependent, while i.i.d. fading gives the same power in all directions.


As seen from Figure 1, with i.i.d. Rayleigh fading the average received power is equally large from all directions, while with spatially correlated fading it varies depending on in which direction the base station applies its receive beamforming. Note that this is a numerical example that was generated by letting the signal come from four scattering clusters located in different angular directions. Channel measurements from Lund University (see Figure 4 in this paper) show how the spatial correlation behaves in practical scenarios.

Correlated Rayleigh fading is a tractable way to model a spatially  correlation channel vector: \mathbf{h} \sim CN(\mathbf{0}, \mathbf{B}), where the covariance matrix \mathbf{B} is also the correlation matrix. It is only when \mathbf{B} is a scaled identity matrix that we have spatially uncorrelated fading. The eigenvalue distribution determines how strongly spatially correlated the channel is. If all eigenvalues are identical, then \mathbf{B} is a scaled identity matrix and there is no spatial correlation. If there are a few strong eigenvalues that contain most of the power, then there is very strong spatial correlation and the channel vector is very likely to be (approximately) spanned by the corresponding eigenvectors. This is illustrated in Figure 2 below, for the same scenario as in the previous figure. In the considered correlated fading case, there are 20 eigenvalues that are larger than in the i.i.d. fading case. These eigenvalues contain 94% of the power, while the next 20 eigenvalues contain 5% and the smallest 60 eigenvalues only contain 1%. Hence, most of the power is concentrated to a subspace of dimension \leq40. The fraction of strong eigenvalues is related to the fraction of the angular interval from which strong signals are received. This relation can be made explicit in special cases.

Figure 2: Spatial channel correlation results in eigenvalue variations, while all eigenvalues are the same under i.i.d fading. The larger the variations, the stronger the correlation is.


One example of spatially correlated fading is when the correlation matrix has equal diagonal elements and non-zero off-diagonal elements, which describe the correlation between the channel coefficients of different antennas. This is a reasonable model when deploying a compact base station array in tower. Another example is a diagonal correlation matrix with different diagonal elements. This is a reasonable model when deploying distributed antennas, as in the case of cell-free Massive MIMO.

Finally, a more general channel model is correlated Rician fading: \mathbf{h} \sim CN(\mathbf{b}, \mathbf{B}), where the mean value \mathbf{b} represents the deterministic line-of-sight channel and the covariance matrix \mathbf{B} determines the properties of the fading. The correlation matrix \mathbf{B}+\mathbf{b}\mathbf{b}^H can still be used to determine the spatial correlation of the received signal power. However, from a system performance perspective, the fraction k=\| \mathbf{b} \|^2/\mathrm{tr}(\mathbf{B}) between the power of the line-of-sight path and the scattered paths can have a large impact on the performance as well. A nearly deterministic channel with a large  k-factor provide more reliable communication, in particular since under correlated fading it is only the large eigenvalues of \mathbf{B} that contributes to the channel hardening (which otherwise provides reliability in Massive MIMO).

Reproducible Massive MIMO Research

Reproducibility is fundamental to scientific research. If you develop a new algorithm and use simulations/experiments to claim its superiority over prior algorithms, your claims are only credible if other researchers can reproduce and confirm them.

The first step towards reproducibility is to describe the simulation procedure in such detail that another researcher can repeat the simulation, but a major effort is typically needed to reimplement everything. The second step is to make the simulation code publicly available, so that any scientist can review it and easily reproduce the results. While the first step is mandatory for publishing a scientific study, there is a movement towards open science that would make also the second step a common practice.

I understand that some researchers are skeptical towards sharing their simulation code, in fear of losing their competitive advantage towards other research groups. My personal principle is to not share any code until the research study is finished and the results have been accepted for publication in a full-length journal. After that, I think that the society benefits the most if other researcher can focus on improving my and others’ research, instead of spending excessive amount of time on reimplementing known algorithms. I also believe that the primary competitive advantage in research is the know-how and technical insights, while the simulation code is of secondary importance.

On my GitHub page, I have published Matlab code packages that reproduces the simulation results in one book, one book chapter, and more than 15 peer-reviewed articles. Most of these publications are related to MIMO or Massive MIMO. I see many benefits from doing this:

1) It increases the credibility of my research group’s work;

2) I write better code when I know that other people will read it;

3) Other researchers can dedicate their time into developing new improved algorithms and compare them with my baseline implementations;

4) Young scientists may learn how to implement a basic simulation environment by reading the code.

I hope that other Massive MIMO researchers will also make their simulation code publicly available. Maybe you have already done that? In that case, please feel free to write a comment to this post with a link to your code.

Book Review: The 5G Myth

The 5G Myth is the provocative title of a recent book by William Webb, CEO of Weightless SIG, a standard body for IoT/M2M technology. In this book, the author tells a compelling story of a stagnating market for cellular communications, where the customers are generally satisfied with the data rates delivered by the 4G networks. The revenue growth for the mobile network operators (MNOs) is relatively low and also in decay, since the current services are so good that the customers are unwilling to pay more for improved service quality. Although many new wireless services have materialized over the past decade (e.g., video streaming, social networks, video calls, mobile payment, and location-based services), the MNOs have failed to take the leading role in any of them. Instead, the customers make use of external services (e.g., Youtube, Facebook, Skype, Apple Pay, and Google Maps) and only pay the MNOs to deliver the data bits.

The author argues that, under these circumstances, the MNOs have little to gain from investing in 5G technology. Most customers are not asking for any of the envisaged 5G services and will not be inclined to pay extra for them. Webb even compares the situation with the prisoner’s dilemma: the MNOs would benefit the most from not investing in 5G, but they will anyway make investments to avoid a situation where customers switch to a competitor that has invested in 5G. The picture that Webb paints of 5G is rather pessimistic compared to a recent McKinsey report, where the more cost-efficient network operation is described as a key reason for MNOs to invest in 5G.

The author provides a refreshing description of the market for cellular communications, which is important in a time when the research community focuses more on broad 5G visions than on the customers’ actual needs. The book is thus a recommended read for 5G researchers, since we should all ask ourselves if we are developing a technology that tackles the right unsolved problems.

Webb does not only criticize the economic incentives for 5G deployment, but also the 5G visions and technologies in general. The claims are in many cases reasonable; for example, Webb accurately points out that most of the 5G performance goals are overly optimistic and probably only required by a tiny fraction of the user base. He also accurately points out that some “5G applications” already have a wireless solution (e.g., indoor IoT devices connected over WiFi) or should preferably be wired (e.g., ultra-reliable low-latency applications such as remote surgery).

However, it is also in this part of the book that the argumentation sometimes falls short. For example, Webb extrapolates a recent drop in traffic growth to claim that the global traffic volume will reach a plateau in 2027. It is plausible that the traffic growth rate will reduce as a larger and larger fraction of the global population gets access to wireless high-speed connections. But one should bear in mind that we have witnessed an exponential growth in wireless communication traffic for the past century (known as Cooper’s law), so this trend can just as well continue for a few more decades, potentially at a lower growth rate than in the past decade.

Webb also provides a misleading description of multiuser MIMO by claiming that 1) the antenna arrays would be unreasonable large at cellular frequencies and 2) the beamforming requires complicated angular beam-steering. These are two of the myths that we dispelled in the paper “Massive MIMO: Ten myths and one grand question” last year. In fact, testbeds have demonstrated that massive multiuser MIMO is feasible in lower frequency bands, and particularly useful to improve the spectral efficiency through coherent beamforming and spatial multiplexing of users. Reciprocity-based beamforming is a solution for mobile and cell-edge users, for which angular beam-steering indeed is inefficient.

The book is not as pessimistic about the future as it might seem from this review. Webb provides an alternative vision for future wireless communications, where consistent connectivity rather than higher peak rates is the main focus. This coincides with one of the 5G performance goals (i.e., 50 Mbit/s everywhere), but Webb advocates an extensive government-supported deployment of WiFi instead of 5G technology. The use WiFi is not a bad idea; I personally consume relatively little cellular data since WiFi is available at home, at work, and at many public locations in Sweden. However, the cellular services are necessary to realize the dream of consistent connectivity, particularly outdoors and when in motion. This is where a 5G cellular technology that delivers better coverage and higher data rates at the cell edge is highly desirable. Reciprocity-based Massive MIMO seems to be the solution that can deliver this, thus Webb would have had a stronger case if this technology was properly integrated into his vision.

In summary, the combination of 5G Massive MIMO for wide-area coverage and WiFi for local-area coverage might be the way to truly deliver consistent connectivity.

Teaching the Principles of Massive MIMO

In January this year, the IEEE Signal Processing Magazine contained an article by Erik G. Larsson, Danyo Danev, Mikael Olofsson, and Simon Sörman on “Teaching the Principles of Massive MIMO: Exploring reciprocity-based multiuser MIMO beamforming using acoustic waves“. It describes an exciting approach to teach the basics of Massive MIMO communication by implementing the system acoustically, using loudspeaker elements instead of antennas. The fifth-year engineering students at Linköping University have performed such implementations in 2014, 2015, and 2016, in the form of a conceive-design-implement-operate (CDIO) project.

The article details the teaching principles and experiences that the teachers and students had from the 2015 edition of the CDIO-project. This was also described in a previous blog post. In the following video, the students describe and demonstrate the end-result of the 2016 edition of the project. The acoustic testbed is now truly massive, since 64 loudspeakers were used.

More Demanding Massive MIMO Trials Using the Bristol Testbed

Last year, the 128-antenna Massive MIMO testbed at University of Bristol was used to set world records in per-cell spectral efficiency. Those measurements were conducted in a controlled indoor environment, but demonstrated that the theoretical gains of the technology are also practically achievable—at least in simple propagation scenarios.

The Bristol team has now worked with British Telecom and conducted trials at their site in Adastral Park, Suffolk, in more demanding user scenarios. In the indoor exhibition hall trial,  24 user streams were multiplexed over a 20 MHz bandwidth, resulting in a sum rate of 2 Gbit/s or a spectral efficiency of 100 bit/s/Hz/cell.

Several outdoor experiments were also conducted, which included user mobility. We are looking forward to see more details on these experiments, but in the meantime one can have a look at the following video:

Update: We have corrected the bandwidth number in this post.