
Reflections on “Massive MIMO: How Many Antennas Do We Need?”

Sometime last week, the paper “Massive MIMO in the UL/DL of Cellular Networks: How Many Antennas Do We Need?” that I co-authored reached 1000 citations (according to Google Scholar). I feel that this is a good moment to share some reflections on this work and to discuss some conclusions we drew too hastily. The paper is an extension of a conference paper that appeared at the 2011 Allerton Conference. At that time, we could by no means anticipate the impact Massive MIMO would have, and many people were quite doubtful about the technology (including myself). I still remember very well a heated discussion with an esteemed Bell Labs colleague who tried to convince me that there would never, ever be more than two active RF inputs into a base station!

Looking back, I still wonder where the term “Massive MIMO” actually comes from. When we wrote our paper, the terms “large-scale antenna systems (LSAS)” or simply “large-scale MIMO” were commonly used to refer to base stations with very large antenna arrays, and I do not recall what made us choose our title.

The Google Trends chart for “Massive MIMO” above clearly shows that interest in this topic started roughly at the time Tom Marzetta’s seminal paper was published, although the term itself does not appear in that paper at all. If anyone has an idea or a reference for where the term “Massive MIMO” was first used, please feel free to share it in the comment field.

In case you have not read our paper, let me first explain the key question it tries to answer. Marzetta showed in his paper that the simplest forms of linear receive combining and transmit precoding, namely maximum ratio combining (MRC) and maximum ratio transmission (MRT), respectively, achieve an asymptotic spectral efficiency (when the number of antennas goes to infinity) that is only limited by coherent interference caused by user equipments (UEs) using the same pilot sequences for channel training (see the previous blog post on pilot contamination). All non-coherent impairments, such as noise, channel gain uncertainty due to estimation errors, and non-coherent interference, magically vanish thanks to the strong law of large numbers and favorable propagation. Intrigued by this beautiful result, we wanted to know what happens for a large but finite number of antennas M. Clearly, MRC/MRT are not optimal in this regime, and we wanted to quantify how much could be gained by using more advanced combining/precoding schemes. In other words, our goal was to figure out how many antennas could be “saved” by computing a matrix inverse, which is the key ingredient of more sophisticated schemes, such as MMSE combining or regularized zero-forcing (RZF) precoding. Moreover, we wanted to compute how much of the asymptotic spectral efficiency can be achieved with M antennas. Please read our paper if you are interested in our findings.

It is interesting to note that we (and many other researchers) had always taken the following “facts” about Massive MIMO for granted and repeated them in numerous papers without questioning them further:

  • Due to pilot contamination, Massive MIMO has a finite asymptotic capacity
  • MRC/MRT are asymptotically optimal
  • More sophisticated receive combining and transmit precoding schemes can only improve the performance for finite M

We have recently uploaded a new paper to arXiv which proves that all of these “facts” are incorrect and essentially artifacts of using simplistic channel models and suboptimal precoding/combining schemes. What I find particularly amusing is that we arrived at this result by carefully analyzing the asymptotic performance of the multicell MMSE receive combiner that I had mentioned, but rejected, in the 2011 Allerton paper. To understand the difference between the widely used single-cell MMSE (S-MMSE) combining and the (not widely used) multicell MMSE (M-MMSE) combining, let us look at their respective definitions for a base station located in cell j:

     $$ \mathbf{V}^{\textrm{M-MMSE}}_j = \left( \sum_{l=1}^{L} \hat{\mathbf{H}}_l \hat{\mathbf{H}}_l^H + \sum_{l=1}^L \sum_{i=1}^K \mathbf{C}_{li} + \sigma^2 \mathbf{I}_M \right)^{-1} \hat{\mathbf{H}}_{j} $$

     $$ \mathbf{V}^{\textrm{S-MMSE}}_j = \left( \hat{\mathbf{H}}_j \hat{\mathbf{H}}_j^H + \sum_{i=1}^K \mathbf{C}_{ji} + \sum_{l=1, l\neq j}^L \sum_{i=1}^K \mathbf{R}_{li} + \sigma^2 \mathbf{I}_M \right)^{-1} \hat{\mathbf{H}}_{j} $$

where L and K denote the number of cells and UEs per cell, \hat{\mathbf{H}}_j\in \mathbb{C}^{M\times K} is the estimated channel matrix from the UEs in cell j, and \mathbf{R}_{li} and \mathbf{C}_{li} are the covariance matrices of the channel and the channel estimation errors of UE i in cell l, respectively. While M-MMSE combining uses estimates of the channels from all UEs in all cells, the simpler S-MMSE combining only uses channel estimates from the UEs in its own cell. Importantly, we show that Massive MIMO with M-MMSE combining has unlimited capacity, while Massive MIMO with S-MMSE combining does not! This behavior is shown in the following figure:
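
To make the difference concrete in code, here is a minimal sketch (my own illustration, not code from the paper) of how the two combiners are constructed from channel estimates; all dimensions, variable names, and covariance models below are made up for illustration.

    import numpy as np

    # Minimal sketch: build the M-MMSE and S-MMSE combining matrices for cell j
    # from synthetic channel estimates (illustrative dimensions and statistics).
    rng = np.random.default_rng(0)
    M, K, L, sigma2, j = 64, 8, 4, 1.0, 0

    # H_hat[l]: M x K matrix of estimated channels from the K UEs in cell l.
    # C[l][i]:  M x M estimation-error covariance of UE i in cell l.
    # R[l][i]:  M x M channel covariance of UE i in cell l.
    H_hat = [(rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
             for _ in range(L)]
    C = [[0.1 * np.eye(M) for _ in range(K)] for _ in range(L)]
    R = [[1.0 * np.eye(M) for _ in range(K)] for _ in range(L)]

    # M-MMSE combining: channel estimates from all L cells enter the inverse.
    A_m = sum(H_hat[l] @ H_hat[l].conj().T for l in range(L)) \
        + sum(C[l][i] for l in range(L) for i in range(K)) + sigma2 * np.eye(M)
    V_mmmse = np.linalg.solve(A_m, H_hat[j])

    # S-MMSE combining: only own-cell estimates; other cells enter via R only.
    A_s = H_hat[j] @ H_hat[j].conj().T + sum(C[j][i] for i in range(K)) \
        + sum(R[l][i] for l in range(L) for i in range(K) if l != j) + sigma2 * np.eye(M)
    V_smmse = np.linalg.solve(A_s, H_hat[j])

    print(V_mmmse.shape, V_smmse.shape)  # both (M, K), one combining vector per UE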

In light of this new result, I wish that we had not made the following remark in our 2011 Allerton paper:

“Note that a BS could theoretically estimate all channel matrices \mathbf{H}_l (…) to further improve the performance. Nevertheless, high path loss to neighboring cells is likely to render these channel estimates unreliable and the potential performance gains are expected to be marginal.”

We could not have been more wrong about it!

In summary, although we did not understand the importance of M-MMSE combining in 2011, I believe that we were asking the right questions. In particular, the consideration of individual channel covariance matrices for each UE has been an important step for the analysis of Massive MIMO systems. A key lesson that I have learned from this story for my own research is that one should always question fundamental assumptions and conventional wisdom.

Massive MIMO at 60 GHz vs. 2 GHz: How Many More Antennas?

The Brooklyn summit last week was a great event. I gave a talk (here are the slides) comparing MIMO at “PCS” (2 GHz) and mmWave (60 GHz) in line-of-sight. There are two punchlines: first, scientifically, while a link budget calculation might predict that 128,000 mmWave antennas are needed to match the performance of 128-antenna PCS MIMO, there is a countervailing effect in that increasing the number of antennas improves channel orthogonality, so that only 10,000 antennas are required. Second, practically, although 10,000 is a lot less than 128,000, it is still a very large number! Here is a writeup with some more detail on the comparison.
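
For intuition on where numbers of this magnitude come from, here is a rough back-of-the-envelope sketch (my own simplification, not necessarily the calculation behind the talk): with constant-gain antenna elements at both ends, the free-space path loss grows as the square of the carrier frequency, so matching the link budget alone would require the array gain to grow by a factor (60/2)^2 = 900.

    # Rough back-of-the-envelope sketch (not the exact calculation from the talk):
    # with constant-gain antenna elements, free-space path loss scales as f^2, so
    # matching the link budget requires the antenna count to grow by (f_mmw/f_pcs)^2.
    f_pcs, f_mmw = 2e9, 60e9             # carrier frequencies [Hz]
    M_pcs = 128                          # antennas in the PCS array
    M_naive = M_pcs * (f_mmw / f_pcs) ** 2
    print(f"naive link-budget estimate: {M_naive:.0f} antennas")  # ~115,000, i.e., order 10^5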

I also touched upon the topic of hybrid beamforming (somewhat controversial for sub-5 GHz bands) and whether it would reduce the required amount of hardware.

A question from the audience was whether the use of antennas with larger physical aperture (i.e., intrinsic directivity) would change the conclusions. The answer is no: the use of directional antennas is more or less equivalent to sectorization. The problem is that to exploit the intrinsic gain, the antennas must a priori point “in the right direction”. Hence, only a subset of the antennas in the array will be useful when serving a particular terminal. This impacts both the channel gain (reduced effective aperture) and the orthogonality (see, e.g., Figure 7.5 in this book).

There was also a stimulating panel discussion afterwards. One question discussed in the panel concerned the necessity, or desirability, of using multiple terminal antennas at mmWave. Looking only at the link budget, base station antennas could be traded against terminal antennas; however, that argument neglects the inevitable loss of orthogonality, and furthermore it is not obvious how beam-finding/tracking algorithms will perform (millisecond coherence times at pedestrian speeds!). Also, obviously, the comparison I presented is extremely simplistic: to begin with, the line-of-sight scenario is extremely favorable for mmWaves (no blocking problems), and I also entirely neglected polarization losses. Any attempts to compensate for these problems are likely to require multiple terminal antennas.

Other topics touched upon in the panel included the viability of Massive MIMO implementations. Perhaps the most important comment in this context was made by Ian Wong of National Instruments: “In the past year, we’ve actually shown that [massive MIMO] works in reality … To me, the biggest development is that the skeptics are being quiet.” (Read more about that here.)

How Much Performance is Lost by FDD Operation?

There has been a long-standing debate on the relative performance between reciprocity-based (TDD) Massive MIMO and that of FDD solutions based on grid-of-beams, or hybrid beamforming architectures. The matter was, for example, the subject of a heated debate in the 2015 Globecom industry panel “Massive MIMO vs FD-MIMO: Defining the next generation of MIMO in 5G” where on the one hand, the commercial arguments for grid-of-beams solutions were clear, but on the other hand, their real potential for high-performance spatial multiplexing was strongly contested.

While it is known that grid-of-beams solutions perform poorly in isotropic scattering, there have been no prior experimental results. This new paper:

Massive MIMO Performance—TDD Versus FDD: What Do Measurements Say?

answers this performance question through the analysis of real Massive MIMO channel measurement data obtained in the 2.6 GHz band. Except in certain line-of-sight (LOS) environments, the original reciprocity-based TDD Massive MIMO represents the only effective implementation of Massive MIMO at the frequency bands under consideration.

Relative Value of Spectrum

What is worth more: 1 MHz of bandwidth at a 100 MHz carrier frequency, or 10 MHz of bandwidth at a 1 GHz carrier? Conventional wisdom has it that higher carrier frequencies are more valuable because “there is more bandwidth there”. In this post, I will explain why that is not entirely correct.

The basic premise of TDD/reciprocity-based Massive MIMO is that all activity, comprising the transmission of uplink pilots, uplink data, and downlink data, takes place inside a coherence interval:

At fixed mobility, in meters per second, the dimensionality of the coherence interval is proportional to the wavelength, because the Doppler spread is proportional to the carrier frequency.

In a single cell, with max-min fairness power control (for uniform quality-of-service provision), the sum-throughput of Massive MIMO can be computed analytically and is given by the following formula:

In this formula,

  • $B$ = bandwidth in Hertz (split equally between uplink and downlink)
  • $M$ = number of base station antennas
  • $K$ = number of multiplexed terminals
  • $B_c$ = coherence bandwidth in Hertz (independent of carrier frequency)
  • $T_c$ = coherence time in seconds (inversely proportional to carrier frequency)
  • SNR = signal-to-noise ratio (“normalized transmit power”)
  • $\beta_k$ = path loss for the k-th terminal
  • $\gamma_k$ = constant, close to $\beta_k$ with sufficient pilot power

This formula assumes independent Rayleigh fading, but the general conclusions remain under other models.

The factor that pre-multiplies the logarithm depends on $K$. The pre-log factor is maximized when $K=B_c T_c/2$, and the maximal value is $B B_c T_c/8$, which is proportional to $T_c$ and therefore to the wavelength. Because of the product $B T_c$, one can obtain the same pre-log factor with a smaller bandwidth by increasing the wavelength, i.e., by reducing the carrier frequency. At the same time, assuming appropriate scaling of the number of antennas, $M$, with the number of terminals, $K$, the quantity inside of the logarithm is a constant.
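
For completeness, here is the underlying optimization spelled out, assuming the pre-log factor has the form $\frac{B}{2} K (1 - K/(B_c T_c))$ (half the bandwidth per link direction and a fraction $K/(B_c T_c)$ of each coherence interval spent on uplink pilots), which is consistent with the statements above:

     $$ \frac{d}{dK}\left[ \frac{B}{2} K \left(1 - \frac{K}{B_c T_c}\right) \right] = \frac{B}{2}\left(1 - \frac{2K}{B_c T_c}\right) = 0 \quad \Longrightarrow \quad K = \frac{B_c T_c}{2}, $$

     $$ \text{and the maximum is} \quad \frac{B}{2} \cdot \frac{B_c T_c}{2}\left(1 - \frac{1}{2}\right) = \frac{B B_c T_c}{8}. $$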

Concluding, the sum spectral efficiency (in b/s/Hz) can easily double for every doubling of the wavelength: a megahertz of bandwidth at a 100 MHz carrier is worth ten times more than a megahertz of bandwidth at a 1 GHz carrier. So while there is more bandwidth available at higher carrier frequencies, the potential multiplexing gains are correspondingly smaller.

In this example, all three setups give the same sum-throughput; however, the throughput per terminal is vastly different.
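
As a rough numerical illustration (with made-up numbers, not those from the original example): let the bandwidth scale with the carrier frequency and the coherence time scale inversely with it, so that the maximal pre-log factor $B B_c T_c/8$ stays fixed while the number of multiplexed terminals, and hence the per-terminal throughput, changes by an order of magnitude between setups.

    # Illustrative numbers only: the bandwidth B scales with the carrier fc, the
    # coherence time Tc scales inversely with fc, so B*Bc*Tc/8 is the same in all
    # three setups while the per-terminal share of it is not.
    Bc = 200e3                                            # coherence bandwidth [Hz], carrier-independent
    setups = [(100e6, 1e6), (1e9, 10e6), (10e9, 100e6)]   # (carrier fc [Hz], bandwidth B [Hz])
    for fc, B in setups:
        Tc = 5e6 / fc                                 # coherence time [s], inversely prop. to fc (made-up constant)
        K = Bc * Tc / 2                               # pre-log-maximizing number of terminals
        prelog = (B / 2) * K * (1 - K / (Bc * Tc))    # = B*Bc*Tc/8, identical for all setups
        print(f"fc = {fc / 1e6:6.0f} MHz: K = {K:5.0f}, "
              f"sum pre-log = {prelog:.3g}, per terminal = {prelog / K:.3g}")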

Improving the Cell-Edge Performance

The cellular network that my smartphone connects to normally delivers 10-40 Mbit/s. That is sufficient for video-streaming and other applications that I might use. Unfortunately, I sometimes have poor coverage and then I can barely download emails or make a phone call. That is why I think that providing ubiquitous data coverage is the most important goal for 5G cellular networks. It might also be the most challenging 5G goal, because the area coverage has been an open problem since the first generation of cellular technology.

It is the physics that makes it difficult to provide good coverage. The transmitted signals spread out, and only a tiny fraction of the transmitted power reaches the receive antenna (e.g., one part in a billion). In cellular networks, the received signal power decays roughly as the fourth power of the propagation distance. This results in the following data rate coverage behavior:

Figure 1: Variations in the downlink data rates in an area covered by nine base stations.

This figure considers an area covered by nine base stations, which are located at the middle of the nine peaks. Users that are close to one of the base stations receive the maximum downlink data rate, which in this case is 60 Mbit/s (e.g., a spectral efficiency of 6 bit/s/Hz over a 10 MHz channel). As a user moves away from a base station, the data rate drops rapidly. At the cell edge, where the user is equally distant from multiple base stations, the rate is nearly zero in this simulation. This is because the received signal power is low compared to the receiver noise.

What can be done to improve the coverage?

One possibility is to increase the transmit power. This is mathematically equivalent to densifying the network, so that the area covered by each base station is smaller. The figure below shows what happens if we use 100 times more transmit power:

Figure 2: The transmit powers have been increased 100 times as compared to Figure 1.

There are some visible differences as compared to Figure 1. First, the region around the base station that gives 60 Mbit/s is larger. Second, the data rates at the cell edge are slightly improved, but there are still large variations within the area. However, it is no longer the noise that limits the cell-edge rates—it is the interference from other base stations.

The inter-cell interference remains even if we further increase the transmit power. The reason is that, at the cell edge, the desired signal power and the interfering signal power grow in the same manner. Similar things happen if we densify the network by adding more base stations, as nicely explained in a recent paper by Andrews et al.

Ideally, we would like to increase only the power of the desired signals, while keeping the interference power fixed. This is what transmit precoding from a multi-antenna array can achieve; the transmitted signals from the multiple antennas at the base station add constructively only at the spatial location of the desired user. More precisely, the signal power is proportional to M (the number of antennas), while the interference power caused to other users is independent of M. The following figure shows the data rates when we go from 1 to 100 antennas:

Figure 3: The number of base station antennas has been increased from 1 (as in Figure 1) to 100.

Figure 3 shows that the data rates are increased for all users, but particularly for those at the cell edge. In this simulation, everyone is now guaranteed a minimum data rate of 30 Mbit/s, while 60 Mbit/s is delivered in a large fraction of the coverage area.

In practice, the propagation losses are not only distance-dependent, but also affected by other large-scale effects, such as shadowing. The properties described above nevertheless remain. Coherent precoding from a base station with many antennas can greatly improve the data rates for the cell-edge users, since only the desired signal power (and not the interference power) is increased. Higher transmit power or smaller cells will only lead to an interference-limited regime where the cell-edge performance remains poor. A practical challenge with coherent precoding is that the base station needs to learn the user channels, but reciprocity-based Massive MIMO provides a scalable solution to that. That is why Massive MIMO is the key technology for delivering ubiquitous connectivity in 5G.
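
To see the M-scaling in isolation, here is a small Monte Carlo sketch (a toy model with i.i.d. Rayleigh fading and maximum ratio transmission, not the simulation behind the figures above): the desired signal power grows proportionally to M, while the interference caused at a user with an independent channel does not.

    import numpy as np

    # Toy sketch: with maximum ratio transmission, the desired power is ||h||^2
    # (which grows as M), while the interference at an independent channel g
    # stays around 1 regardless of M.
    rng = np.random.default_rng(1)
    trials = 2000
    for M in [1, 10, 100]:
        sig = intf = 0.0
        for _ in range(trials):
            h = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)  # desired user
            g = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)  # other user
            w = h / np.linalg.norm(h)              # MRT precoder with unit transmit power
            sig += np.abs(h.conj() @ w) ** 2       # desired signal power
            intf += np.abs(g.conj() @ w) ** 2      # interference power at the other user
        print(f"M = {M:3d}: signal ~ {sig / trials:6.1f}, interference ~ {intf / trials:4.2f}")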

More Bandwidth Requires More Power or Antennas

The main selling point of millimeter-wave communications is the abundant bandwidth available in such frequency bands; for example, 2 GHz of bandwidth instead of 20 MHz as in conventional cellular networks. The underlying argument is that the use of much wider bandwidths immediately leads to much higher capacities, in terms of bit/s, but the reality is not that simple.

To look into this, consider a communication system operating over a bandwidth of $B$ Hz. Assuming an additive white Gaussian noise channel, the capacity becomes

     $$ C = B \log_2 \left(1+\frac{P \beta}{N_0 B} \right)$$

where $P$ W is the transmit power, $\beta$ is the channel gain, and $N_0$ W/Hz is the power spectral density of the noise. The term $(P \beta)/(N_0 B)$ inside the logarithm is referred to as the signal-to-noise ratio (SNR).

Since the bandwidth $B$ appears in front of the logarithm, it might seem that the capacity grows linearly with the bandwidth. This is not the case, since the noise term $N_0 B$ in the SNR also grows linearly with the bandwidth. This fact is illustrated by Figure 1 below, where we consider a system that achieves an SNR of 0 dB at a reference bandwidth of 20 MHz. As we increase the bandwidth towards 2 GHz, the capacity grows only modestly. Despite having 100 times more bandwidth, the capacity only improves by $1.44\times$, which is far from the $100\times$ that a linear increase would give.

Figure 1: Capacity as a function of the bandwidth, for a system with an SNR of 0 dB over a reference bandwidth of 20 MHz. The transmit power is fixed.
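
For reference, here is a short sketch reproducing the numbers behind Figure 1 directly from the capacity formula above (the code is mine; the scenario is the one stated in the caption).

    import numpy as np

    # SNR = 0 dB at B_ref = 20 MHz, and the transmit power (hence P*beta/N0)
    # is kept fixed as the bandwidth grows.
    B_ref = 20e6                        # reference bandwidth [Hz]
    P_beta_over_N0 = B_ref              # chosen so that the SNR at B_ref equals 1 (0 dB)

    def capacity(B):
        return B * np.log2(1 + P_beta_over_N0 / B)   # bit/s

    print(capacity(20e6) / 1e6)               # 20 Mbit/s at the reference bandwidth
    print(capacity(2e9) / capacity(20e6))     # ~1.44: only 44% more despite 100x the bandwidth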

The reason for this modest capacity growth is that the SNR decays inversely proportionally to the bandwidth. One can show that

     $$ C \to \frac{P \beta}{N_0}\log_2(e) \quad \textrm{as} \,\, B \to \infty.$$

The convergence to this limit is seen in Figure 1 and is relatively fast since $\log_2(1+x) \approx x \log_2(e)$ for $0 \leq x \leq 1$.
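
Indeed, for large $B$ the SNR $(P \beta)/(N_0 B)$ is small, and applying this approximation to the capacity formula gives

     $$ C = B \log_2\left(1+\frac{P \beta}{N_0 B}\right) \approx B \cdot \frac{P \beta}{N_0 B} \log_2(e) = \frac{P \beta}{N_0}\log_2(e). $$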

To achieve a linear capacity growth, we need to keep the SNR $(P \beta)/(N_0 B)$ fixed as the bandwidth increases. This can be achieved by increasing the transmit power $P$ proportionally to the bandwidth, which entails using $100\times$ more power when operating over a $100\times$ wider bandwidth. This might not be desirable in practice, at least not for battery-powered devices.

An alternative is to use beamforming to improve the channel gain. In a Massive MIMO system, the effective channel gain is $\beta = \beta_1 M$, where $M$ is the number of antennas and $\beta_1$ is the gain of a single-antenna channel. Hence, we can increase the number of antennas proportionally to the bandwidth to keep the SNR fixed.

Figure 2: Capacity as a function of the bandwidth, for a system with an SNR of 0 dB over a reference bandwidth of 20 MHz with one antenna. The transmit power (or the number of antennas) is either fixed or grows proportionally to the bandwidth.

Figure 2 considers the same setup as in Figure 1, but now we also let either the transmit power or the number of antennas grow proportionally to the bandwidth. In both cases, we achieve a capacity that grows proportionally to the bandwidth, as we initially hoped for.
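
A small extension of the earlier sketch illustrates both scaling strategies (again my own illustration of the formula above, not the exact script behind Figure 2).

    import numpy as np

    # Keep the SNR fixed by letting either the transmit power P or the number of
    # antennas M (effective gain beta_1*M) grow proportionally to the bandwidth;
    # the capacity then grows linearly with B, unlike the fixed-power case.
    B_ref = 20e6                          # reference bandwidth [Hz], SNR = 0 dB there

    def capacity(B, gain):                # 'gain' = growth of P*beta relative to the reference
        return B * np.log2(1 + gain * B_ref / B)

    for B in [20e6, 200e6, 2e9]:
        fixed = capacity(B, 1)            # fixed power, single antenna
        scaled = capacity(B, B / B_ref)   # P or M scaled with B, so the SNR stays at 0 dB
        print(f"B = {B / 1e6:6.0f} MHz: fixed = {fixed / 1e6:6.1f} Mbit/s, "
              f"scaled = {scaled / 1e6:7.1f} Mbit/s")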

In conclusion, to make efficient use of more bandwidth, we require more transmit power or more antennas at the transmitter and/or receiver. It is worth noting that these requirements are purely due to the increase in bandwidth. In addition, for any given bandwidth, operation at millimeter-wave frequencies requires much more transmit power and/or more antennas (e.g., additional constant-gain antennas or one constant-aperture antenna) just to achieve the same SNR as in a system operating at conventional frequencies below 5 GHz.

Channel Hardening Makes Fading Channels Behave as Deterministic

One of the main impairments in wireless communications is small-scale channel fading. This refers to random fluctuations in the channel gain, which are caused by microscopic changes in the propagation environments. The fluctuations make the channel unreliable, since occasionally the channel gain is very small and the transmitted data is then received in error.

The diversity achieved by sending a signal over multiple channels with independent realizations is key to combating small-scale fading. Spatial diversity is particularly attractive, since it can be obtained by simply having multiple antennas at the transmitter or the receiver. Suppose the probability of a bad channel gain realization is p. If we have M antennas with independent channel gains, then the risk that all of them are bad is p^M. For example, with p=0.1, there is a 10% risk of getting a bad channel in a single-antenna system and a 0.000001% risk in an 8-antenna system. This shows that just a few antennas can be sufficient to greatly improve reliability.
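
These numbers are quick to verify:

    # Probability that all M independent channels are bad simultaneously,
    # when each one is bad with probability p.
    p = 0.1
    for M in [1, 2, 4, 8]:
        print(f"M = {M}: P(all bad) = {p ** M:.8f} = {100 * p ** M:.6f} %")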

In Massive MIMO systems, with a “massive” number of antennas at the base station, the spatial diversity also leads to something called “channel hardening”. This terminology was already used in a paper from 2004:

B. M. Hochwald, T. L. Marzetta, and V. Tarokh, “Multiple-antenna channel hardening and its implications for rate feedback and scheduling,” IEEE Transactions on Information Theory, vol. 50, no. 9, pp. 1893–1909, 2004.

In short, channel hardening means that a fading channel behaves as if it were a non-fading channel. The randomness is still there, but its impact on the communication is negligible. In the 2004 paper, the hardening is measured by dividing the instantaneous supported data rate by the fading-averaged data rate. If the relative fluctuations are small, then the channel has hardened.

Since Massive MIMO systems contain random interference, it is usually the hardening of the channel that the desired signal propagates over that is studied. If the channel is described by a random M-dimensional vector h, then the ratio ||h||^2/E{||h||^2} between the instantaneous channel gain and its average is considered. If the fluctuations of the ratio are small, then there is channel hardening. With an independent Rayleigh fading channel, the variance of the ratio reduces with the number of antennas as 1/M. The intuition is that the channel fluctuations average out over the antennas. A detailed analysis is available in a recent paper.

The variance of ||h||^2/E{||h||^2} decays as 1/M for independent Rayleigh fading channels.

The figure above shows how the variance of ||h||^2/E{||h||^2} decays with the number of antennas. The convergence towards zero is gradual and so is the channel hardening effect. I personally think that you need at least M=50 to truly benefit from channel hardening.
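
Here is a minimal Monte Carlo sketch (assuming i.i.d. Rayleigh fading, as in the figure) that reproduces the 1/M behavior:

    import numpy as np

    # Estimate the variance of ||h||^2 / E{||h||^2} for i.i.d. Rayleigh fading
    # and compare it with the theoretical 1/M decay.
    rng = np.random.default_rng(2)
    trials = 20000
    for M in [1, 10, 50, 100]:
        h = (rng.standard_normal((trials, M)) + 1j * rng.standard_normal((trials, M))) / np.sqrt(2)
        gain = np.sum(np.abs(h) ** 2, axis=1)       # ||h||^2 for each realization
        ratio = gain / gain.mean()                  # ||h||^2 / E{||h||^2}
        print(f"M = {M:3d}: variance = {ratio.var():.4f}, 1/M = {1 / M:.4f}")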

Channel hardening has several practical implications. One is the improved reliability of having a nearly deterministic channel, which results in lower latency. Another is the lack of scheduling diversity; that is, one cannot schedule users when their ||h||^2 are unusually large, since the fluctuations are small. There is also little to gain from estimating the current realization of ||h||^2, since it is relatively close to its average value. This can alleviate the need for downlink pilots in Massive MIMO.