Approaches to Mitigate Pilot Contamination


Many researchers have analyzed pilot contamination over the six years that have passed since Marzetta uncovered its importance in Massive MIMO systems. We now have quite a good understanding of how to mitigate pilot contamination. There is a plethora of different approaches, many of which have complementary benefits. If pilot contamination is not mitigated, it will both reduce the array gain and create coherent interference. Some approaches mitigate the pilot interference in the channel estimation phase, while others combat the coherent interference caused by pilot contamination. In this post, I will try to categorize the approaches and point to some key references.

Interference-rejecting precoding and combining

Pilot contamination makes the estimate of a desired channel correlated with the channels from pilot-sharing users in other cells. When these channel estimates are used for receive combining or transmit precoding, coherent interference typically arises. This is particularly the case if maximum ratio processing is used, because it ignores the interference. If multi-cell MMSE processing is used instead, the coherent interference is rejected in the spatial domain. In particular, recent work from Björnson et al. (see also this related paper) has shown that there is no asymptotic rate limit when using this approach, if there is just a tiny amount of spatial correlation in the channels.
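The coherent interference under maximum ratio processing can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not taken from the post: i.i.d. Rayleigh fading, a single pilot-sharing interferer with a hypothetical relative channel gain `beta`, and a channel estimate that is simply the received pilot signal. The function name and all parameter values are made up for illustration.

```python
import numpy as np

def cn(rng, size, variance=1.0):
    """Draw circularly symmetric complex Gaussian samples with the given variance."""
    return np.sqrt(variance / 2) * (rng.standard_normal(size)
                                    + 1j * rng.standard_normal(size))

def mr_interference_ratio(num_antennas, contaminated, rng, trials=200,
                          beta=0.1, noise_var=0.01):
    """Average interference-to-signal power ratio after maximum ratio combining.

    beta and noise_var are hypothetical values chosen for illustration.
    """
    ratios = np.empty(trials)
    for t in range(trials):
        h = cn(rng, num_antennas)             # desired user's channel
        g = cn(rng, num_antennas, beta)       # pilot-sharing user in another cell
        n = cn(rng, num_antennas, noise_var)  # estimation noise
        # With pilot contamination, the estimate also contains the interferer's channel.
        estimate = h + n + (g if contaminated else 0)
        # np.vdot conjugates its first argument, so this is estimate^H g and estimate^H h.
        ratios[t] = abs(np.vdot(estimate, g)) ** 2 / abs(np.vdot(estimate, h)) ** 2
    return ratios.mean()

rng = np.random.default_rng(0)
for m in (10, 100, 1000):
    print(m,
          mr_interference_ratio(m, True, rng),
          mr_interference_ratio(m, False, rng))
```

Under these assumptions, the ratio without contamination should keep shrinking as antennas are added, while the contaminated ratio should level off near `beta ** 2`. That floor is the coherent interference that maximum ratio processing cannot remove.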

Data-aided channel estimation

Another approach is to “decontaminate” the channel estimates from pilot contamination, by using the pilot sequence and the uplink data for joint channel estimation. This has the potential of both improving the estimation quality (leading to a stronger desired signal) and reducing the coherent interference. Ideally, if the data is known, data-aided channel estimation increases the length of the pilot sequences to the length of the uplink transmission block. Since the data is unknown to the receiver, semi-blind estimation techniques are needed to obtain the channel estimates. Ngo et al. and Müller et al. did early work on pilot decontamination for Massive MIMO. Recent work has proved that one can fully decontaminate the estimates, as the length of the uplink block grows large, but it remains to find the most efficient semi-blind decontamination approach for practical block lengths.

Pilot assignment and dimensioning

Which subset of users shares a pilot sequence makes a large difference, since users with large pathloss differences and different spatial channel correlation cause less contamination to each other. Recall that higher estimation quality both increases the gain of the desired signal and reduces the coherent interference. Increasing the number of orthogonal pilot sequences is a straightforward way to decrease the contamination, since each pilot can be assigned to fewer users in the network. The price to pay is a larger pilot overhead, but it seems that a reuse factor of 3 or 4 is often suitable from a sum rate perspective in cellular networks. Joint spatial division and multiplexing (JSDM) provides a basic methodology to take spatial correlation into account in the pilot reuse patterns.

A cellular network with different pilot reuse factors: Reuse 1 (left), Reuse 3 (middle), Reuse 4 (right). The cells with the same color use the same subset of pilots.
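The overhead side of this tradeoff is easy to quantify. The sketch below assumes a common model that is not spelled out in the post: each cell serves the same number of users, the pilot length equals the reuse factor times the number of users per cell, and the rest of each coherence block carries data. The function name and the numbers (10 users per cell, 200 symbols per coherence block) are hypothetical.

```python
def data_fraction(users_per_cell, reuse_factor, coherence_block):
    """Fraction of each coherence block left for data after the pilots.

    Assumes the pilot length is reuse_factor * users_per_cell symbols.
    """
    pilot_length = reuse_factor * users_per_cell
    if pilot_length > coherence_block:
        raise ValueError("pilots do not fit in the coherence block")
    return 1 - pilot_length / coherence_block

# Hypothetical example: 10 users per cell, 200 symbols per coherence block.
for reuse in (1, 3, 4):
    print(reuse, data_fraction(10, reuse, 200))
```

A larger reuse factor pays off only when the reduction in contamination outweighs this loss in data symbols, which depends on how long the coherence block is.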

Alternatively, pilot sequences can be superimposed on the data sequences, which gives as many orthogonal pilot sequences as the length of the uplink block and thereby reduces the pilot contamination. This approach also removes the pilot overhead, but it comes at the cost of causing interference between pilot and data transmissions. It is therefore important to assign the right fraction of power to pilots and data. A hybrid pilot solution, where some users have superimposed pilots and some have conventional pilots, may bring the best of both worlds.

If two cells use the same subset of pilots, the exact pilot-user assignment can make a large difference. Cell-center users are generally less sensitive to pilot contamination than cell-edge users, but finding the best assignment is a hard combinatorial problem. Heuristic algorithms exist, as does an optimization framework that can be used to evaluate such algorithms.

Multi-cell cooperation

A combination of network MIMO and macro diversity can be utilized to turn the coherent interference into desired signals. This approach is called pilot contamination precoding by Ashikhmin et al. and can be applied in both uplink and downlink. In the uplink, the base stations receive different linear combinations of the user signals. After maximum ratio combining, the coefficients in the linear combinations approach deterministic numbers as the number of antennas grows large. These numbers are only non-zero for the pilot-sharing users. Since the macro diversity naturally creates different linear combinations, the base stations can jointly solve a linear system of equations to obtain the transmitted signals. In the downlink, all signals are sent from all base stations and are precoded in such a way that the coherent interference sent from different base stations cancels out. While this is a beautiful approach for mitigating the coherent interference, it relies heavily on channel hardening, favorable propagation, and i.i.d. Rayleigh fading. It remains to be shown if the approach can provide performance gains under more practical conditions.
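The uplink idea can be sketched numerically. The code below is an idealized illustration rather than the method from the cited paper: it assumes i.i.d. Rayleigh fading, one pilot-sharing user per cell, noise-free pilot and data transmission, and a hypothetical matrix `beta` of large-scale fading coefficients that is known to the network. The function name and all values are made up.

```python
import numpy as np

def pcp_uplink(num_antennas, beta, symbols, rng):
    """Return the per-cell MR outputs and the jointly decoded symbols.

    beta[j, l] is the (assumed known) large-scale gain from the user in
    cell l to base station j; symbols[l] is the symbol sent by that user.
    """
    num_cells = beta.shape[0]
    shape = (num_cells, num_cells, num_antennas)
    # h[j, l] is the channel vector from the user in cell l to base station j.
    h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    h *= np.sqrt(beta)[:, :, None]
    # Contaminated estimate: the pilot-sharing users' channels add up.
    estimates = h.sum(axis=1)
    # Received data signal at each base station (noise-free in this sketch).
    received = np.einsum('jlm,l->jm', h, symbols)
    # Maximum ratio combining with the contaminated estimate, normalized by M.
    mr_outputs = np.einsum('jm,jm->j', estimates.conj(), received) / num_antennas
    # For many antennas, mr_outputs is close to beta @ symbols, so the
    # base stations can jointly invert the linear system.
    decoded = np.linalg.solve(beta, mr_outputs)
    return mr_outputs, decoded

rng = np.random.default_rng(0)
beta = np.array([[1.0, 0.3, 0.2],
                 [0.25, 1.0, 0.3],
                 [0.2, 0.35, 1.0]])          # hypothetical gains
symbols = np.array([1 + 1j, -1 + 1j, 1 - 1j]) / np.sqrt(2)
mr_outputs, decoded = pcp_uplink(5000, beta, symbols, rng)
print("sent:      ", symbols)
print("MR only:   ", mr_outputs)
print("after PCP: ", decoded)
```

The MR outputs are mixtures of all three symbols, while the jointly decoded values should approach the transmitted symbols as the number of antennas increases. The sketch relies on exactly the idealized conditions mentioned above, so it says nothing about performance in practical channels.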

14 thoughts on “Approaches to Mitigate Pilot Contamination”

    1. I cannot answer implementation questions related to papers that I haven’t been involved in. I recommend that you contact the authors of the paper!

  1. I am interested in the textbook “Fundamentals of Massive MIMO” by Marzetta, Larsson, Yang, and Ngo.

    Can I get the e-book free of cost from somewhere?

  2. I am very interested in mitigating pilot contamination in massive MIMO systems. Do you have any simulation code for pilot contamination mitigation in massive MIMO?

  3. Hello, Professor Bjornson.
    Thank you especially for your good blog and for the time you spend answering the questions on it.
    I have two questions:
    First, what is the condition for two random vectors to be uncorrelated? (for a finite distribution, for example complex normal)
    Is it sufficient for each of them to be spatially uncorrelated?
    Second, is there a difference between orthogonality and uncorrelation for two random vectors?

    1. These questions are more related to probability theory than MIMO communications, so I recommend you read a textbook on that topic.

      First question: Two random vectors are uncorrelated if the cross-covariance matrix is zero: https://en.wikipedia.org/wiki/Cross-covariance_matrix
      It doesn’t matter if the individual vectors have correlated elements or not.

      Second question: Two vectors are orthogonal if the dot product is zero: https://en.wikipedia.org/wiki/Dot_product
      This has nothing to do with whether the vectors are uncorrelated.

  4. Hello, Professor Bjornson.
    As always, thank you for the time you spend answering the questions.
    I want to know your opinion about this statement.
    Am I right?
    If we use a massive MIMO system with many BS antennas and the users are sufficiently separated from each other, then because of the favourable propagation between users we can consider the users' channels uncorrelated, which gives weak interference and lets us use linear processing to eliminate the interference efficiently.
    More precisely, I want to know: is favourable propagation a dominant factor in mitigating interference in massive MIMO systems? And to what extent does it play a dominant role?

    1. If you change “uncorrelated” to “nearly orthogonal”, the statement will be correct. The channels of different users are always modeled as being statistically uncorrelated, but what matters for favorable propagation is whether the channel vector realizations are likely to be almost orthogonal.

      Favorable propagation is a preferable property but it is not necessary. We will never obtain perfect favorable propagation in practice, which is why regularized zero-forcing and similar schemes are used to mitigate the remaining interference.

  5. Hello, Professor Bjornson.
    Thank you for your answer.
    According to your answer, the channel vectors of different users are assumed to be statistically uncorrelated. Could you point me to a reference that explains this in more detail?

    I think that if we assume the channel vectors of different users are statistically uncorrelated, then we can conclude that the channels are statistically orthogonal. If two channels are statistically uncorrelated, the cross-covariance matrix between them is zero, and since the inner product of the two channels is the trace of this matrix, statistical uncorrelation results in statistical orthogonality between the two channels. Note that the channels are zero mean.

    So, if you consider orthogonality statistically for zero-mean channels, then uncorrelation results in orthogonality.
    But if you consider orthogonality non-statistically, I cannot picture the concept because of the randomness of the channels.
    Note that, according to the law of large numbers, favourable propagation says that the channels between users will be statistically orthogonal when the number of BS antennas is infinite.

    Finally, I should mention that these are just my thoughts and maybe I am wrong.
    I would be grateful if you could correct me if I am wrong.

    1. These things are explained in Section 2 of my book.

      Orthogonality is defined for a single realization of the channel while uncorrelation is defined on the average of many realizations. Uncorrelation does not imply that every realization gives orthogonality (or even that any realization does).

  6. Hello, Professor Bjornson.
    I hope you are safe and healthy, and thank you for the time that you spend answering the questions.
    In Section 2 of your book, when explaining favorable propagation, you distinguish the channel direction from the channel response. According to the statement in the book, favorable propagation makes the channel directions orthogonal, not the channel responses.
    A question that comes to my mind is that, mathematically, if two vectors have orthogonal directions then those vectors should be orthogonal.
    Is there not a conflict between your statement and the statement above? Could you explain your statement in more detail?

    1. There is an important technical difference between the two things. Two things happen with the channel responses when the number of antennas grows. The directions become more different and their norms become larger.

      If you compute the inner product of the channel responses, it will go to infinity.
      If you compute the inner product of the channel directions, it will go to zero.

      You can view it like this: The interference power that leaks between the users is increasing in absolute terms, but in relative terms (compared to the desired signal) it goes to zero. And it is the latter that matters when measuring performance.
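The two scalings in this reply can be checked numerically. The following sketch assumes i.i.d. Rayleigh fading, which the reply does not specify, and uses a made-up function name. It compares the root-mean-square magnitude of the inner product of two independent channel responses with that of their normalized directions.

```python
import numpy as np

def inner_product_rms(num_antennas, rng, trials=500):
    """RMS magnitude of the raw and the normalized inner product of two channels."""
    shape = (trials, num_antennas)
    h1 = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    h2 = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    # Inner product of the channel responses, one value per realization.
    raw = np.sum(h1.conj() * h2, axis=1)
    # Inner product of the channel directions (unit-norm vectors).
    normalized = raw / (np.linalg.norm(h1, axis=1) * np.linalg.norm(h2, axis=1))

    def rms(values):
        return np.sqrt(np.mean(np.abs(values) ** 2))

    return rms(raw), rms(normalized)

rng = np.random.default_rng(0)
for m in (10, 100, 1000):
    print(m, *inner_product_rms(m, rng))
```

In this model the raw inner product should grow in magnitude as antennas are added while the normalized one shrinks, which matches the absolute versus relative view of the leaked interference.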
