How to Normalize a Precoding Matrix?

The transmitted signal \mathbf{x} from an M-antenna base station can consist of multiple information signals that are transmitted using different precoding (e.g., different spatial directivity). When there are K unit-power data signals s_1,\ldots,s_K intended for K different users, the transmitted signal can be expressed as

(1)   \begin{equation*}\mathbf{x} = \sum_{i=1}^{K} \mathbf{w}_i s_i,\end{equation*}

where \mathbf{w}_1,\ldots,\mathbf{w}_K are the M-dimensional precoding vectors assigned to the different users. The direction of the vector \mathbf{w}_i determines the spatial directivity of the signal s_i, while the squared norm \|\mathbf{w}_i\|^2 determines the associated transmit power. Massive MIMO usually means that M\gg K.
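As a small numerical illustration of (1), the NumPy sketch below builds the transmitted signal from a set of precoding vectors and reads off the per-user transmit power from their squared norms. The sizes M = 8 and K = 3, the random precoding vectors, and the QPSK symbols are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K = 8, 3  # assumed number of antennas and users

# Hypothetical precoding vectors w_1, ..., w_K, stored as the columns of an array.
w = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))

# Unit-power data symbols s_1, ..., s_K (QPSK is an assumption).
s = (rng.choice([-1, 1], size=K) + 1j * rng.choice([-1, 1], size=K)) / np.sqrt(2)

# Equation (1): x = sum_i w_i s_i.
x = sum(w[:, i] * s[i] for i in range(K))

# The squared norm of each precoding vector is the power assigned to that signal.
power_per_user = np.linalg.norm(w, axis=0) ** 2
print(power_per_user, power_per_user.sum())
```

Summing the vectors one by one gives the same result as the matrix-vector product `w @ s`, which is the more convenient form in practice.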

When selecting the precoding vectors, we need to make sure that we are not using too much transmit power. If the maximum power is P and we define the M \times K precoding matrix

(2)   \begin{equation*} \mathbf{W} = [\mathbf{w}_1 \, \, \ldots \,\, \mathbf{w}_K],\end{equation*}

then we need to make sure that the squared Frobenius norm of \mathbf{W} equals the maximum transmit power:

(3)   \begin{equation*} \| \mathbf{W} \|_F^2 = P.\end{equation*}

In the Massive MIMO literature, there are two popular methods to achieve that: matrix normalization and vector normalization. The papers [Ref1], [Ref2] consider both methods, while other papers only consider one of the methods. The main idea is to start from an arbitrarily selected precoding matrix  \mathbf{F} = [\mathbf{f}_1 \, \, \ldots \,\, \mathbf{f}_K] and then adapt it to satisfy the power constraint in (3).

Matrix normalization: In this case, we take the matrix \mathbf{F} and scale all the entries with the same number, which is selected to satisfy (3). More precisely, we select

(4)   \begin{equation*}\mathbf{W} = \frac{\sqrt{P}}{\|\mathbf{F} \|_F} \mathbf{F}.\end{equation*}
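A minimal sketch of (4) in NumPy could look as follows. The function name `matrix_normalize` and the random example matrix are hypothetical and only serve to illustrate the scaling.

```python
import numpy as np

def matrix_normalize(F, P):
    """Scale every entry of F by one common factor so that ||W||_F^2 = P, as in (4)."""
    return np.sqrt(P) / np.linalg.norm(F, "fro") * F

rng = np.random.default_rng(1)
F = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))  # arbitrary example
P = 10.0

W = matrix_normalize(F, P)
print(np.linalg.norm(W, "fro") ** 2)   # total power, equal to P up to rounding
print(np.linalg.norm(W, axis=0) ** 2)  # per-user powers, in the proportions set by F
```

Since all columns are scaled by the same factor, the ratio between the powers of any two users is the same in \mathbf{W} as in \mathbf{F}.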

Vector normalization: In this case, we first normalize each column in \mathbf{F} to have unit norm and then scale them all with \sqrt{P/K} to satisfy (3). More precisely, we select

(5)   \begin{equation*}\mathbf{W} = \sqrt{\frac{P}{K}} \left[ \frac{\mathbf{f}_1}{\| \mathbf{f}_1\|} \, \, \ldots \,\, \frac{\mathbf{f}_K}{\| \mathbf{f}_K\|} \right].\end{equation*}
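For comparison, (5) can be sketched in NumPy as below. The function name `vector_normalize` and the example matrix are again hypothetical.

```python
import numpy as np

def vector_normalize(F, P):
    """Give every column unit norm, then scale all columns by sqrt(P/K), as in (5)."""
    K = F.shape[1]
    column_norms = np.linalg.norm(F, axis=0)   # ||f_1||, ..., ||f_K||
    return np.sqrt(P / K) * F / column_norms   # division broadcasts over the rows

rng = np.random.default_rng(1)
F = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))  # arbitrary example
P = 10.0

W = vector_normalize(F, P)
print(np.linalg.norm(W, axis=0) ** 2)  # every user gets P/K, up to rounding
```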

Which of the two normalizations should be used?

This is a question that I receive now and then, so I wrote this blog post to answer it once and for all. My answer: neither of them!

The problem with matrix normalization is that the method that was used to select \mathbf{F} will determine how the transmit power is allocated between the different signals/users. Hence, we are not in control of the power allocation and we cannot fairly compare different precoding schemes. For example, maximum-ratio (MR) allocates more power to users with strong channels than to users with weak channels, while zero-forcing (ZF) does the opposite. Hence, if one tries to compare MR and ZF under matrix normalization, the different power allocations will strongly influence the results.
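The effect can be checked numerically with a rough sketch like the one below. Everything about the setup is an assumption for illustration: i.i.d. Rayleigh fading channels, M = 64 antennas, K = 3 users with channel gains 1, 0.1, and 0.01, MR chosen as \mathbf{F} = \mathbf{H}, and ZF chosen as \mathbf{F} = \mathbf{H}(\mathbf{H}^H \mathbf{H})^{-1}, where the columns of \mathbf{H} are the users' channel vectors.

```python
import numpy as np

rng = np.random.default_rng(2)
M, K, P = 64, 3, 1.0
beta = np.array([1.0, 0.1, 0.01])  # assumed channel gains: user 1 strongest, user 3 weakest

# Columns of H are the channel vectors (i.i.d. Rayleigh fading is an assumption).
H = np.sqrt(beta / 2) * (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K)))

F_mr = H                                  # maximum ratio: f_i = h_i
F_zf = H @ np.linalg.inv(H.conj().T @ H)  # zero forcing: H^H F = I

def matrix_normalize(F, P):
    """Matrix normalization as in (4)."""
    return np.sqrt(P) / np.linalg.norm(F, "fro") * F

def power_shares(W):
    """Fraction of the total transmit power that each user receives."""
    p = np.linalg.norm(W, axis=0) ** 2
    return p / p.sum()

shares_mr = power_shares(matrix_normalize(F_mr, P))
shares_zf = power_shares(matrix_normalize(F_zf, P))
print("MR power shares:", shares_mr)  # largest share goes to the strongest user
print("ZF power shares:", shares_zf)  # largest share goes to the weakest user
```

In this setup, the two schemes end up with nearly opposite power allocations, so any performance gap between them reflects the allocation as much as the precoding directions.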

This issue is resolved by vector normalization. However, the problem with vector normalization is that all users are assigned the same amount of power, which is undesirable if some users have strong channels and others have weak channels. One should always make a conscious decision when it comes to power allocation between users.

What we should do instead is to select the precoding matrix as

(6)   \begin{equation*}\mathbf{W} =  \left[ \sqrt{p_1} \frac{\mathbf{f}_1}{\| \mathbf{f}_1\|} \, \, \ldots \,\, \sqrt{p_K} \frac{\mathbf{f}_K}{\| \mathbf{f}_K\|} \right],\end{equation*}

where p_1,\ldots,p_K are variables representing the power assigned to each of the users. These should be carefully selected to optimize some performance goal of the network, such as the sum rate, proportional fairness, or max-min fairness. In any case, the power allocation must be selected to satisfy the constraint

(7)   \begin{equation*} \| \mathbf{W} \|_F^2 =  \sum_{i=1}^{K} p_i = P.\end{equation*}

There are plenty of optimization algorithms that can be used for this purpose. You can find further details, examples, and references in Section 7.1 of my book Massive MIMO networks.
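To make (6) and (7) concrete, here is a minimal NumPy sketch that separates the precoding directions from the power allocation. The function name `precode_with_power` is hypothetical, and the power values are picked by hand for illustration; in practice they would come from one of the optimization algorithms mentioned above.

```python
import numpy as np

def precode_with_power(F, p):
    """Build W as in (6): unit-norm directions from F, with column i scaled by sqrt(p_i)."""
    p = np.asarray(p, dtype=float)
    directions = F / np.linalg.norm(F, axis=0)  # f_i / ||f_i||
    return directions * np.sqrt(p)              # broadcasts sqrt(p_i) over column i

rng = np.random.default_rng(3)
F = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))  # arbitrary example

P = 10.0
p = np.array([5.0, 3.0, 2.0])  # hand-picked allocation (an assumption), sums to P

W = precode_with_power(F, p)
print(np.linalg.norm(W, axis=0) ** 2)  # per-user powers, equal to p up to rounding
print(np.linalg.norm(W, "fro") ** 2)   # total power, equal to P as required by (7)
```

Vector normalization is the special case where every entry of `p` equals P/K.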

14 thoughts on “How to Normalize a Precoding Matrix?”

  1. Hey 🙂
    Thanks for the post.

    In my paper “An Improved Dropping Algorithm for Line-of-Sight Massive MIMO With Max-Min Power Control”, Fig. 1., we have a general linear precoding model with any desired power allocation (in our paper, we focused on max-min power control though).

    I think a figure may reflect much better what you have explained at the end.

    In addition, I think the most simplified model and yet accurate to explain linear precoding is to have two blocks as in Fig. 1 of my paper.

    First block, is the power allocation coefficients, (diag(p) in my paper), and the second block is the precoding matrix (U in my paper), where the precoding matrix has unit-norm column vectors (that means, the precoding matrix specifies the direction of precoding for each user).

    In this way, we control the direction of precoding for each user with u_i and we control the power with diag(p) elements.

    So, we have a power allocation coefficients first, and then, we have a precoding matrix (unit norm column vector u_i).


  2. Hello Dr.Bjornson
    Specially thank you for your good posts.
    Are the vectors Wi orthonormal? Is there any constraint for Wi to be orthonormal?

    1. The vectors should preferably be normalized, but they don’t have to be orthogonal. In fact, the vectors are normally not orthogonal; not even zero-forcing uses orthogonal precoding vectors. (The zero-forcing vectors are orthogonal to the channel vectors of the other users but not to each other.)

  3. How should we normalize the power of hybrid precoding with respect to digital precoding, so that the transmitted power remains the same in both cases?

    1. The important thing is that you are in control of how much power is assigned to each user, so that this is not determined by the normalization but by how you actively decide to allocate power between the users.

  4. Hello,
    Is there a way to exploit spatial correlation matrices to design a precoding matrix that can improve the capacity in case of MIMO?

    1. Yes! This is important, and Chapters 3 and 4 of my book Massive MIMO networks explain how to do this. It is basically in the channel estimation phase that one utilizes the spatial correlation matrices to improve the estimation quality and mitigate pilot contamination.

      There is also a body of research that uses only the spatial correlation matrices for precoding, without sending any pilots. One paper on this is “Cooperative Multicell Precoding: Rate Region Characterization and Distributed Strategies with Instantaneous and Statistical CSI”, but you can find many other (older) papers on that topic as well, for example in the reference list of that paper.

  5. Hello Dr. Björnson,

    What is the relationship and difference between zero-forcing and block diagonalization precoding methods?


  6. Hi, Professor Bjornson.
    As you have explained in your book, the SE is dependent on the coding/decoding scheme but we focus on the maximum SE.
    I want to ask when you derive SE for an array and use MR combining in section 2 of your book, you derive maximum SE?
    May using other combining scheme result in better SE?
    Can we prove that the MR combining result in maximum SE?
