4.6 Change of Basis
Adapting Coordinate Systems for Problem Solving
A suitable basis for one problem may not be suitable for another. Changing bases is analogous to changing coordinate axes in \(R^{2}\) and \(R^{3}\).
If \(S = \{\mathbf{v}_{1}, \mathbf{v}_{2}, \ldots , \mathbf{v}_{n}\}\) is a basis for \(V\), and \((\mathbf{v})_{S} = (c_{1}, c_{2}, \ldots , c_{n})\) is the coordinate vector of \(\mathbf{v}\) relative to \(S\), then the mapping: \[ \mathbf{v} \to (\mathbf{v})_{S} \tag{1} \] creates a one-to-one correspondence between vectors in \(V\) and vectors in \(R^{n}\). We call this the coordinate map relative to \(S\) from \(V\) to \(R^{n}\).
For convenience, coordinate vectors are often expressed in matrix form: \[ [\mathbf{v}]_{S} = \left[ \begin{array}{c}c_{1} \\ c_{2} \\ \vdots \\ c_{n} \end{array} \right] \tag{2} \] This matrix notation emphasizes its role in linear transformations.
Figure 4.6.1: Coordinate map from \(V\) to \(R^n\).
A common problem arises when we have a vector \(\mathbf{v}\) and want to express its coordinates in a different basis.
The Change of Basis Problem
If \(\mathbf{v}\) is a vector in a finite-dimensional vector space \(V\), and if we change the basis for \(V\) from a basis \(B\) to a basis \(B'\), how are the coordinate vectors \([\mathbf{v}]_{B}\) and \([\mathbf{v}]_{B'}\) related?
Remark: We will refer to \(B\) as the “old basis” and \(B'\) as the “new basis.” Our objective is to find a relationship between the old and new coordinates of a fixed vector \(\mathbf{v}\) in \(V\).
Let’s derive the relationship for a 2-dimensional space first.
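Let \(B = \{\mathbf{u}_{1}, \mathbf{u}_{2}\}\) be the old basis and \(B' = \{\mathbf{u}_{1}', \mathbf{u}_{2}'\}\) be the new basis.
Express the new basis vectors in terms of the old basis: Suppose the coordinate vectors of the new basis vectors relative to the old basis are \[ [\mathbf{u}_{1}^{\prime}]_{B} = \begin{bmatrix} a \\ b \end{bmatrix} \quad \text{and} \quad [\mathbf{u}_{2}^{\prime}]_{B} = \begin{bmatrix} c \\ d \end{bmatrix} \tag{3} \] that is, \[ \mathbf{u}_{1}^{\prime} = a\mathbf{u}_{1} + b\mathbf{u}_{2}, \qquad \mathbf{u}_{2}^{\prime} = c\mathbf{u}_{1} + d\mathbf{u}_{2} \tag{4} \]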
Express \(\mathbf{v}\) in terms of the new basis: Let \([\mathbf{v}]_{B'} = \begin{bmatrix} k_{1} \\ k_{2} \end{bmatrix}\) be the new coordinate vector, so: \[ \mathbf{v} = k_{1}\mathbf{u}_{1}^{\prime} + k_{2}\mathbf{u}_{2}^{\prime} \tag{6} \]
Substitute and find old coordinates: Substitute (4) into (6) to express \(\mathbf{v}\) in terms of the old basis: \[ \mathbf{v} = k_{1}(a\mathbf{u}_{1} + b\mathbf{u}_{2}) + k_{2}(c\mathbf{u}_{1} + d\mathbf{u}_{2}) \] Rearranging terms: \[ \mathbf{v} = (k_{1}a + k_{2}c)\mathbf{u}_{1} + (k_{1}b + k_{2}d)\mathbf{u}_{2} \] Thus, the old coordinate vector for \(\mathbf{v}\) is: \[ [\mathbf{v}]_{B} = \begin{bmatrix} k_{1}a + k_{2}c \\ k_{1}b + k_{2}d \end{bmatrix} \]
The relationship found can be expressed as a matrix multiplication.
Using the new coordinate vector \([\mathbf{v}]_{B'} = \begin{bmatrix} k_{1} \\ k_{2} \end{bmatrix}\), the old coordinate vector can be written as: \[ [\mathbf{v}]_{B} = \left[ \begin{array}{ll}a & c \\ b & d \end{array} \right]\left[ \begin{array}{l}k_{1} \\ k_{2} \end{array} \right] = \left[ \begin{array}{ll}a & c \\ b & d \end{array} \right][\mathbf{v}]_{B^{\prime}} \]
The matrix \(P = \left[ \begin{array}{ll}a & c \\ b & d \end{array} \right]\) is called the transition matrix. Notice that the columns of \(P\) are \([\mathbf{u}_{1}']_{B}\) and \([\mathbf{u}_{2}']_{B}\).
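The derivation can be checked numerically. The sketch below is only an illustration: the old basis, the coefficients \(a, b, c, d\), the new coordinates \(k_1, k_2\), and the helper name `comb` are all assumptions chosen for the example. It confirms that the old coordinates predicted by \(P[\mathbf{v}]_{B'}\) rebuild the same vector \(\mathbf{v}\).

```python
from fractions import Fraction as F

# Illustrative old basis for R^2 (an assumption; any basis would do).
u1, u2 = (F(1), F(1)), (F(1), F(-1))

# Illustrative coefficients: u1' = a*u1 + b*u2 and u2' = c*u1 + d*u2.
# Here ad - bc = 5, which is nonzero, so {u1', u2'} is also a basis.
a, b, c, d = F(2), F(1), F(1), F(3)


def comb(s, x, t, y):
    """Return the linear combination s*x + t*y of two vectors in R^2."""
    return (s * x[0] + t * y[0], s * x[1] + t * y[1])


u1_new = comb(a, u1, b, u2)
u2_new = comb(c, u1, d, u2)

# Illustrative new coordinates of v, so v = k1*u1' + k2*u2'.
k1, k2 = F(4), F(-2)
v = comb(k1, u1_new, k2, u2_new)

# Old coordinates predicted by [v]_B = P [v]_B' with P = [[a, c], [b, d]].
old = (a * k1 + c * k2, b * k1 + d * k2)

# Rebuilding v from the old basis with those coordinates gives the same vector.
v_from_old = comb(old[0], u1, old[1], u2)
print(v == v_from_old)  # True
```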
This generalizes to \(n\)-dimensional spaces.
Solution of the Change of Basis Problem
If we change the basis for a vector space \(V\) from an old basis \(B = \{\mathbf{u}_{1}, \ldots , \mathbf{u}_{n}\}\) to a new basis \(B^{\prime} = \{\mathbf{u}_{1}^{\prime}, \ldots , \mathbf{u}_{n}^{\prime}\}\), then for each vector \(\mathbf{v}\) in \(V\), the old coordinate vector \([\mathbf{v}]_{B}\) is related to the new coordinate vector \([\mathbf{v}]_{B^{\prime}}\) by the equation: \[ [\mathbf{v}]_{B} = P[\mathbf{v}]_{B^{\prime}} \tag{7} \] where the columns of \(P\) are the coordinate vectors of the new basis vectors relative to the old basis; that is, the column vectors of \(P\) are: \[ [\mathbf{u}_{1}^{\prime}]_{B}, \quad [\mathbf{u}_{2}^{\prime}]_{B}, \ldots , \quad [\mathbf{u}_{n}^{\prime}]_{B} \tag{8} \]
We often use specific notation to clarify the direction of the transition.
The matrix \(P\) in Equation (7) is called the transition matrix from \(B^{\prime}\) to \(B\), denoted by \(P_{B^{\prime} \rightarrow B}\). It can be expressed as: \[ P_{B^{\prime} \rightarrow B} = \left[ [\mathbf{u}_{1}^{\prime}]_{B} \mid [\mathbf{u}_{2}^{\prime}]_{B} \mid \dots \mid [\mathbf{u}_{n}^{\prime}]_{B} \right] \tag{9} \]
Similarly, the transition matrix from \(B\) to \(B^{\prime}\) is denoted by \(P_{B \rightarrow B^{\prime}}\), and its columns are the coordinate vectors of the old basis vectors relative to the new basis: \[ P_{B \rightarrow B^{\prime}} = \left[ [\mathbf{u}_{1}]_{B^{\prime}} \mid [\mathbf{u}_{2}]_{B^{\prime}} \mid \dots \mid [\mathbf{u}_{n}]_{B^{\prime}} \right] \tag{10} \]
Tip
Memory Aid: The columns of the transition matrix from an old basis to a new basis are the coordinate vectors of the old basis relative to the new basis, as in Formula (10). In the other direction, \(P_{B' \to B}\) takes coordinates in \(B'\) to coordinates in \(B\), and its columns are the new basis vectors expressed in the old basis, as in Formula (9).
Example 1: Consider the bases \(B = \{\mathbf{u}_{1}, \mathbf{u}_{2}\}\) and \(B^{\prime} = \{\mathbf{u}_{1}^{\prime}, \mathbf{u}_{2}^{\prime}\}\) for \(R^{2}\), where:
\(\mathbf{u}_{1} = (1,0)\), \(\mathbf{u}_{2} = (0,1)\)
\(\mathbf{u}_{1}^{\prime} = (1,1)\), \(\mathbf{u}_{2}^{\prime} = (2,1)\)
(a) New basis vectors in terms of the old basis \(B\), which gives \(P_{B^{\prime} \rightarrow B}\):
\(\mathbf{u}_{1}^{\prime} = 1\mathbf{u}_{1} + 1\mathbf{u}_{2} \implies [\mathbf{u}_{1}^{\prime}]_{B} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\)
\(\mathbf{u}_{2}^{\prime} = 2\mathbf{u}_{1} + 1\mathbf{u}_{2} \implies [\mathbf{u}_{2}^{\prime}]_{B} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}\)
So, \(P_{B^{\prime} \rightarrow B} = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}\).
(b) Old basis vectors in terms of the new basis \(B'\), which gives \(P_{B \rightarrow B^{\prime}}\):
We need to find \(c_1, c_2\) such that \(\mathbf{u}_{1} = c_{1}\mathbf{u}_{1}^{\prime} + c_{2}\mathbf{u}_{2}^{\prime}\).
\((1,0) = c_{1}(1,1) + c_{2}(2,1) \implies \begin{cases} c_1 + 2c_2 = 1 \\ c_1 + c_2 = 0 \end{cases}\)
Solving gives \(c_1 = -1, c_2 = 1\). So, \([\mathbf{u}_{1}]_{B^{\prime}} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}\).
Similarly for \(\mathbf{u}_{2}\):
\((0,1) = d_{1}(1,1) + d_{2}(2,1) \implies \begin{cases} d_1 + 2d_2 = 0 \\ d_1 + d_2 = 1 \end{cases}\)
Solving gives \(d_1 = 2, d_2 = -1\). So, \([\mathbf{u}_{2}]_{B^{\prime}} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}\).
Thus, \(P_{B \rightarrow B^{\prime}} = \begin{bmatrix} -1 & 2 \\ 1 & -1 \end{bmatrix}\).
Let’s use Python to solve the systems and construct \(P_{B \rightarrow B'}\).
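A minimal sketch, assuming plain Python with exact fractions rather than a numerical library. The helper name `solve_2x2` is hypothetical, and Cramer's rule is used only because the systems are \(2 \times 2\). Each solution becomes one column of \(P_{B \rightarrow B'}\).

```python
from fractions import Fraction as F


def solve_2x2(col1, col2, rhs):
    """Solve x*col1 + y*col2 = rhs in R^2 using Cramer's rule."""
    det = col1[0] * col2[1] - col2[0] * col1[1]
    if det == 0:
        raise ValueError("col1 and col2 are linearly dependent")
    x = F(rhs[0] * col2[1] - col2[0] * rhs[1], det)
    y = F(col1[0] * rhs[1] - rhs[0] * col1[1], det)
    return (x, y)


# New basis vectors u1' and u2' from the example.
u1p, u2p = (1, 1), (2, 1)

# Coordinates of the old basis vectors relative to B'.
c = solve_2x2(u1p, u2p, (1, 0))  # [u1]_B'
d = solve_2x2(u1p, u2p, (0, 1))  # [u2]_B'

# The two coordinate vectors are the columns of P_{B -> B'}.
P_B_to_Bp = [[c[0], d[0]],
             [c[1], d[1]]]
print(P_B_to_Bp == [[-1, 2], [1, -1]])  # True
```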
Let \(B\) and \(B^{\prime}\) be the bases from Example 1. Use an appropriate formula to find \([\mathbf{v}]_{B}\) given that \([\mathbf{v}]_{B^{\prime}} = \begin{bmatrix} -3 \\ 5 \end{bmatrix}\).
Solution: We want to find \([\mathbf{v}]_{B}\) from \([\mathbf{v}]_{B'}\). This means we need the transition matrix \(P_{B^{\prime} \rightarrow B}\). From Example 1(a), \(P_{B^{\prime} \rightarrow B} = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}\).
Using the formula \([\mathbf{v}]_{B} = P_{B^{\prime} \rightarrow B}[\mathbf{v}]_{B^{\prime}}\): \[ [\mathbf{v}]_{B} = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -3 \\ 5 \end{bmatrix} = \begin{bmatrix} (1)(-3) + (2)(5) \\ (1)(-3) + (1)(5) \end{bmatrix} = \begin{bmatrix} 7 \\ 2 \end{bmatrix} \]
So, the coordinate vector of \(\mathbf{v}\) relative to basis \(B\) is \(\begin{bmatrix} 7 \\ 2 \end{bmatrix}\).
Let’s use Python to perform the matrix multiplication.
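One way to sketch this in plain Python is shown below. The helper name `mat_vec` is an assumption for illustration, and matrices are stored as lists of rows.

```python
def mat_vec(M, x):
    """Multiply the matrix M (a list of rows) by the column vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


P_Bp_to_B = [[1, 2],
             [1, 1]]   # transition matrix from B' to B
v_Bp = [-3, 5]         # coordinates of v relative to B'

v_B = mat_vec(P_Bp_to_B, v_Bp)
print(v_B)  # [7, 2]
```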
Transition matrices are always invertible, and their inverse has a special meaning.
If \(B\) and \(B^{\prime}\) are bases for \(V\): \[ [\mathbf{v}]_{B} = P_{B^{\prime} \rightarrow B}[\mathbf{v}]_{B^{\prime}} \quad \text{and} \quad [\mathbf{v}]_{B^{\prime}} = P_{B \rightarrow B^{\prime}}[\mathbf{v}]_{B} \tag{11} \]
Substituting the second into the first: \[ [\mathbf{v}]_{B} = P_{B^{\prime} \rightarrow B}(P_{B \rightarrow B^{\prime}}[\mathbf{v}]_{B}) = (P_{B^{\prime} \rightarrow B} P_{B \rightarrow B^{\prime}})[\mathbf{v}]_{B} \] This holds for every \(\mathbf{v}\) in \(V\), and \([\mathbf{v}]_{B}\) ranges over all of \(R^{n}\) as \(\mathbf{v}\) ranges over \(V\), so the product \(P_{B^{\prime} \rightarrow B} P_{B \rightarrow B^{\prime}}\) must be the identity matrix \(I\). Thus, \(P_{B^{\prime} \rightarrow B}\) and \(P_{B \rightarrow B^{\prime}}\) are inverses of each other.
THEOREM 4.6.1
If \(P\) is the transition matrix from a basis \(B^{\prime}\) to a basis \(B\) for a finite-dimensional vector space \(V\), then \(P\) is invertible and \(P^{- 1}\) is the transition matrix from \(B\) to \(B^{\prime}\). \[ P_{B^{\prime} \rightarrow B}^{-1} = P_{B \rightarrow B^{\prime}} \]
Let’s verify that \(P_{B^{\prime} \rightarrow B}\) and \(P_{B \rightarrow B^{\prime}}\) are inverses.
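A short check in plain Python, using the two matrices found in Example 1. The helper name `mat_mul` is an assumption for illustration. Both products should equal the \(2 \times 2\) identity.

```python
def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


P_Bp_to_B = [[1, 2],
             [1, 1]]
P_B_to_Bp = [[-1, 2],
             [1, -1]]
I2 = [[1, 0],
      [0, 1]]

# The product is the identity in either order, so the matrices are inverses.
print(mat_mul(P_Bp_to_B, P_B_to_Bp) == I2)  # True
print(mat_mul(P_B_to_Bp, P_Bp_to_B) == I2)  # True
```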
For \(R^n\), there’s an efficient procedure using augmented matrices. This avoids solving multiple systems separately.
A Procedure for Computing \(P_{B\rightarrow B^{\prime}}\) (Old to New)
Step 1. Form the augmented matrix \([B^{\prime} \mid B]\). (Here \(B'\) is the new basis, \(B\) is the old basis. Columns of \(B'\) and \(B\) are the basis vectors).
Step 2. Use elementary row operations to reduce the matrix in Step 1 to reduced row echelon form.
Step 3. The resulting matrix will be \([I \mid P_{B \rightarrow B^{\prime}}]\).
Step 4. Extract the matrix \(P_{B \rightarrow B^{\prime}}\) from the right side of the matrix in Step 3.
This procedure is summarized by the diagram: \[ [ \text{new basis} \mid \text{old basis} ] \quad \xrightarrow{\text{row operations}} \quad [ I \mid \text{transition from old to new} ] \]
We apply the procedure in both directions using the bases from Example 1: \(B = \{\mathbf{u}_{1}=(1,0), \mathbf{u}_{2}=(0,1)\}\) and \(B^{\prime} = \{\mathbf{u}_{1}^{\prime}=(1,1), \mathbf{u}_{2}^{\prime}=(2,1)\}\).
Here, \(B'\) is the old basis and \(B\) is the new basis, so we want \(P_{B' \to B}\). By the general formula, \(P_{B' \to B} = [[\mathbf{u}_1']_B \mid [\mathbf{u}_2']_B]\). Since \(B\) is the standard basis, \([\mathbf{u}_{1}^{\prime}]_{B} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\) and \([\mathbf{u}_{2}^{\prime}]_{B} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}\), so \(P_{B^{\prime}\rightarrow B} = \begin{bmatrix} 1 & 2\\ 1 & 1 \end{bmatrix}\). The procedure gives the same result. The augmented matrix [new basis | old basis] is \[ [B \mid B'] = \left[ \begin{array}{ll|ll}1 & 0 & 1 & 2\\ 0 & 1 & 1 & 1 \end{array} \right] \] Since the left side is already \(I\), no row operations are needed, and \(P_{B^{\prime}\rightarrow B}\) is the right side.
Here, \(B\) is the old basis; \(B'\) is the new basis. We want to find \(P_{B \rightarrow B^{\prime}}\). Form the augmented matrix \([B^{\prime} \mid B]\): \[ \left[ \begin{array}{ll|ll}1 & 2 & 1 & 0\\ 1 & 1 & 0 & 1 \end{array} \right] \] Reduce to reduced row echelon form: \[ \left[ \begin{array}{ll|cc}1 & 0 & -1 & 2\\ 0 & 1 & 1 & -1 \end{array} \right] \] So the transition matrix is \(P_{B\rightarrow B^{\prime}} = \begin{bmatrix} - 1 & 2\\ 1 & -1 \end{bmatrix}\), which matches our previous result.
Let’s use Python to perform the row operations for \(P_{B \rightarrow B'}\).
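The sketch below follows Steps 1 to 4 of the procedure using a small Gauss-Jordan routine over exact fractions. The function names `rref` and `transition_matrix` are hypothetical, and the code assumes both inputs are bases for \(R^n\) given as lists of vectors.

```python
from fractions import Fraction as F


def rref(M):
    """Return the reduced row echelon form of M (a list of rows)."""
    A = [[F(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]
        # Scale the pivot row so that its leading entry is 1.
        lead = A[pivot_row][col]
        A[pivot_row] = [x / lead for x in A[pivot_row]]
        # Clear this column in every other row.
        for r in range(rows):
            if r != pivot_row and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
    return A


def transition_matrix(old_basis, new_basis):
    """Row-reduce [new | old] and return the right block, P_{old -> new}."""
    n = len(new_basis)
    # Step 1: basis vectors become columns of the augmented matrix.
    augmented = [[v[i] for v in new_basis] + [v[i] for v in old_basis]
                 for i in range(n)]
    # Step 2: reduce to reduced row echelon form.
    reduced = rref(augmented)
    # Step 3: the left block must be the identity if new_basis is a basis.
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    if [row[:n] for row in reduced] != identity:
        raise ValueError("new_basis is not a basis for R^n")
    # Step 4: extract the right block.
    return [row[n:] for row in reduced]


B = [(1, 0), (0, 1)]    # old basis
Bp = [(1, 1), (2, 1)]   # new basis
P_B_to_Bp = transition_matrix(B, Bp)
print(P_B_to_Bp == [[-1, 2], [1, -1]])  # True
```

Swapping the arguments, `transition_matrix(Bp, B)`, gives the transition matrix in the opposite direction.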
A special case arises when one of the bases is the standard basis.
THEOREM 4.6.2
Let \(B^{\prime} = \{\mathbf{u}_{1},\mathbf{u}_{2},\ldots ,\mathbf{u}_{n}\}\) be any basis for the vector space \(R^{n}\) and let \(S = \{\mathbf{e}_{1},\mathbf{e}_{2},\ldots ,\mathbf{e}_{n}\}\) be the standard basis for \(R^{n}\). If the vectors in these bases are written in column form, then: \[ P_{B^{\prime}\rightarrow S} = [\mathbf{u}_{1}\mid \mathbf{u}_{2}\mid \dots \mid \mathbf{u}_{n}] \tag{15} \] That is, the transition matrix from \(B'\) to the standard basis \(S\) is simply the matrix whose columns are the basis vectors of \(B'\).
This means if \(A = [\mathbf{u}_{1}\mid \mathbf{u}_{2}\mid \dots \mid \mathbf{u}_{n}]\) is any invertible \(n\times n\) matrix, then \(A\) can be viewed as the transition matrix from the basis \(\{\mathbf{u}_{1},\mathbf{u}_{2},\ldots ,\mathbf{u}_{n}\}\) for \(R^{n}\) to the standard basis for \(R^{n}\).
Let’s use a non-standard basis for \(R^3\) and find its transition matrix to the standard basis.
Consider the basis \(B = \{\mathbf{u}_{1}, \mathbf{u}_{2}, \mathbf{u}_{3}\}\) for \(R^{3}\), where \(\mathbf{u}_{1} = (1,2,1)\), \(\mathbf{u}_{2} = (2,5,0)\), \(\mathbf{u}_{3} = (3,3,8)\).
The transition matrix \(P_{B \rightarrow S}\) would simply be: \[ P_{B\rightarrow S} = \left[ \begin{array}{lll}1 & 2 & 3\\ 2 & 5 & 3\\ 1 & 0 & 8 \end{array} \right] \]
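As a small illustration of Theorem 4.6.2, the sketch below builds this matrix by placing the basis vectors in its columns, confirms that its determinant is nonzero, and converts one coordinate vector to standard coordinates. The helper names and the sample coordinates \((1, 1, 1)\) are assumptions made for the example.

```python
basis = [(1, 2, 1), (2, 5, 0), (3, 3, 8)]  # u1, u2, u3

# The basis vectors are the columns of P, so transpose the list of vectors.
P_B_to_S = [list(row) for row in zip(*basis)]
print(P_B_to_S)  # [[1, 2, 3], [2, 5, 3], [1, 0, 8]]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# A nonzero determinant confirms the three vectors form a basis for R^3.
print(det3(P_B_to_S))  # -1

# Illustrative coordinates relative to B.
coords_B = [1, 1, 1]

# Standard coordinates computed as P [v]_B ...
v_std = [sum(p * c for p, c in zip(row, coords_B)) for row in P_B_to_S]

# ... and directly as the linear combination c1*u1 + c2*u2 + c3*u3.
v_direct = [sum(c * u[k] for c, u in zip(coords_B, basis)) for k in range(3)]

print(v_std)              # [6, 10, 9]
print(v_std == v_direct)  # True
```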