
Covariance and Contravariance

When tensors transform covariantly, they transform the same way that the basis vectors $\boldsymbol{e_{\mu}}$ transform. When tensors transform contravariantly, they transform the opposite way that the basis vectors transform.

As an example, consider the Cartesian coordinate system:

[Figure: Bases]

and we transform the basis vectors $\boldsymbol{e_{\mu}} \to \boldsymbol{\tilde{e}_{\mu}}$:

$$
\boldsymbol{\tilde{e}_{\mu}} = 2 \boldsymbol{e_{\mu}}
$$
[Figure: Bases transformed]

The vector has to stay invariant ($x^{\mu} \boldsymbol{e_{\mu}} = \tilde{x}^{\mu} \boldsymbol{\tilde{e}_{\mu}} = 2 \tilde{x}^{\mu} \boldsymbol{e_{\mu}}$). Considering the vector $\boldsymbol{x}$:

$$
\begin{aligned}
x^{\mu} \boldsymbol{e_{\mu}} &= 2 \tilde{x}^{\mu} \boldsymbol{e_{\mu}}, \\
x^{\mu} &= 2 \tilde{x}^{\mu}, \\
\tilde{x}^{\mu} &= \frac{1}{2} x^{\mu}.
\end{aligned}
$$

For our vector $\boldsymbol{x} = 2 \boldsymbol{e_x} + 3 \boldsymbol{e_y}$, the transformed components would be:

$$
\begin{aligned}
\tilde{x}^x &= 1, \\
\tilde{x}^y &= \frac{3}{2}, \\
\boldsymbol{x} &= \boldsymbol{\tilde{e}_x} + \frac{3}{2} \boldsymbol{\tilde{e}_y}.
\end{aligned}
$$
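As a quick numerical check, here is a minimal NumPy sketch (my own illustration, not part of the original text) that doubles the basis vectors and confirms the components get halved while the vector itself stays the same:

```python
import numpy as np

# Original Cartesian basis vectors stored as the columns of a matrix.
E = np.eye(2)
x_comp = np.array([2.0, 3.0])          # x = 2 e_x + 3 e_y

# Transformed basis: every basis vector is doubled.
E_tilde = 2.0 * E

# Components in the new basis, fixed by requiring the vector to stay invariant.
x_tilde_comp = np.linalg.solve(E_tilde, E @ x_comp)

print(x_tilde_comp)                                      # [1.  1.5]
print(np.allclose(E @ x_comp, E_tilde @ x_tilde_comp))   # True: same vector
```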

In general, when transforming the coordinates $x^{\mu} \to \tilde{x}^{\mu}$, a tensor $T^{\alpha \beta \dots}{}_{\gamma \delta \dots} \to \tilde{T}^{\alpha \beta \dots}{}_{\gamma \delta \dots}$ transforms as follows:

$$
\tilde{T}^{\alpha \beta \dots}{}_{\gamma \delta \dots} = \frac{\partial \tilde{x}^{\alpha}}{\partial x^{a}} \frac{\partial \tilde{x}^{\beta}}{\partial x^{b}} \dots \frac{\partial x^{c}}{\partial \tilde{x}^{\gamma}} \frac{\partial x^{d}}{\partial \tilde{x}^{\delta}} \dots T^{a b \dots}{}_{c d \dots}.
$$

Two notes. First, each of the dummy indices $a, b, c, d, \dots$ appears once as an upper index and once as a lower index (an index in the denominator of a partial derivative counts as a lower one), so by the Einstein summation convention it is summed over. Second, I am mixing Latin and Greek letters; in general relativity, Latin indices usually label space components and Greek indices spacetime components, but here they are used purely for convenience and carry no special meaning.
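To make the index bookkeeping concrete, here is a small NumPy sketch (my own illustration; the tensor and the coordinate change are random placeholders) that applies the transformation law to a rank-$(1,1)$ tensor $T^{a}{}_{c}$ with `np.einsum`. `Jinv` plays the role of $\frac{\partial \tilde{x}^{\alpha}}{\partial x^{a}}$ and `J` of $\frac{\partial x^{c}}{\partial \tilde{x}^{\gamma}}$:

```python
import numpy as np

rng = np.random.default_rng(0)

T = rng.normal(size=(3, 3))       # T^a_c: one upper, one lower index
Jinv = rng.normal(size=(3, 3))    # stands in for d(x~)^alpha / dx^a
J = np.linalg.inv(Jinv)           # stands in for dx^c / d(x~)^gamma

# T~^alpha_gamma = d(x~)^alpha/dx^a * dx^c/d(x~)^gamma * T^a_c
T_tilde = np.einsum("ia,cg,ac->ig", Jinv, J, T)

# For a (1,1) tensor this is just the similarity transform Jinv @ T @ J.
print(np.allclose(T_tilde, Jinv @ T @ J))   # True
```

Each additional upper index would contribute one more factor of `Jinv`, and each additional lower index one more factor of `J`, inside the same `einsum`.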

We can make sense of the above relationship by considering a position vector $\boldsymbol{x}(\lambda)$ parametrized by $\lambda$ and its tangent vector $\frac{d\boldsymbol{x}}{d\lambda}$. By the chain rule:

$$
\frac{d \boldsymbol{x}}{d \lambda} = \frac{\partial \boldsymbol{x}}{\partial x^{\mu}} \frac{d x^{\mu}}{d\lambda} = \frac{d x^{\mu}}{d \lambda} \boldsymbol{e_{\mu}}.
$$

Now, consider a transformation $x^{\mu} \to \tilde{x}^{\mu}$. The tangent vector is equal to:

$$
\frac{d \boldsymbol{x}}{d \lambda} = \frac{\partial \boldsymbol{x}}{\partial \tilde{x}^{\mu}} \frac{d \tilde{x}^{\mu}}{d\lambda} = \frac{d \tilde{x}^{\mu}}{d \lambda} \boldsymbol{\tilde{e}_{\mu}}.
$$

We can apply the chain rule to the last part:

$$
\frac{d \tilde{x}^{\mu}}{d \lambda} \boldsymbol{\tilde{e}_{\mu}} = \frac{\partial \tilde{x}^{\mu}}{\partial x^{\nu}} \frac{d x^{\nu}}{d \lambda} \boldsymbol{\tilde{e}_{\mu}},
$$

implying that the component $\frac{d \tilde{x}^{\mu}}{d \lambda}$ in the new basis $\boldsymbol{\tilde{e}_{\mu}}$ is equal to:

$$
\frac{d \tilde{x}^{\mu}}{d \lambda} = \frac{\partial \tilde{x}^{\mu}}{\partial x^{\nu}} \frac{d x^{\nu}}{d \lambda}.
$$
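This is the contravariant rule from above: the components $\frac{dx^{\mu}}{d\lambda}$ pick up a factor of $\frac{\partial \tilde{x}^{\mu}}{\partial x^{\nu}}$, the inverse of what the basis vectors get. As a sanity check, here is a SymPy sketch (my own example, using a Cartesian-to-polar change of coordinates) that transforms the tangent vector of the unit circle this way and recovers $\frac{dr}{d\lambda} = 0$, $\frac{d\theta}{d\lambda} = 1$:

```python
import sympy as sp

lam = sp.symbols('lambda')
x, y = sp.symbols('x y')

# A curve given in the old (Cartesian) coordinates: the unit circle.
x_lam, y_lam = sp.cos(lam), sp.sin(lam)
old_tangent = [sp.diff(x_lam, lam), sp.diff(y_lam, lam)]     # dx/dlam, dy/dlam

# New coordinates (r, theta) written as functions of the old ones (x, y).
new_coords = [sp.sqrt(x**2 + y**2), sp.atan2(y, x)]

# Contravariant rule: d(x~)^mu/dlam = (d(x~)^mu / dx^nu) * (dx^nu / dlam).
new_tangent = []
for new in new_coords:
    comp = sum(sp.diff(new, old) * d for old, d in zip([x, y], old_tangent))
    new_tangent.append(sp.simplify(comp.subs({x: x_lam, y: y_lam})))

print(new_tangent)   # [0, 1]: dr/dlam = 0, dtheta/dlam = 1 on the unit circle
```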

Now, consider a covector $df$, expanded in the basis covectors $\epsilon^{\mu} = dx^{\mu}$:

$$
df = \frac{\partial f}{\partial x^{\mu}} dx^{\mu} = \frac{\partial f}{\partial x^{\mu}} \epsilon^{\mu}.
$$

Under a transformation $x^{\mu} \to \tilde{x}^{\mu}$, the covector becomes:

$$
df = \frac{\partial f}{\partial \tilde{x}^{\mu}} d\tilde{x}^{\mu} = \frac{\partial f}{\partial \tilde{x}^{\mu}} \tilde{\epsilon}^{\mu}.
$$

Similarly, we can apply the chain rule to the last part:

$$
\frac{\partial f}{\partial \tilde{x}^{\mu}} \tilde{\epsilon}^{\mu} = \frac{\partial f}{\partial x^{\nu}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\mu}} \tilde{\epsilon}^{\mu},
$$

implying:

$$
\frac{\partial f}{\partial \tilde{x}^{\mu}} = \frac{\partial f}{\partial x^{\nu}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\mu}}.
$$
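This is the covariant rule. To tie it back to the doubled-basis example: there the coordinates transformed as $\tilde{x}^{\mu} = \frac{1}{2} x^{\mu}$, so $\frac{\partial x^{\nu}}{\partial \tilde{x}^{\mu}} = 2 \delta^{\nu}_{\mu}$, and the gradient components become

$$
\frac{\partial f}{\partial \tilde{x}^{\mu}} = \frac{\partial f}{\partial x^{\nu}} \, 2 \delta^{\nu}_{\mu} = 2 \frac{\partial f}{\partial x^{\mu}},
$$

which is exactly how the basis vectors transformed ($\boldsymbol{\tilde{e}_{\mu}} = 2 \boldsymbol{e_{\mu}}$): the components of $df$ transform covariantly, while the components $\tilde{x}^{\mu} = \frac{1}{2} x^{\mu}$ of the vector transform contravariantly.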

For the transformation $x^{\mu} \to \tilde{x}^{\mu}$, the partial derivatives $\frac{\partial x^{\mu}}{\partial \tilde{x}^{\nu}}$ form the Jacobian, which is used for the covariant transformation:

$$
J = \frac{\partial x^{\mu}}{\partial \tilde{x}^{\nu}} = \begin{bmatrix} \frac{\partial x^1}{\partial \tilde{x}^1} & \dots & \frac{\partial x^1}{\partial \tilde{x}^n} \\ \vdots & \ddots & \vdots \\ \frac{\partial x^n}{\partial \tilde{x}^1} & \dots & \frac{\partial x^n}{\partial \tilde{x}^n} \end{bmatrix},
$$

And the partial derivatives $\frac{\partial \tilde{x}^{\mu}}{\partial x^{\nu}}$ form the inverse Jacobian, used for contravariant transformations:

$$
J^{-1} = \frac{\partial \tilde{x}^{\mu}}{\partial x^{\nu}} = \begin{bmatrix} \frac{\partial \tilde{x}^1}{\partial x^1} & \dots & \frac{\partial \tilde{x}^1}{\partial x^n} \\ \vdots & \ddots & \vdots \\ \frac{\partial \tilde{x}^n}{\partial x^1} & \dots & \frac{\partial \tilde{x}^n}{\partial x^n} \end{bmatrix}.
$$

Unlike ordinary derivatives, where $\frac{dx}{df} = \frac{1}{df/dx}$, partial derivatives cannot be inverted entry by entry:

$$
\frac{\partial x^{\mu}}{\partial \tilde{x}^{\nu}} \neq \frac{1}{\frac{\partial \tilde{x}^{\nu}}{\partial x^{\mu}}},
$$

however, the Jacobian satisfies the following:

$$
\begin{aligned}
J^{-1}J &= \frac{\partial \tilde{x}^{\mu}}{\partial x^{\nu}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\sigma}} \\
&= \frac{\partial \tilde{x}^{\mu}}{\partial \tilde{x}^{\sigma}} \\
&= \delta^{\mu}_{\sigma} = \begin{cases} 1 & \mu = \sigma, \\ 0 & \mu \neq \sigma. \end{cases}
\end{aligned}
$$
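A quick way to check both statements is to compute the two Jacobians for a concrete change of coordinates. The sketch below (my own example, using SymPy and a Cartesian-to-polar transformation) shows that the entries of $J$ and $J^{-1}$ are not reciprocals of each other, yet the matrix product $J^{-1} J$ is the identity:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)
x, y = sp.symbols('x y', positive=True)   # first quadrant, keeps simplifications clean

# Old coordinates (x, y) as functions of the new ones (r, theta), and vice versa.
old_of_new = sp.Matrix([r * sp.cos(theta), r * sp.sin(theta)])   # x(r, theta), y(r, theta)
new_of_old = sp.Matrix([sp.sqrt(x**2 + y**2), sp.atan2(y, x)])   # r(x, y), theta(x, y)

# J = dx^mu / d(x~)^nu (covariant), Jinv = d(x~)^mu / dx^nu (contravariant).
J = old_of_new.jacobian([r, theta])
Jinv = new_of_old.jacobian([x, y]).subs({x: r * sp.cos(theta), y: r * sp.sin(theta)})

# The entries are not element-wise reciprocals of each other...
print(sp.simplify(J[0, 0] - 1 / Jinv[0, 0]))   # nonzero in general

# ...but the matrix product is the identity, i.e. the Kronecker delta.
print(sp.simplify(Jinv * J))                   # Matrix([[1, 0], [0, 1]])
```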