Covariant Derivative

The covariant derivative measures the rate of change of a tensor field while accounting for the change of the basis vectors.

In flat space, the covariant derivative of a vector $\boldsymbol{v}$ is the ordinary derivative of the vector, where we make sure to also differentiate the basis vectors:

\begin{align*}
\frac{\partial \boldsymbol{v}}{\partial x^{\mu}} &= \frac{\partial}{\partial x^{\mu}} (v^{\nu} \boldsymbol{e_{\nu}}) \\
&= \frac{\partial v^{\nu}}{\partial x^{\mu}} \boldsymbol{e_{\nu}} + v^{\nu} \frac{\partial \boldsymbol{e_{\nu}}}{\partial x^{\mu}}.
\end{align*}

In the previous chapter, we expressed the partial derivatives of the basis vectors as follows:

$$\frac{\partial \boldsymbol{e_{\nu}}}{\partial x^{\mu}} = \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}} + L_{\mu \nu} \boldsymbol{\hat{n}},$$

since we are dealing with flat space, the normal components are zero:

$$\frac{\partial \boldsymbol{e_{\nu}}}{\partial x^{\mu}} = \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}},$$

substituting into the previous equation:

\begin{align*}
\frac{\partial \boldsymbol{v}}{\partial x^{\mu}} &= \frac{\partial v^{\nu}}{\partial x^{\mu}} \boldsymbol{e_{\nu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}} \\
&= \frac{\partial v^{\sigma}}{\partial x^{\mu}} \boldsymbol{e_{\sigma}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}} \\
&= \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}},
\end{align*}

or, expressing as components:

$$\nabla_{\mu} v^{\sigma} = v^{\sigma}{}_{;\mu} = \frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu},$$

where $\nabla_{\mu} v^{\sigma} = v^{\sigma}{}_{;\mu}$ is the $\sigma$ component of the covariant derivative of the vector $\boldsymbol{v}$ in the direction of the $x^{\mu}$ coordinate.

As an example, consider the following vector field in Cartesian coordinates:

$$\boldsymbol{v} = \frac{1}{2} \boldsymbol{e_x} + \frac{1}{2} \boldsymbol{e_y}.$$
Vector field in Cartesian coordinates

The covariant derivative components are equal to zero:

\begin{align*}
v^{\sigma}{}_{;\mu} &= \frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \\
&= 0 + v^{\nu} \cdot 0 \\
&= 0.
\end{align*}

But now consider a vector field with the same components in polar coordinates:

$$\boldsymbol{v} = \frac{1}{2} \boldsymbol{e_r} + \frac{1}{2} \boldsymbol{e_{\theta}}.$$
Vector field in polar coordinates

Recall the nonzero Christoffel symbols:

\begin{align*}
\Gamma^r{}_{\theta \theta} &= -r, \\
\Gamma^{\theta}{}_{r \theta} = \Gamma^{\theta}{}_{\theta r} &= \frac{1}{r},
\end{align*}

the partial derivatives of the constant components vanish, so the nonzero covariant derivative components are equal to:

\begin{align*}
v^{\sigma}{}_{;\mu} &= v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}, \\
v^r{}_{;\theta} &= \frac{1}{2} \Gamma^r{}_{\theta \theta} \\
&= -\frac{r}{2}, \\
v^{\theta}{}_{;\theta} = v^{\theta}{}_{;r} &= \frac{1}{2r}.
\end{align*}

So we can see that constant components do not imply a constant vector field.
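The polar result can be cross-checked numerically. The sketch below is an illustration rather than part of the derivation: it assumes the usual embedding $x = r \cos \theta$, $y = r \sin \theta$, differentiates the field $\frac{1}{2} \boldsymbol{e_r} + \frac{1}{2} \boldsymbol{e_{\theta}}$ by central differences in Cartesian components, and projects the result back onto the polar basis. All function names are hypothetical.

```python
import math

def polar_basis(r, theta):
    """Cartesian components of the polar basis vectors e_r and e_theta."""
    e_r = (math.cos(theta), math.sin(theta))
    e_theta = (-r * math.sin(theta), r * math.cos(theta))
    return e_r, e_theta

def field(r, theta):
    """v = 1/2 e_r + 1/2 e_theta, written in Cartesian components."""
    e_r, e_theta = polar_basis(r, theta)
    return tuple(0.5 * a + 0.5 * b for a, b in zip(e_r, e_theta))

def to_polar_components(vec, r, theta):
    """Project a Cartesian vector onto (e_r, e_theta); the basis is orthogonal."""
    e_r, e_theta = polar_basis(r, theta)
    v_r = vec[0] * e_r[0] + vec[1] * e_r[1]                        # |e_r|^2 = 1
    v_theta = (vec[0] * e_theta[0] + vec[1] * e_theta[1]) / r**2   # |e_theta|^2 = r^2
    return v_r, v_theta

def covariant_derivative_fd(r, theta, h=1e-6):
    """Central differences of the embedded field, re-expressed in the polar basis."""
    d_r = tuple((a - b) / (2 * h)
                for a, b in zip(field(r + h, theta), field(r - h, theta)))
    d_theta = tuple((a - b) / (2 * h)
                    for a, b in zip(field(r, theta + h), field(r, theta - h)))
    return {
        "r": to_polar_components(d_r, r, theta),          # (v^r_;r, v^theta_;r)
        "theta": to_polar_components(d_theta, r, theta),  # (v^r_;theta, v^theta_;theta)
    }

result = covariant_derivative_fd(2.0, 0.7)
print(result)
```

At $r = 2$ the printed components should agree, up to finite-difference error, with $v^r{}_{;\theta} = -r/2 = -1$ and $v^{\theta}{}_{;\theta} = v^{\theta}{}_{;r} = 1/(2r) = 0.25$.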

The covariant derivative transforms like a tensor:

\begin{align*}
\frac{\partial \boldsymbol{v}}{\partial x^{\mu}} &= \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}}, \\
\frac{\partial \tilde{x}^{\lambda}}{\partial x^{\mu}} \frac{\partial \boldsymbol{v}}{\partial \tilde{x}^{\lambda}} &= \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \boldsymbol{\tilde{e}_{\lambda}}, \\
\frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\mu}} \frac{\partial \boldsymbol{v}}{\partial \tilde{x}^{\lambda}} &= \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \boldsymbol{\tilde{e}_{\lambda}}, \\
\delta^{\lambda}_{\alpha} \frac{\partial \boldsymbol{v}}{\partial \tilde{x}^{\lambda}} &= \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{\tilde{e}_{\lambda}}, \\
\frac{\partial \boldsymbol{v}}{\partial \tilde{x}^{\alpha}} &= \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} v^{\sigma}{}_{;\mu} \boldsymbol{\tilde{e}_{\lambda}}, \\
\tilde{v}^{\lambda}{}_{;\alpha} &= \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} v^{\sigma}{}_{;\mu}.
\end{align*}

If we compute the covariant derivative in the transformed basis, we get:

$$\tilde{v}^{\lambda}{}_{;\alpha} = \frac{\partial \tilde{v}^{\lambda}}{\partial \tilde{x}^{\alpha}} + \tilde{v}^{\nu} \tilde{\Gamma}^{\lambda}{}_{\alpha \nu},$$

and setting it equal to the previous result:

\begin{align*}
\frac{\partial \tilde{v}^{\lambda}}{\partial \tilde{x}^{\alpha}} + \tilde{v}^{\nu} \tilde{\Gamma}^{\lambda}{}_{\alpha \nu} &= \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \\
&= \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \left(\frac{\partial}{\partial x^{\mu}} \left(\frac{\partial x^{\sigma}}{\partial \tilde{x}^{\beta}} \tilde{v}^{\beta}\right) + \frac{\partial x^{\nu}}{\partial \tilde{x}^{\gamma}} \tilde{v}^{\gamma} \Gamma^{\sigma}{}_{\mu \nu}\right) \\
&= \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \left(\frac{\partial \tilde{x}^{\rho}}{\partial x^{\mu}} \frac{\partial}{\partial \tilde{x}^{\rho}} \left(\frac{\partial x^{\sigma}}{\partial \tilde{x}^{\beta}} \tilde{v}^{\beta}\right) + \frac{\partial x^{\nu}}{\partial \tilde{x}^{\gamma}} \tilde{v}^{\gamma} \Gamma^{\sigma}{}_{\mu \nu}\right) \\
&= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \left(\frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial \tilde{x}^{\rho}}{\partial x^{\mu}} \left(\frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\rho} \partial \tilde{x}^{\beta}} \tilde{v}^{\beta} + \frac{\partial x^{\sigma}}{\partial \tilde{x}^{\beta}} \frac{\partial \tilde{v}^{\beta}}{\partial \tilde{x}^{\rho}}\right) + \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\gamma}} \tilde{v}^{\gamma} \Gamma^{\sigma}{}_{\mu \nu}\right) \\
&= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \left(\delta^{\rho}_{\alpha} \left(\frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\rho} \partial \tilde{x}^{\beta}} \tilde{v}^{\beta} + \frac{\partial x^{\sigma}}{\partial \tilde{x}^{\beta}} \frac{\partial \tilde{v}^{\beta}}{\partial \tilde{x}^{\rho}}\right) + \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\gamma}} \tilde{v}^{\gamma} \Gamma^{\sigma}{}_{\mu \nu}\right) \\
&= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \left(\frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\alpha} \partial \tilde{x}^{\beta}} \tilde{v}^{\beta} + \frac{\partial x^{\sigma}}{\partial \tilde{x}^{\beta}} \frac{\partial \tilde{v}^{\beta}}{\partial \tilde{x}^{\alpha}} + \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\gamma}} \tilde{v}^{\gamma} \Gamma^{\sigma}{}_{\mu \nu}\right) \\
&= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\alpha} \partial \tilde{x}^{\beta}} \tilde{v}^{\beta} + \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \frac{\partial x^{\sigma}}{\partial \tilde{x}^{\beta}} \frac{\partial \tilde{v}^{\beta}}{\partial \tilde{x}^{\alpha}} + \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\gamma}} \tilde{v}^{\gamma} \Gamma^{\sigma}{}_{\mu \nu} \\
&= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\alpha} \partial \tilde{x}^{\beta}} \tilde{v}^{\beta} + \delta^{\lambda}_{\beta} \frac{\partial \tilde{v}^{\beta}}{\partial \tilde{x}^{\alpha}} + \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\gamma}} \tilde{v}^{\gamma} \Gamma^{\sigma}{}_{\mu \nu} \\
&= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\alpha} \partial \tilde{x}^{\beta}} \tilde{v}^{\beta} + \frac{\partial \tilde{v}^{\lambda}}{\partial \tilde{x}^{\alpha}} + \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\beta}} \tilde{v}^{\beta} \Gamma^{\sigma}{}_{\mu \nu} \\
&= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \tilde{v}^{\beta} \left(\frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\alpha} \partial \tilde{x}^{\beta}} + \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\beta}} \Gamma^{\sigma}{}_{\mu \nu}\right) + \frac{\partial \tilde{v}^{\lambda}}{\partial \tilde{x}^{\alpha}}, \\
\tilde{v}^{\beta} \tilde{\Gamma}^{\lambda}{}_{\alpha \beta} &= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \tilde{v}^{\beta} \left(\frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\alpha} \partial \tilde{x}^{\beta}} + \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\beta}} \Gamma^{\sigma}{}_{\mu \nu}\right), \\
\tilde{\Gamma}^{\lambda}{}_{\alpha \beta} &= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \left(\frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\alpha} \partial \tilde{x}^{\beta}} + \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\beta}} \Gamma^{\sigma}{}_{\mu \nu}\right) \\
&= \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \frac{\partial x^{\mu}}{\partial \tilde{x}^{\alpha}} \frac{\partial x^{\nu}}{\partial \tilde{x}^{\beta}} \Gamma^{\sigma}{}_{\mu \nu} + \frac{\partial \tilde{x}^{\lambda}}{\partial x^{\sigma}} \frac{\partial^2 x^{\sigma}}{\partial \tilde{x}^{\alpha} \partial \tilde{x}^{\beta}},
\end{align*}

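we see that the Christoffel symbols do not form a tensor: the extra term with the second derivatives of the coordinates spoils the tensor transformation law.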
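As a quick check of the extra term (a worked example added for illustration), take the untransformed coordinates to be Cartesian, $(x, y)$, where every Christoffel symbol is zero, and the transformed coordinates to be polar, $(r, \theta)$, with $x = r \cos \theta$ and $y = r \sin \theta$. Only the second-derivative term survives:

\begin{align*}
\tilde{\Gamma}^{r}{}_{\theta \theta} &= \frac{\partial r}{\partial x} \frac{\partial^2 x}{\partial \theta^2} + \frac{\partial r}{\partial y} \frac{\partial^2 y}{\partial \theta^2} \\
&= \cos \theta \, (-r \cos \theta) + \sin \theta \, (-r \sin \theta) \\
&= -r.
\end{align*}

This agrees with the polar value $\Gamma^r{}_{\theta \theta} = -r$, even though all the Cartesian symbols vanish. A tensor that is zero in one coordinate system would be zero in every coordinate system.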

When we want to compare two vectors in flat space, we just slide one over to the other:

Flat transport 1, Flat transport 2

And we could naively do this on the surface of the sphere:

Sphere transport 1, Sphere transport 2

but what we did here is transport it through the higher dimensional (3D) space. If we look at the two vectors from the perspective of someone living on the surface:

Sphere vector 1, Sphere vector 2

we see that they are two different vectors. What we need instead is a process called parallel transport:

Parallel transport

There is a downside to parallel transport. When parallel transporting along a closed curve, the vector may rotate:

Parallel transport rotation

When parallel transporting, the rate of change of the vector is completely in the normal components:

Parallel transport normal components

so we may express it as follows when transporting along a path parametrized by $\lambda$:

\begin{align*}
\frac{d \boldsymbol{v}}{d\lambda} &= \boldsymbol{n}, \\
\frac{d \boldsymbol{v}}{d\lambda} - \boldsymbol{n} &= \boldsymbol{0}.
\end{align*}

In the extrinsic definition, the covariant derivative $\nabla_{\boldsymbol{w}} \boldsymbol{v}$ is the rate of change of the vector field $\boldsymbol{v}$ in the direction of $\boldsymbol{w}$ with the normal component subtracted. For a parallel transported vector, this means:

$$\nabla_{\frac{d}{d\lambda}} \boldsymbol{v} = \frac{d\boldsymbol{v}}{d\lambda} - \boldsymbol{n} = \boldsymbol{0},$$

so when the covariant derivative is equal to zero, it means that the vector field is parallel transported.

The covariant derivative of a tangent vector $\boldsymbol{v}$ in the direction of the $x^{\mu}$ coordinate is equal to:

\begin{align*}
\nabla_{\frac{\partial}{\partial x^{\mu}}} \boldsymbol{v} &= \frac{\partial \boldsymbol{v}}{\partial x^{\mu}} - \boldsymbol{n} \\
&= \frac{\partial}{\partial x^{\mu}} \left(v^{\nu} \boldsymbol{e_{\nu}}\right) - \boldsymbol{n} \\
&= \frac{\partial v^{\nu}}{\partial x^{\mu}} \boldsymbol{e_{\nu}} + v^{\nu} \frac{\partial \boldsymbol{e_{\nu}}}{\partial x^{\mu}} - \boldsymbol{n} \\
&= \frac{\partial v^{\nu}}{\partial x^{\mu}} \boldsymbol{e_{\nu}} + v^{\nu} \left(\Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}} + L_{\mu \nu} \boldsymbol{\hat{n}}\right) - \boldsymbol{n} \\
&= \frac{\partial v^{\sigma}}{\partial x^{\mu}} \boldsymbol{e_{\sigma}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}} + v^{\nu} L_{\mu \nu} \boldsymbol{\hat{n}} - \boldsymbol{n} \\
&= \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}} + v^{\nu} L_{\mu \nu} \boldsymbol{\hat{n}} - \boldsymbol{n},
\end{align*}

since we said we are subtracting the normal components, they cancel:

$$\nabla_{\frac{\partial}{\partial x^{\mu}}} \boldsymbol{v} = \nabla_{\mu} \boldsymbol{v} = \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}},$$

or in component form:

$$\nabla_{\mu} v^{\sigma} = v^{\sigma}{}_{;\mu} = \frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu},$$

and these equations still apply in flat space since we didn't have normal components to begin with.

As an example, consider a sphere of radius $r$ with coordinates $\theta$ and $\phi$, where the metric and the nonzero Christoffel symbols are equal to:

\begin{align*}
g_{\mu \nu} &= \begin{bmatrix} r^2 & 0 \\ 0 & r^2 \sin^2 \theta \end{bmatrix}, \\
\Gamma^{\theta}{}_{\phi \phi} &= -\sin \theta \cos \theta = -\frac{1}{2} \sin (2 \theta), \\
\Gamma^{\phi}{}_{\theta \phi} = \Gamma^{\phi}{}_{\phi \theta} &= \cot \theta.
\end{align*}

The covariant derivatives of an arbitrary vector field $\boldsymbol{v}$ are equal to:

\begin{align*}
v^{\theta}{}_{;\theta} &= \frac{\partial v^{\theta}}{\partial \theta} + v^{\nu} \Gamma^{\theta}{}_{\theta \nu} \\
&= \frac{\partial v^{\theta}}{\partial \theta}, \\
v^{\theta}{}_{;\phi} &= \frac{\partial v^{\theta}}{\partial \phi} + v^{\nu} \Gamma^{\theta}{}_{\phi \nu} \\
&= \frac{\partial v^{\theta}}{\partial \phi} + v^{\phi} \Gamma^{\theta}{}_{\phi \phi} \\
&= \frac{\partial v^{\theta}}{\partial \phi} - \frac{v^{\phi}}{2} \sin (2\theta), \\
v^{\phi}{}_{;\theta} &= \frac{\partial v^{\phi}}{\partial \theta} + v^{\nu} \Gamma^{\phi}{}_{\theta \nu} \\
&= \frac{\partial v^{\phi}}{\partial \theta} + v^{\phi} \Gamma^{\phi}{}_{\theta \phi} \\
&= \frac{\partial v^{\phi}}{\partial \theta} + v^{\phi} \cot \theta, \\
v^{\phi}{}_{;\phi} &= \frac{\partial v^{\phi}}{\partial \phi} + v^{\nu} \Gamma^{\phi}{}_{\phi \nu} \\
&= \frac{\partial v^{\phi}}{\partial \phi} + v^{\theta} \Gamma^{\phi}{}_{\phi \theta} \\
&= \frac{\partial v^{\phi}}{\partial \phi} + v^{\theta} \cot \theta,
\end{align*}

cleaning it up:

\begin{align*}
v^{\theta}{}_{;\theta} &= \frac{\partial v^{\theta}}{\partial \theta}, \\
v^{\theta}{}_{;\phi} &= \frac{\partial v^{\theta}}{\partial \phi} - \frac{v^{\phi}}{2} \sin (2\theta), \\
v^{\phi}{}_{;\theta} &= \frac{\partial v^{\phi}}{\partial \theta} + v^{\phi} \cot \theta, \\
v^{\phi}{}_{;\phi} &= \frac{\partial v^{\phi}}{\partial \phi} + v^{\theta} \cot \theta.
\end{align*}

Consider the vector field $\boldsymbol{v}(\theta, \phi) = \boldsymbol{e_{\theta}}$ along the equator ($\theta = \frac{\pi}{2}$, $\phi = \lambda$):

Vector field along equator

The covariant derivatives are:

\begin{align*}
v^{\theta}{}_{;\theta} &= 0, \\
v^{\theta}{}_{;\phi} &= - \frac{v^{\phi}}{2} \sin (2\theta) \\
&= 0, \\
v^{\phi}{}_{;\theta} &= v^{\phi} \cot \theta \\
&= 0, \\
v^{\phi}{}_{;\phi} &= v^{\theta} \cot \theta \\
&= \cot \frac{\pi}{2} \\
&= 0,
\end{align*}

meaning that the vector field is parallel transported along the equator.
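The component formula can be evaluated mechanically. The following sketch (hypothetical helper names, sphere Christoffel symbols as listed earlier) computes $v^{\sigma}{}_{;\mu}$ for the constant-component field $\boldsymbol{e_{\theta}}$, once on the equator and once at $\theta = \frac{\pi}{4}$ for comparison.

```python
import math

COORDS = ("theta", "phi")

def sphere_christoffel(theta):
    """Nonzero Christoffel symbols of the sphere, keyed as (upper, lower1, lower2)."""
    cot = math.cos(theta) / math.sin(theta)
    return {
        ("theta", "phi", "phi"): -math.sin(theta) * math.cos(theta),
        ("phi", "theta", "phi"): cot,
        ("phi", "phi", "theta"): cot,
    }

def covariant_derivative(v, dv, theta):
    """Return v^sigma_{;mu} = d v^sigma / d x^mu + v^nu Gamma^sigma_{mu nu}.

    v  : dict of components, v[sigma]
    dv : dict of partial derivatives, dv[(sigma, mu)]
    """
    gamma = sphere_christoffel(theta)
    out = {}
    for sigma in COORDS:
        for mu in COORDS:
            connection = sum(v[nu] * gamma.get((sigma, mu, nu), 0.0) for nu in COORDS)
            out[(sigma, mu)] = dv[(sigma, mu)] + connection
    return out

# v = e_theta: components (1, 0), all partial derivatives zero.
v = {"theta": 1.0, "phi": 0.0}
dv = {(s, m): 0.0 for s in COORDS for m in COORDS}

on_equator = covariant_derivative(v, dv, math.pi / 2)
off_equator = covariant_derivative(v, dv, math.pi / 4)
print(on_equator)
print(off_equator)
```

On the equator every component vanishes up to rounding. At $\theta = \frac{\pi}{4}$ the same components give $v^{\phi}{}_{;\phi} = \cot \theta = 1$, so the constant-component field is not parallel transported along that circle.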

We checked whether a given vector field is parallel transported. To do the opposite, we specify a vector and a curve and demand that the covariant derivative along the curve is zero. As an example, consider the vector $\boldsymbol{v_0} = \boldsymbol{e_{\theta}}$ transported along the curve $\theta = \frac{\pi}{4}$, $\phi = \lambda$. We are looking for a vector field $\boldsymbol{v}$ whose covariant derivative along the curve is zero:

\begin{align*}
\nabla_{\frac{d}{d\lambda}} \boldsymbol{v} &= \frac{d \boldsymbol{v}}{d\lambda} - \boldsymbol{n} \\
&= \left(\frac{d \theta}{d \lambda} \frac{\partial}{\partial \theta} + \frac{d \phi}{d \lambda} \frac{\partial}{\partial \phi}\right) \boldsymbol{v} - \boldsymbol{n} \\
&= \frac{\partial \boldsymbol{v}}{\partial \phi} - \boldsymbol{n} \\
&= \nabla_{\phi} \boldsymbol{v} = \boldsymbol{0},
\end{align*}

implying:

\begin{align*}
v^{\theta}{}_{;\phi} &= 0, \\
v^{\phi}{}_{;\phi} &= 0.
\end{align*}

Computing the covariant derivatives:

\begin{align*}
v^{\theta}{}_{;\phi} &= \frac{\partial v^{\theta}}{\partial \phi} - \frac{v^{\phi}}{2} \sin (2\theta) \\
&= \frac{\partial v^{\theta}}{\partial \lambda} - \frac{v^{\phi}}{2} = 0, \\
v^{\phi}{}_{;\phi} &= \frac{\partial v^{\phi}}{\partial \phi} + v^{\theta} \cot \theta \\
&= \frac{\partial v^{\phi}}{\partial \lambda} + v^{\theta} = 0.
\end{align*}

We have a system of coupled differential equations describing the vector field:

\begin{align*}
\frac{\partial v^{\theta}}{\partial \lambda} &= \frac{v^{\phi}}{2}, \\
\frac{\partial v^{\phi}}{\partial \lambda} &= -v^{\theta},
\end{align*}

we can take the derivative of both equations with respect to $\lambda$:

\begin{align*}
\frac{\partial^2 v^{\theta}}{\partial \lambda^2} &= \frac{1}{2} \frac{\partial v^{\phi}}{\partial \lambda}, \\
\frac{\partial^2 v^{\phi}}{\partial \lambda^2} &= - \frac{\partial v^{\theta}}{\partial \lambda},
\end{align*}

and solve for the first order partial derivatives:

\begin{align*}
\frac{\partial v^{\phi}}{\partial \lambda} &= 2 \frac{\partial^2 v^{\theta}}{\partial \lambda^2}, \\
\frac{\partial v^{\theta}}{\partial \lambda} &= -\frac{\partial^2 v^{\phi}}{\partial \lambda^2},
\end{align*}

substituting back:

\begin{align*}
\frac{\partial^2 v^{\theta}}{\partial \lambda^2} &= -\frac{v^{\theta}}{2}, \\
\frac{\partial^2 v^{\phi}}{\partial \lambda^2} &= -\frac{v^{\phi}}{2}.
\end{align*}

The solutions to these equations are:

\begin{align*}
v^{\theta} &= A \sin \frac{\lambda}{\sqrt{2}} + B \cos \frac{\lambda}{\sqrt{2}}, \\
v^{\phi} &= C \sin \frac{\lambda}{\sqrt{2}} + D \cos \frac{\lambda}{\sqrt{2}}.
\end{align*}

When we substitute $\lambda = 0$ into the solution, the vector field must equal the vector we want to transport, $\boldsymbol{v} = \boldsymbol{v_0} = \boldsymbol{e_{\theta}}$, and we get:

\begin{align*}
v^{\theta} &= B = 1, \\
v^{\phi} &= D = 0,
\end{align*}

implying:

\begin{align*}
v^{\theta} &= A \sin \frac{\lambda}{\sqrt{2}} + \cos \frac{\lambda}{\sqrt{2}}, \\
v^{\phi} &= C \sin \frac{\lambda}{\sqrt{2}}.
\end{align*}

The derivatives are equal to:

\begin{align*}
\frac{\partial v^{\theta}}{\partial \lambda} &= \frac{A}{\sqrt{2}} \cos \frac{\lambda}{\sqrt{2}} - \frac{1}{\sqrt{2}} \sin \frac{\lambda}{\sqrt{2}}, \\
\frac{\partial v^{\phi}}{\partial \lambda} &= \frac{C}{\sqrt{2}} \cos \frac{\lambda}{\sqrt{2}}.
\end{align*}

We already have formulas for the derivatives from the first order system:

\begin{align*}
\frac{\partial v^{\theta}}{\partial \lambda} &= \frac{v^{\phi}}{2} = \frac{C}{2} \sin \frac{\lambda}{\sqrt{2}}, \\
\frac{\partial v^{\phi}}{\partial \lambda} &= -v^{\theta} = -\left(A \sin \frac{\lambda}{\sqrt{2}} + \cos \frac{\lambda}{\sqrt{2}}\right),
\end{align*}

and equating the two expressions for each derivative:

\begin{align*}
\frac{A}{\sqrt{2}} \cos \frac{\lambda}{\sqrt{2}} - \frac{1}{\sqrt{2}} \sin \frac{\lambda}{\sqrt{2}} &= \frac{C}{2} \sin \frac{\lambda}{\sqrt{2}}, \\
\frac{C}{\sqrt{2}} \cos \frac{\lambda}{\sqrt{2}} &= -\left(A \sin \frac{\lambda}{\sqrt{2}} + \cos \frac{\lambda}{\sqrt{2}}\right).
\end{align*}

We can substitute $\lambda = 0$:

\begin{align*}
\frac{A}{\sqrt{2}} &= 0, \\
\frac{C}{\sqrt{2}} &= -1,
\end{align*}

and solve for the constants:

\begin{align*}
A &= 0, \\
C &= -\sqrt{2}.
\end{align*}

So the components are as follows:

\begin{align*}
v^{\theta} &= \cos \frac{\lambda}{\sqrt{2}}, \\
v^{\phi} &= -\sqrt{2} \sin \frac{\lambda}{\sqrt{2}},
\end{align*}

and the vector field along the curve is:

$$\boldsymbol{v}(\lambda) = \cos \frac{\lambda}{\sqrt{2}} \boldsymbol{e_{\theta}} - \sqrt{2} \sin \frac{\lambda}{\sqrt{2}} \boldsymbol{e_{\phi}}.$$
Parallel transport of the vector

The squared length of the initial vector $\boldsymbol{v_0}$ is equal to:

\begin{align*}
|\boldsymbol{v_0}|^2 &= g_{\mu \nu} v_0^{\mu} v_0^{\nu} \\
&= r^2.
\end{align*}

And for the vector field $\boldsymbol{v}$:

\begin{align*}
|\boldsymbol{v}|^2 &= g_{\mu \nu} v^{\mu} v^{\nu} \\
&= \left(\cos \frac{\lambda}{\sqrt{2}}\right)^2 r^2 + \left(-\sqrt{2} \sin \frac{\lambda}{\sqrt{2}}\right)^2 r^2 \sin^2 \theta \\
&= r^2 \left(\cos^2 \frac{\lambda}{\sqrt{2}} + 2 \sin^2 \frac{\lambda}{\sqrt{2}} \sin^2 \frac{\pi}{4}\right) \\
&= r^2 \left(\cos^2 \frac{\lambda}{\sqrt{2}} + \sin^2 \frac{\lambda}{\sqrt{2}}\right) \\
&= r^2,
\end{align*}

so the length of the vector stays constant during parallel transport.
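The closed-form solution can also be checked by integrating the transport equations numerically. The sketch below uses a hand-written fourth-order Runge-Kutta loop (an illustrative choice, not a method from this chapter) on the circle $\theta = \theta_0$, $\phi = \lambda$, with a unit sphere assumed for the length check. The function names are hypothetical.

```python
import math

def transport_rhs(v, theta0):
    """Right-hand side of v^sigma_{;phi} = 0 along theta = theta0, phi = lambda."""
    v_theta, v_phi = v
    gamma_theta_phi_phi = -math.sin(theta0) * math.cos(theta0)
    gamma_phi_phi_theta = math.cos(theta0) / math.sin(theta0)
    # dv^theta/dlambda = -Gamma^theta_{phi phi} v^phi
    # dv^phi/dlambda   = -Gamma^phi_{phi theta} v^theta
    return (-gamma_theta_phi_phi * v_phi, -gamma_phi_phi_theta * v_theta)

def rk4_transport(v0, theta0, lam_end, steps=2000):
    """Integrate the parallel transport equations from lambda = 0 to lam_end."""
    h = lam_end / steps
    v = tuple(v0)
    for _ in range(steps):
        k1 = transport_rhs(v, theta0)
        k2 = transport_rhs(tuple(x + 0.5 * h * k for x, k in zip(v, k1)), theta0)
        k3 = transport_rhs(tuple(x + 0.5 * h * k for x, k in zip(v, k2)), theta0)
        k4 = transport_rhs(tuple(x + h * k for x, k in zip(v, k3)), theta0)
        v = tuple(
            x + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for x, a, b, c, d in zip(v, k1, k2, k3, k4)
        )
    return v

def squared_length(v, theta0, r=1.0):
    """g_{mu nu} v^mu v^nu with the sphere metric diag(r^2, r^2 sin^2 theta)."""
    return r**2 * (v[0] ** 2 + math.sin(theta0) ** 2 * v[1] ** 2)

theta0 = math.pi / 4
lam = 2.0 * math.pi  # once around the circle
numeric = rk4_transport((1.0, 0.0), theta0, lam)
analytic = (
    math.cos(lam / math.sqrt(2)),
    -math.sqrt(2) * math.sin(lam / math.sqrt(2)),
)
print(numeric, analytic, squared_length(numeric, theta0))
```

After one full loop the transported vector does not return to $\boldsymbol{e_{\theta}}$, which is the rotation mentioned earlier, while its squared length stays at the initial value.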

In intrinsic geometry, the covariant derivative is the same:

\begin{align*}
\nabla_{\frac{\partial}{\partial x^{\mu}}} \boldsymbol{v} &= \nabla_{\mu} \boldsymbol{v} = \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}}, \\
\nabla_{\mu} v^{\sigma} = v^{\sigma}{}_{;\mu} &= \frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu},
\end{align*}

and setting the covariant derivative equal to zero still means that the vector is parallel transported. However, the following definition of the Christoffel symbols no longer works, because the dot product requires the extrinsic space:

$$\Gamma^{\lambda}{}_{\mu \nu} = \frac{\partial \boldsymbol{e_{\mu}}}{\partial x^{\nu}} \cdot \boldsymbol{e_{\sigma}} g^{\sigma \lambda}.$$

We also had to consider the extrinsic geometry to obtain the metric tensor. There is, however, nothing we can do about this: the metric tensor has to be defined, obtained from an equation, or obtained from an extrinsic space. We can think of the Cartesian metric tensor: it is defined for us and there is no way to derive it.

When working with intrinsic geometry, we also cannot use the extrinsic position vector $\boldsymbol{R}$, so instead of writing the basis vectors like this:

$$\boldsymbol{e_{\mu}} = \frac{\partial \boldsymbol{R}}{\partial x^{\mu}},$$

we write it like this:

$$\boldsymbol{e_{\mu}} = \frac{\partial}{\partial x^{\mu}}.$$

In extrinsic geometry, we started by defining the covariant derivative as the derivative with the normal components subtracted:

$$\nabla_{\frac{d}{d\lambda}} \boldsymbol{v} = \frac{d\boldsymbol{v}}{d\lambda} - \boldsymbol{n}.$$

But in intrinsic geometry, there are no normal components:

$$\nabla_{\frac{d}{d\lambda}} \boldsymbol{v} = \frac{d\boldsymbol{v}}{d\lambda}.$$

Consider the covariant derivative in the direction of the $x^{\mu}$ coordinate:

\begin{align*}
\nabla_{\frac{\partial}{\partial x^{\mu}}} \boldsymbol{v} &= \frac{\partial \boldsymbol{v}}{\partial x^{\mu}} \\
&= \frac{\partial}{\partial x^{\mu}} \left(v^{\nu} \boldsymbol{e_{\nu}}\right) \\
&= \frac{\partial v^{\nu}}{\partial x^{\mu}} \boldsymbol{e_{\nu}} + v^{\nu} \frac{\partial \boldsymbol{e_{\nu}}}{\partial x^{\mu}} \\
&= \frac{\partial v^{\nu}}{\partial x^{\mu}} \boldsymbol{e_{\nu}} + v^{\nu} \left(\Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}} + L_{\mu \nu} \boldsymbol{\hat{n}}\right) \\
&= \frac{\partial v^{\sigma}}{\partial x^{\mu}} \boldsymbol{e_{\sigma}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}} + v^{\nu} L_{\mu \nu} \boldsymbol{\hat{n}} \\
&= \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}} + v^{\nu} L_{\mu \nu} \boldsymbol{\hat{n}},
\end{align*}

and again, since the normal components don't exist, we can ignore them:

$$\nabla_{\frac{\partial}{\partial x^{\mu}}} \boldsymbol{v} = \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}},$$

or in component form:

$$\nabla_{\mu} v^{\sigma} = v^{\sigma}{}_{;\mu} = \frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu},$$

and these equations also work in extrinsic geometry.

Recall the following properties:

\begin{align*}
g_{\mu \nu} &= g_{\nu \mu}, \\
\Gamma^{\lambda}{}_{\mu \nu} &= \Gamma^{\lambda}{}_{\nu \mu}.
\end{align*}

If we take the derivative of a metric tensor component with respect to a coordinate, we get:

\begin{align*}
g_{\mu \nu, \sigma} = \frac{\partial g_{\mu \nu}}{\partial x^{\sigma}} &= \frac{\partial}{\partial x^{\sigma}} (\boldsymbol{e_{\mu}} \cdot \boldsymbol{e_{\nu}}) \\
&= \frac{\partial \boldsymbol{e_{\mu}}}{\partial x^{\sigma}} \cdot \boldsymbol{e_{\nu}} + \boldsymbol{e_{\mu}} \cdot \frac{\partial \boldsymbol{e_{\nu}}}{\partial x^{\sigma}} \\
&= \Gamma^{\lambda}{}_{\mu \sigma} \boldsymbol{e_{\lambda}} \cdot \boldsymbol{e_{\nu}} + \Gamma^{\lambda}{}_{\nu \sigma} \boldsymbol{e_{\mu}} \cdot \boldsymbol{e_{\lambda}}, \\
g_{\mu \nu, \sigma} &= \Gamma^{\lambda}{}_{\mu \sigma} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\nu \sigma} g_{\mu \lambda}.
\end{align*}

If we swap $\nu$ and $\sigma$, we get:

\begin{align*}
g_{\mu \sigma, \nu} &= \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma} + \Gamma^{\lambda}{}_{\sigma \nu} g_{\mu \lambda} \\
&= \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma} + \Gamma^{\lambda}{}_{\nu \sigma} g_{\mu \lambda},
\end{align*}

and if we swap $\mu$ and $\sigma$, we get:

\begin{align*}
g_{\sigma \nu, \mu} &= \Gamma^{\lambda}{}_{\sigma \mu} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\nu \mu} g_{\sigma \lambda} \\
&= \Gamma^{\lambda}{}_{\mu \sigma} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma}.
\end{align*}

These expressions share some terms. If we add $g_{\mu \nu, \sigma}$ and $g_{\mu \sigma, \nu}$:

\begin{align*}
g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} &= \Gamma^{\lambda}{}_{\mu \sigma} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\nu \sigma} g_{\mu \lambda} + \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma} + \Gamma^{\lambda}{}_{\nu \sigma} g_{\mu \lambda} \\
&= \Gamma^{\lambda}{}_{\mu \sigma} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma} + 2\Gamma^{\lambda}{}_{\nu \sigma} g_{\mu \lambda},
\end{align*}

and subtract $g_{\sigma \nu, \mu}$ from this:

\begin{align*}
g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} - g_{\sigma \nu, \mu} &= \Gamma^{\lambda}{}_{\mu \sigma} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma} + 2\Gamma^{\lambda}{}_{\nu \sigma} g_{\mu \lambda} - ( \Gamma^{\lambda}{}_{\mu \sigma} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma}) \\
&= 2\Gamma^{\lambda}{}_{\nu \sigma} g_{\mu \lambda}, \\
\Gamma^{\lambda}{}_{\nu \sigma} g_{\mu \lambda} &= \frac{1}{2} (g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} - g_{\sigma \nu, \mu}), \\
\Gamma^{\lambda}{}_{\nu \sigma} g_{\mu \lambda} g^{\mu \rho} &= \frac{1}{2} g^{\mu \rho} (g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} - g_{\sigma \nu, \mu}), \\
\Gamma^{\lambda}{}_{\nu \sigma} \delta_{\lambda}^{\rho} &= \frac{1}{2} g^{\mu \rho} (g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} - g_{\sigma \nu, \mu}), \\
\Gamma^{\rho}{}_{\nu \sigma} &= \frac{1}{2} g^{\mu \rho} (g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} - g_{\sigma \nu, \mu}) \\
&= \frac{1}{2} g^{\mu \rho} \left(\frac{\partial g_{\mu \nu}}{\partial x^{\sigma}} + \frac{\partial g_{\mu \sigma}}{\partial x^{\nu}} - \frac{\partial g_{\sigma \nu}}{\partial x^{\mu}}\right),
\end{align*}

we have arrived at the equation for the Christoffel symbols in intrinsic geometry. This formula also works in extrinsic geometry.
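The metric formula lends itself to a direct numerical check. The following sketch is an illustration under stated assumptions: a two-dimensional metric supplied as a Python function, partial derivatives approximated by central differences, and hypothetical function names. It recovers the sphere's Christoffel symbols from the metric alone.

```python
import math

def sphere_metric(x, r=1.0):
    """Metric of the sphere in (theta, phi) coordinates as a 2x2 nested list."""
    theta, _phi = x
    return [[r**2, 0.0], [0.0, r**2 * math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as a nested list."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(metric, x, k, h=1e-5):
    """Central difference of g_{ij} with respect to coordinate x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(metric, x):
    """gamma[rho][nu][sigma] = 1/2 g^{mu rho} (g_{mu nu,sigma} + g_{mu sigma,nu} - g_{sigma nu,mu})."""
    n = 2
    g_inv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][i][j] = g_{ij,k}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for rho in range(n):
        for nu in range(n):
            for sigma in range(n):
                gamma[rho][nu][sigma] = 0.5 * sum(
                    g_inv[mu][rho]
                    * (dg[sigma][mu][nu] + dg[nu][mu][sigma] - dg[mu][sigma][nu])
                    for mu in range(n)
                )
    return gamma

theta = 1.0
gamma = christoffel(sphere_metric, (theta, 0.3))
# Index 0 is theta, index 1 is phi.
print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))
print(gamma[1][0][1], math.cos(theta) / math.sin(theta))
```

The two printed pairs should agree closely, matching $\Gamma^{\theta}{}_{\phi \phi} = -\sin \theta \cos \theta$ and $\Gamma^{\phi}{}_{\theta \phi} = \cot \theta$. No embedding or basis vectors are needed, only the metric components.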

If we take the covariant derivative of the tangent vector $\frac{d}{d\lambda}$ along itself, we get:

\begin{align*}
\nabla_{\frac{d}{d\lambda}} \frac{d}{d\lambda} &= \frac{d}{d\lambda} \left(\frac{d}{d\lambda}\right) \\
&= \frac{d}{d\lambda} \left(\frac{d x^{\mu}}{d \lambda} \frac{\partial}{\partial x^{\mu}}\right) \\
&= \frac{d^2 x^{\mu}}{d \lambda^2} \boldsymbol{e_{\mu}} + \frac{d x^{\mu}}{d \lambda} \frac{d \boldsymbol{e_{\mu}}}{d \lambda} \\
&= \frac{d^2 x^{\mu}}{d \lambda^2} \boldsymbol{e_{\mu}} + \frac{d x^{\mu}}{d \lambda} \frac{d x^{\nu}}{d \lambda} \frac{\partial \boldsymbol{e_{\mu}}}{\partial x^{\nu}} \\
&= \frac{d^2 x^{\sigma}}{d \lambda^2} \boldsymbol{e_{\sigma}} + \frac{d x^{\mu}}{d \lambda} \frac{d x^{\nu}}{d \lambda} \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}} \\
&= \left(\frac{d^2 x^{\sigma}}{d \lambda^2} + \frac{d x^{\mu}}{d \lambda} \frac{d x^{\nu}}{d \lambda} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}},
\end{align*}

and in the geodesics chapter, we have already seen that this is the tangential acceleration. If we set this equal to zero, we have the geodesic equations:

$$
\frac{d^2 x^{\sigma}}{d \lambda^2} + \frac{d x^{\mu}}{d \lambda} \frac{d x^{\nu}}{d \lambda} \Gamma^{\sigma}{}_{\mu \nu} = 0.
$$

Or, this can be rewritten for a vector $\boldsymbol{v}$:

$$
\nabla_{\boldsymbol{v}} \boldsymbol{v} = \boldsymbol{0},
$$

this is the parallel transport of $\boldsymbol{v}$ along itself - a curve whose tangent vector is parallel transported along itself is a geodesic.
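To see the geodesic equations at work, here is a small sketch that evaluates their left-hand side for circles of constant latitude. It assumes the unit sphere with coordinates $(\theta, \phi)$ and its Levi-Civita coefficients $\Gamma^{\theta}{}_{\phi \phi} = -\sin\theta\cos\theta$ and $\Gamma^{\phi}{}_{\theta \phi} = \Gamma^{\phi}{}_{\phi \theta} = \cot\theta$; the helper names are hypothetical.

```python
import math

def sphere_christoffel(theta):
    """gamma[s][m][n] = Gamma^s_{m n} for the unit sphere (assumed example),
    index 0 is theta and index 1 is phi."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)
    return gamma

def geodesic_residual(x, dx, ddx):
    """Left-hand side of the geodesic equations:
    d^2 x^s / dlambda^2 + Gamma^s_{m n} (dx^m/dlambda) (dx^n/dlambda)."""
    gamma = sphere_christoffel(x[0])
    return [ddx[s] + sum(gamma[s][m][n] * dx[m] * dx[n]
                         for m in range(2) for n in range(2))
            for s in range(2)]

# Curve theta = const, phi = lambda: velocity (0, 1), second derivatives (0, 0).
equator = geodesic_residual([math.pi / 2, 0.3], [0.0, 1.0], [0.0, 0.0])
latitude = geodesic_residual([math.pi / 4, 0.3], [0.0, 1.0], [0.0, 0.0])
print(equator)   # both components vanish up to rounding
print(latitude)  # theta component is nonzero
```

Under these assumptions the equator satisfies the geodesic equations, while the circle $\theta = \frac{\pi}{4}$ leaves a nonzero $\theta$ component, so it is not a geodesic.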

With the abstract definition, there may be different covariant derivatives. The one we will care about is called the Levi-Civita connection. We will define the covariant derivative by observing the intrinsic definition and using its properties to create an abstract definition.

To save space, I will be using the following notation:

$$
\boldsymbol{e_{\mu}} = \frac{\partial}{\partial x^{\mu}} = \partial_{\mu},
$$

in particular the shorthand $\partial_{\mu}$.

The first property is addition and scaling in the direction vector:

$$
\begin{align*}
\nabla_{a \boldsymbol{u} + b \boldsymbol{w}} \boldsymbol{v} &= \nabla_{a u^{\mu} \partial_{\mu} + b w^{\mu} \partial_{\mu}} \boldsymbol{v} \\
&= \nabla_{(a u^{\mu} + b w^{\mu}) \partial_{\mu}} \boldsymbol{v} \\
&= (a u^{\mu} + b w^{\mu}) \partial_{\mu} \boldsymbol{v} \\
&= a u^{\mu} \partial_{\mu} \boldsymbol{v} + b w^{\mu} \partial_{\mu} \boldsymbol{v} \\
&= a \nabla_{\boldsymbol{u}} \boldsymbol{v} + b \nabla_{\boldsymbol{w}} \boldsymbol{v},
\end{align*}
$$

also called linearity.

The second property is the covariant derivative of a sum of vectors:

$$
\begin{align*}
\nabla_{\boldsymbol{u}} (\boldsymbol{v} + \boldsymbol{w}) &= \nabla_{u^{\mu} \partial_{\mu}} (\boldsymbol{v} + \boldsymbol{w}) \\
&= u^{\mu} \partial_{\mu} (\boldsymbol{v} + \boldsymbol{w}) \\
&= u^{\mu} \partial_{\mu} \boldsymbol{v} + u^{\mu} \partial_{\mu} \boldsymbol{w} \\
&= \nabla_{\boldsymbol{u}} \boldsymbol{v} + \nabla_{\boldsymbol{u}} \boldsymbol{w}.
\end{align*}
$$

The third property is the covariant derivative of a scaled vector:

$$
\begin{align*}
\nabla_{\boldsymbol{u}} (a \boldsymbol{v}) &= \nabla_{u^{\mu} \partial_{\mu}} (a \boldsymbol{v}) \\
&= u^{\mu} \partial_{\mu} (a \boldsymbol{v}) \\
&= (u^{\mu} \partial_{\mu} a) \boldsymbol{v} + a u^{\mu} \partial_{\mu} \boldsymbol{v} \\
&= (\nabla_{\boldsymbol{u}} a) \boldsymbol{v} + a (\nabla_{\boldsymbol{u}} \boldsymbol{v}).
\end{align*}
$$

And the covariant derivative of a scalar in a coordinate direction is just the partial derivative:

$$
\nabla_{\partial_{\mu}} a = \frac{\partial a}{\partial x^{\mu}}.
$$

To put the properties together:

$$
\begin{align*}
\nabla_{a \boldsymbol{u} + b \boldsymbol{w}} \boldsymbol{v} &= a \nabla_{\boldsymbol{u}} \boldsymbol{v} + b \nabla_{\boldsymbol{w}} \boldsymbol{v}, \\
\nabla_{\boldsymbol{u}} (\boldsymbol{v} + \boldsymbol{w}) &= \nabla_{\boldsymbol{u}} \boldsymbol{v} + \nabla_{\boldsymbol{u}} \boldsymbol{w}, \\
\nabla_{\boldsymbol{u}} (a \boldsymbol{v}) &= (\nabla_{\boldsymbol{u}} a) \boldsymbol{v} + a (\nabla_{\boldsymbol{u}} \boldsymbol{v}), \\
\nabla_{\partial_{\mu}} a &= \frac{\partial a}{\partial x^{\mu}},
\end{align*}
$$

so a covariant derivative $\nabla_{\_}\ \_$ is an operator with two inputs - an input field and a direction vector. The output is another field specifying the rate of change of the input field. The covariant derivative is also sometimes called a connection because it provides a connection between two tangent spaces on a curved surface:

Tangent spaces on sphere

In intrinsic geometry, we have defined the partial derivative of a basis vector with respect to a coordinate as follows:

$$
\frac{\partial \boldsymbol{e_{\mu}}}{\partial x^{\nu}} = \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}},
$$

and the abstract version of this would be:

$$
\nabla_{\boldsymbol{e_{\nu}}} \boldsymbol{e_{\mu}} = \Gamma^{\sigma}{}_{\nu \mu} \boldsymbol{e_{\sigma}},
$$

where the Christoffel symbols are also sometimes called the connection coefficients. Note that the order of the lower indices of the Christoffel symbols does matter. So generally:

$$
\Gamma^{\sigma}{}_{\nu \mu} \neq \Gamma^{\sigma}{}_{\mu \nu}.
$$

It turns out that the following conditions are not enough to solve for unique connection coefficients:

$$
\begin{align*}
\nabla_{a \boldsymbol{u} + b \boldsymbol{w}} \boldsymbol{v} &= a \nabla_{\boldsymbol{u}} \boldsymbol{v} + b \nabla_{\boldsymbol{w}} \boldsymbol{v}, \\
\nabla_{\boldsymbol{u}} (\boldsymbol{v} + \boldsymbol{w}) &= \nabla_{\boldsymbol{u}} \boldsymbol{v} + \nabla_{\boldsymbol{u}} \boldsymbol{w}, \\
\nabla_{\boldsymbol{u}} (a \boldsymbol{v}) &= (\nabla_{\boldsymbol{u}} a) \boldsymbol{v} + a (\nabla_{\boldsymbol{u}} \boldsymbol{v}), \\
\nabla_{\partial_{\mu}} a &= \frac{\partial a}{\partial x^{\mu}}, \\
\nabla_{\boldsymbol{e_{\nu}}} \boldsymbol{e_{\mu}} &= \Gamma^{\sigma}{}_{\nu \mu} \boldsymbol{e_{\sigma}}.
\end{align*}
$$

To get the unique solution for the connection coefficients, we have to introduce two new properties. The first is the torsion-free property:

$$
\nabla_{\boldsymbol{v}} \boldsymbol{w} - \nabla_{\boldsymbol{w}} \boldsymbol{v} = [\boldsymbol{v}, \boldsymbol{w}],
$$

where $[\boldsymbol{v}, \boldsymbol{w}]$ is the Lie bracket:

$$
[\boldsymbol{v}, \boldsymbol{w}] = \boldsymbol{v} (\boldsymbol{w}) - \boldsymbol{w} (\boldsymbol{v}),
$$

and for our specific case, where the vectors are partial derivative operators (and the order of partial differentiation does not matter), the second-derivative terms cancel:

$$
\begin{align*}
[\boldsymbol{v}, \boldsymbol{w}] &= v^{\mu} \partial_{\mu} (w^{\nu} \partial_{\nu}) - w^{\mu} \partial_{\mu} (v^{\nu} \partial_{\nu}) \\
&= v^{\mu} (\partial_{\mu} w^{\nu}) \partial_{\nu} + v^{\mu} w^{\nu} \partial_{\mu} \partial_{\nu} - w^{\mu} (\partial_{\mu} v^{\nu}) \partial_{\nu} - w^{\mu} v^{\nu} \partial_{\mu} \partial_{\nu} \\
&= (v^{\mu} \partial_{\mu} w^{\nu} - w^{\mu} \partial_{\mu} v^{\nu}) \partial_{\nu} + v^{\mu} w^{\nu} (\partial_{\mu} \partial_{\nu} - \partial_{\nu} \partial_{\mu}) \\
&= (v^{\mu} \partial_{\mu} w^{\nu} - w^{\mu} \partial_{\mu} v^{\nu}) \partial_{\nu},
\end{align*}
$$

which vanishes whenever the components are constant, in particular for the coordinate basis vectors, $[\partial_{\mu}, \partial_{\nu}] = 0$. So for coordinate basis vectors ($\boldsymbol{v} = \boldsymbol{e_{\mu}}$, $\boldsymbol{w} = \boldsymbol{e_{\nu}}$) the torsion-free property is simplified:

$$
\nabla_{\boldsymbol{w}} \boldsymbol{v} = \nabla_{\boldsymbol{v}} \boldsymbol{w}.
$$

The torsion-free property implies the following for the connection coefficients:

$$
\begin{align*}
\nabla_{\boldsymbol{e_{\nu}}} \boldsymbol{e_{\mu}} &= \Gamma^{\sigma}{}_{\nu \mu} \boldsymbol{e_{\sigma}}, \\
\nabla_{\boldsymbol{e_{\mu}}} \boldsymbol{e_{\nu}} &= \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}}, \\
\nabla_{\boldsymbol{e_{\nu}}} \boldsymbol{e_{\mu}} &= \nabla_{\boldsymbol{e_{\mu}}} \boldsymbol{e_{\nu}}, \\
\Gamma^{\sigma}{}_{\nu \mu} \boldsymbol{e_{\sigma}} &= \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}}, \\
\Gamma^{\sigma}{}_{\nu \mu} &= \Gamma^{\sigma}{}_{\mu \nu}.
\end{align*}
$$

The second property is the metric compatibility:

$$
\nabla_{\boldsymbol{w}} (\boldsymbol{u} \cdot \boldsymbol{v}) = (\nabla_{\boldsymbol{w}} \boldsymbol{u}) \cdot \boldsymbol{v} + \boldsymbol{u} \cdot (\nabla_{\boldsymbol{w}} \boldsymbol{v}),
$$

this may be interpreted as follows: when two vectors are parallel transported ($\nabla_{\boldsymbol{w}} \boldsymbol{u} = \nabla_{\boldsymbol{w}} \boldsymbol{v} = \boldsymbol{0}$), their dot product stays the same:

$$
\begin{align*}
\nabla_{\boldsymbol{w}} (\boldsymbol{u} \cdot \boldsymbol{v}) &= \boldsymbol{0} \cdot \boldsymbol{v} + \boldsymbol{u} \cdot \boldsymbol{0} \\
&= 0,
\end{align*}
$$

implying that the angle between the vectors stays the same. If the two vectors are the same vector, it implies that the length of the vector stays constant:

$$
\begin{align*}
\nabla_{\boldsymbol{w}} (|\boldsymbol{v}|^2) = \nabla_{\boldsymbol{w}} (\boldsymbol{v} \cdot \boldsymbol{v}) &= \boldsymbol{0} \cdot \boldsymbol{v} + \boldsymbol{v} \cdot \boldsymbol{0} \\
&= 0.
\end{align*}
$$
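The length statement can be illustrated numerically. The sketch below assumes the unit sphere with its Levi-Civita coefficients ($\Gamma^{\theta}{}_{\phi \phi} = -\sin\theta\cos\theta$, $\Gamma^{\phi}{}_{\phi \theta} = \cot\theta$) and transports a vector once around the circle $\theta = \frac{\pi}{4}$, $\phi = \lambda$ by integrating $\frac{d v^{\sigma}}{d \lambda} = -\Gamma^{\sigma}{}_{\mu \nu} \frac{d x^{\mu}}{d \lambda} v^{\nu}$ with a classical Runge-Kutta step. The integrator, step count, and names are illustrative choices.

```python
import math

THETA0 = math.pi / 4  # the curve is theta = THETA0, phi = lambda (assumed example)

def rhs(v):
    """dv/dlambda for parallel transport along the curve, where dx/dlambda = (0, 1):
    dv^theta/dlambda = -Gamma^theta_{phi phi} v^phi,
    dv^phi/dlambda   = -Gamma^phi_{phi theta} v^theta."""
    v_theta, v_phi = v
    return [math.sin(THETA0) * math.cos(THETA0) * v_phi,
            -math.cos(THETA0) / math.sin(THETA0) * v_theta]

def rk4_step(v, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(v)
    k2 = rhs([v[i] + 0.5 * h * k1[i] for i in range(2)])
    k3 = rhs([v[i] + 0.5 * h * k2[i] for i in range(2)])
    k4 = rhs([v[i] + h * k3[i] for i in range(2)])
    return [v[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
            for i in range(2)]

def norm_squared(v):
    """|v|^2 = g_{mu nu} v^mu v^nu with g = diag(1, sin^2 theta)."""
    return v[0] ** 2 + math.sin(THETA0) ** 2 * v[1] ** 2

def transport(v0, total=2 * math.pi, steps=2000):
    """Parallel transport v0 from lambda = 0 to lambda = total."""
    v, h = list(v0), total / steps
    for _ in range(steps):
        v = rk4_step(v, h)
    return v

v_start = [1.0, 0.0]
v_end = transport(v_start)
print(v_start, norm_squared(v_start))
print(v_end, norm_squared(v_end))
```

In this sketch the components of the transported vector change after one loop, while $g_{\mu \nu} v^{\mu} v^{\nu}$ stays the same up to integration error, as metric compatibility requires.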

If we apply metric compatibility on basis vectors, we obtain:

$$
\begin{align*}
\nabla_{\boldsymbol{e_{\sigma}}} (\boldsymbol{e_{\mu}} \cdot \boldsymbol{e_{\nu}}) = g_{\mu \nu, \sigma} &= (\nabla_{\boldsymbol{e_{\sigma}}} \boldsymbol{e_{\mu}}) \cdot \boldsymbol{e_{\nu}} + \boldsymbol{e_{\mu}} \cdot (\nabla_{\boldsymbol{e_{\sigma}}} \boldsymbol{e_{\nu}}) \\
&= \Gamma^{\lambda}{}_{\sigma \mu} \boldsymbol{e_{\lambda}} \cdot \boldsymbol{e_{\nu}} + \Gamma^{\lambda}{}_{\sigma \nu} \boldsymbol{e_{\lambda}} \cdot \boldsymbol{e_{\mu}} \\
&= \Gamma^{\lambda}{}_{\sigma \mu} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\sigma \nu} g_{\lambda \mu}.
\end{align*}
$$

Similar to the intrinsic definition, we swap the indices:

$$
\begin{align*}
g_{\mu \nu, \sigma} &= \Gamma^{\lambda}{}_{\mu \sigma} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\nu \sigma} g_{\lambda \mu}, \\
g_{\mu \sigma, \nu} &= \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma} + \Gamma^{\lambda}{}_{\nu \sigma} g_{\lambda \mu}, \\
g_{\sigma \nu, \mu} &= \Gamma^{\lambda}{}_{\mu \sigma} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma},
\end{align*}
$$

remember, we are allowed to swap indices in the connection coefficients because of the torsion-free property. Adding the first two equations and subtracting the third, we can now derive the connection coefficients:

$$
\begin{align*}
g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} - g_{\sigma \nu, \mu} &= 2\Gamma^{\lambda}{}_{\nu \sigma} g_{\lambda \mu}, \\
\frac{1}{2} g^{\mu \rho} (g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} - g_{\sigma \nu, \mu}) &= \Gamma^{\lambda}{}_{\nu \sigma} g_{\lambda \mu} g^{\mu \rho}, \\
\Gamma^{\rho}{}_{\nu \sigma} &= \frac{1}{2} g^{\mu \rho} (g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} - g_{\sigma \nu, \mu}) \\
&= \frac{1}{2} g^{\mu \rho} \left(\frac{\partial g_{\mu \nu}}{\partial x^{\sigma}} + \frac{\partial g_{\mu \sigma}}{\partial x^{\nu}} - \frac{\partial g_{\sigma \nu}}{\partial x^{\mu}}\right),
\end{align*}
$$

and the covariant derivative that uses these particular connection coefficients is called the Levi-Civita connection.

The fundamental theorem of Riemannian geometry states that on any Riemannian manifold, there is a unique connection that is torsion-free and has metric compatibility - the Levi-Civita connection.

There are other connection coefficients. One example is the case where all the connection coefficients vanish, $\tilde{\Gamma}^{\rho}{}_{\mu \sigma} = 0$. We can take a vector $\boldsymbol{v_0}$ and parallel transport it:

$$
\begin{align*}
\nabla_{\frac{d}{d\lambda}} \boldsymbol{v} &= \boldsymbol{0}, \\
\nabla_{\frac{d}{d\lambda}} \boldsymbol{v} &= \frac{d\boldsymbol{v}}{d\lambda} \\
&= \frac{d x^{\mu}}{d \lambda} \frac{\partial}{\partial x^{\mu}} (v^{\nu} \boldsymbol{e_{\nu}}) \\
&= \frac{d x^{\mu}}{d \lambda} \left(\frac{\partial v^{\nu}}{\partial x^{\mu}} \boldsymbol{e_{\nu}} + v^{\nu} \frac{\partial \boldsymbol{e_{\nu}}}{\partial x^{\mu}}\right) \\
&= \frac{d x^{\mu}}{d \lambda} \frac{\partial v^{\sigma}}{\partial x^{\mu}} \boldsymbol{e_{\sigma}} + v^{\nu} \frac{d x^{\mu}}{d \lambda} \frac{\partial \boldsymbol{e_{\nu}}}{\partial x^{\mu}} \\
&= \frac{d v^{\sigma}}{d \lambda} \boldsymbol{e_{\sigma}} + v^{\nu} \frac{d x^{\mu}}{d \lambda} \Gamma^{\sigma}{}_{\nu \mu} \boldsymbol{e_{\sigma}} \\
&= \left(\frac{d v^{\sigma}}{d \lambda} + v^{\nu} \frac{d x^{\mu}}{d \lambda} \Gamma^{\sigma}{}_{\nu \mu}\right) \boldsymbol{e_{\sigma}} \\
&= \boldsymbol{0},
\end{align*}
$$

and since the connection coefficients are zero:

$$
\begin{align*}
\frac{d v^{\sigma}}{d \lambda} \boldsymbol{e_{\sigma}} &= \boldsymbol{0}, \\
v^{\sigma} &= C^{\sigma},
\end{align*}
$$

where $C^{\sigma}$ are arbitrary constants, but since we know the initial conditions, we have a solution:

$$
\boldsymbol{v} = \boldsymbol{v_0}.
$$

For example, for a curve parametrized by $\theta = \frac{\pi}{4}$ and $\phi = \lambda$, the vector is parallel transported as follows:

Parallel transport with zero coefficients

and there are many more connections.
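As a short worked check (assuming the unit sphere with metric $g = \mathrm{diag}(1, \sin^2 \theta)$ in the coordinates $(\theta, \phi)$), the zero connection is trivially symmetric in its lower indices, but it does not satisfy metric compatibility. Applying metric compatibility to basis vectors gave $g_{\mu \nu, \sigma} = \Gamma^{\lambda}{}_{\sigma \mu} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\sigma \nu} g_{\lambda \mu}$; with $\tilde{\Gamma}^{\rho}{}_{\mu \sigma} = 0$ the right-hand side vanishes, while the left-hand side does not:

$$
\begin{align*}
g_{\phi \phi, \theta} &= \frac{\partial}{\partial \theta} \sin^2 \theta = 2 \sin\theta \cos\theta, \\
\tilde{\Gamma}^{\lambda}{}_{\theta \phi} g_{\lambda \phi} + \tilde{\Gamma}^{\lambda}{}_{\theta \phi} g_{\lambda \phi} &= 0 \neq 2 \sin\theta \cos\theta \quad \text{for } \theta \neq \frac{\pi}{2}.
\end{align*}
$$

So, under this assumption, the zero connection is a valid connection on the sphere, but it is not the Levi-Civita connection.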

The covariant derivative of a vector along another vector is equal to:

$$
\begin{align*}
\nabla_{\boldsymbol{u}} \boldsymbol{v} &= \nabla_{u^{\mu} \boldsymbol{e_{\mu}}} (v^{\nu} \boldsymbol{e_{\nu}}) \\
&= u^{\mu} \left((\nabla_{\boldsymbol{e_{\mu}}} v^{\nu}) \boldsymbol{e_{\nu}} + v^{\nu} (\nabla_{\boldsymbol{e_{\mu}}} \boldsymbol{e_{\nu}})\right) \\
&= u^{\mu} \left(\frac{\partial v^{\nu}}{\partial x^{\mu}} \boldsymbol{e_{\nu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}}\right) \\
&= u^{\mu} \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} \boldsymbol{e_{\sigma}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}}\right) \\
&= u^{\mu} \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}}.
\end{align*}
$$

So to summarize, the properties of the covariant derivative (connection) are as follows:

$$
\begin{align*}
\nabla_{a \boldsymbol{u} + b \boldsymbol{w}} \boldsymbol{v} &= a \nabla_{\boldsymbol{u}} \boldsymbol{v} + b \nabla_{\boldsymbol{w}} \boldsymbol{v}, \\
\nabla_{\boldsymbol{u}} (\boldsymbol{v} + \boldsymbol{w}) &= \nabla_{\boldsymbol{u}} \boldsymbol{v} + \nabla_{\boldsymbol{u}} \boldsymbol{w}, \\
\nabla_{\boldsymbol{u}} (a \boldsymbol{v}) &= (\nabla_{\boldsymbol{u}} a) \boldsymbol{v} + a (\nabla_{\boldsymbol{u}} \boldsymbol{v}), \\
\nabla_{\partial_{\mu}} a &= \frac{\partial a}{\partial x^{\mu}}, \\
\nabla_{\boldsymbol{e_{\nu}}} \boldsymbol{e_{\mu}} &= \Gamma^{\sigma}{}_{\nu \mu} \boldsymbol{e_{\sigma}}.
\end{align*}
$$

The Levi-Civita connection has two special properties - torsion-free and metric compatibility:

$$
\begin{align*}
\nabla_{\boldsymbol{v}} \boldsymbol{w} - \nabla_{\boldsymbol{w}} \boldsymbol{v} &= [\boldsymbol{v}, \boldsymbol{w}], \\
\nabla_{\boldsymbol{w}} (\boldsymbol{u} \cdot \boldsymbol{v}) &= (\nabla_{\boldsymbol{w}} \boldsymbol{u}) \cdot \boldsymbol{v} + \boldsymbol{u} \cdot (\nabla_{\boldsymbol{w}} \boldsymbol{v}),
\end{align*}
$$

and the Levi-Civita connection coefficients are equal to:

$$
\begin{align*}
\Gamma^{\rho}{}_{\nu \sigma} &= \frac{1}{2} g^{\mu \rho} (g_{\mu \nu, \sigma} + g_{\mu \sigma, \nu} - g_{\sigma \nu, \mu}) \\
&= \frac{1}{2} g^{\mu \rho} \left(\frac{\partial g_{\mu \nu}}{\partial x^{\sigma}} + \frac{\partial g_{\mu \sigma}}{\partial x^{\nu}} - \frac{\partial g_{\sigma \nu}}{\partial x^{\mu}}\right).
\end{align*}
$$

And remember, the order of indices in the Christoffel symbols matters in connections without the torsion-free property. So generally:

$$
\nabla_{\boldsymbol{e_{\nu}}} \boldsymbol{e_{\mu}} = \Gamma^{\sigma}{}_{\nu \mu} \boldsymbol{e_{\sigma}} \neq \Gamma^{\sigma}{}_{\mu \nu} \boldsymbol{e_{\sigma}}.
$$

The covariant derivative of a vector along another vector is equal to:

$$
\nabla_{\boldsymbol{u}} \boldsymbol{v} = u^{\mu} \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}},
$$

and for the covariant derivative along a basis vector:

$$
\nabla_{\boldsymbol{e_{\mu}}} \boldsymbol{v} = \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}}.
$$
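The component formula can be tried out on a simple case. The sketch below assumes polar coordinates $(r, \phi)$ on the flat plane, with Levi-Civita coefficients $\Gamma^{r}{}_{\phi \phi} = -r$ and $\Gamma^{\phi}{}_{r \phi} = \Gamma^{\phi}{}_{\phi r} = \frac{1}{r}$, and a vector field that is constant in Cartesian terms (it points along the $x$ axis everywhere). Its polar components vary from point to point, yet its covariant derivative should vanish. The function names and the finite-difference step are illustrative.

```python
import math

def polar_christoffel(r, phi):
    """gamma[s][m][n] = Gamma^s_{m n} in polar coordinates (assumed example),
    index 0 is r and index 1 is phi."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -r
    gamma[1][0][1] = gamma[1][1][0] = 1.0 / r
    return gamma

def constant_x_field(r, phi):
    """Polar components (v^r, v^phi) of the Cartesian unit vector along x."""
    return [math.cos(phi), -math.sin(phi) / r]

def covariant_derivative(field, christoffel, x, h=1e-6):
    """Return nabla[mu][sigma] = d v^sigma / d x^mu + v^nu Gamma^sigma_{mu nu},
    with the partial derivatives taken by central differences."""
    v = field(*x)
    gamma = christoffel(*x)
    nabla = []
    for mu in range(2):
        x_plus, x_minus = list(x), list(x)
        x_plus[mu] += h
        x_minus[mu] -= h
        v_plus, v_minus = field(*x_plus), field(*x_minus)
        row = []
        for sigma in range(2):
            partial = (v_plus[sigma] - v_minus[sigma]) / (2 * h)
            correction = sum(v[nu] * gamma[sigma][mu][nu] for nu in range(2))
            row.append(partial + correction)
        nabla.append(row)
    return nabla

nabla_v = covariant_derivative(constant_x_field, polar_christoffel, [2.0, 0.7])
for row in nabla_v:
    print(row)  # every entry vanishes up to finite-difference error
```

The partial derivatives of the components alone are not zero here; it is the Christoffel term that cancels them, which is the point of the formula.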

Recall the following property of covectors:

$$
\epsilon^{\mu} (\boldsymbol{e_{\nu}}) = \delta^{\mu}_{\nu},
$$

or when expressed as a differential:

$$
dx^{\mu} \left(\frac{\partial}{\partial x^{\nu}}\right) = \frac{\partial x^{\mu}}{\partial x^{\nu}} = \delta^{\mu}_{\nu}.
$$

Taking the covariant derivative of a covector $\alpha$:

$$
\begin{align*}
\nabla_{\partial_{\mu}} (\alpha) &= \nabla_{\partial_{\mu}} (\alpha_{\nu} \epsilon^{\nu}) \\
&= (\nabla_{\partial_{\mu}} \alpha_{\nu}) \epsilon^{\nu} + \alpha_{\nu} (\nabla_{\partial_{\mu}} \epsilon^{\nu}) \\
&= \frac{\partial \alpha_{\sigma}}{\partial x^{\mu}} \epsilon^{\sigma} + \alpha_{\nu} \Lambda^{\nu}{}_{\mu \sigma} \epsilon^{\sigma} \\
&= \left(\frac{\partial \alpha_{\sigma}}{\partial x^{\mu}} + \alpha_{\nu} \Lambda^{\nu}{}_{\mu \sigma}\right) \epsilon^{\sigma},
\end{align*}
$$

where:

$$
\nabla_{\partial_{\mu}} \epsilon^{\nu} = \Lambda^{\nu}{}_{\mu \sigma} \epsilon^{\sigma}.
$$

Consider the covariant derivative of the covector $\alpha$ acting on a vector $\boldsymbol{v}$, and remember that a covector acting on a vector is a dot product (with $\boldsymbol{a}$ the vector corresponding to $\alpha$):

$$
\nabla_{\partial_{\mu}} (\alpha(\boldsymbol{v})) = \nabla_{\partial_{\mu}} (\boldsymbol{a} \cdot \boldsymbol{v}),
$$

using the metric compatibility:

$$
\begin{align*}
\nabla_{\partial_{\mu}} (\boldsymbol{a} \cdot \boldsymbol{v}) &= (\nabla_{\partial_{\mu}} \boldsymbol{a}) \cdot \boldsymbol{v} + \boldsymbol{a} \cdot (\nabla_{\partial_{\mu}} \boldsymbol{v}), \\
\nabla_{\partial_{\mu}} (\alpha(\boldsymbol{v})) &= (\nabla_{\partial_{\mu}} \alpha) (\boldsymbol{v}) + \alpha(\nabla_{\partial_{\mu}} \boldsymbol{v}), \\
\nabla_{\partial_{\mu}} (\alpha_{\sigma} \epsilon^{\sigma} (v^{\nu} \boldsymbol{e_{\nu}})) &= \left(\frac{\partial \alpha_{\sigma}}{\partial x^{\mu}} + \alpha_{\nu} \Lambda^{\nu}{}_{\mu \sigma}\right) \epsilon^{\sigma} (\boldsymbol{v}) + \alpha \left(\left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \boldsymbol{e_{\sigma}}\right), \\
\nabla_{\partial_{\mu}} (\alpha_{\sigma} v^{\nu} \epsilon^{\sigma} (\boldsymbol{e_{\nu}})) &= \left(\frac{\partial \alpha_{\sigma}}{\partial x^{\mu}} + \alpha_{\nu} \Lambda^{\nu}{}_{\mu \sigma}\right) v^{\sigma} + \left(\frac{\partial v^{\sigma}}{\partial x^{\mu}} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}\right) \alpha(\boldsymbol{e_{\sigma}}), \\
\nabla_{\partial_{\mu}} (\alpha_{\sigma} v^{\nu} \delta^{\sigma}_{\nu}) &= \frac{\partial \alpha_{\sigma}}{\partial x^{\mu}} v^{\sigma} + \alpha_{\nu} \Lambda^{\nu}{}_{\mu \sigma} v^{\sigma} + \frac{\partial v^{\sigma}}{\partial x^{\mu}} \alpha_{\sigma} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \alpha_{\sigma}, \\
\nabla_{\partial_{\mu}} (\alpha_{\nu} v^{\nu}) &= \frac{\partial \alpha_{\nu}}{\partial x^{\mu}} v^{\nu} + \alpha_{\nu} \frac{\partial v^{\nu}}{\partial x^{\mu}} + \alpha_{\nu} \Lambda^{\nu}{}_{\mu \sigma} v^{\sigma} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \alpha_{\sigma} \\
&= \nabla_{\partial_{\mu}} (\alpha_{\nu} v^{\nu}) + \alpha_{\nu} \Lambda^{\nu}{}_{\mu \sigma} v^{\sigma} + v^{\nu} \Gamma^{\sigma}{}_{\mu \nu} \alpha_{\sigma}, \\
0 &= \alpha_{\nu} v^{\sigma} \Lambda^{\nu}{}_{\mu \sigma} + \alpha_{\sigma} v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}, \\
\alpha_{\sigma} v^{\nu} \Lambda^{\sigma}{}_{\mu \nu} &= - \alpha_{\sigma} v^{\nu} \Gamma^{\sigma}{}_{\mu \nu}, \\
\Lambda^{\sigma}{}_{\mu \nu} &= - \Gamma^{\sigma}{}_{\mu \nu},
\end{align*}
$$

substituting into the equation for covariant derivative:

$$
\nabla_{\partial_{\mu}} (\alpha) = \left(\frac{\partial \alpha_{\sigma}}{\partial x^{\mu}} - \alpha_{\nu} \Gamma^{\nu}{}_{\mu \sigma}\right) \epsilon^{\sigma},
$$

and for the basis covector:

$$
\nabla_{\partial_{\mu}} \epsilon^{\nu} = -\Gamma^{\nu}{}_{\mu \sigma} \epsilon^{\sigma}.
$$
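As a quick consistency check of the sign, a sketch using only the relations above: the pairing $\epsilon^{\nu} (\boldsymbol{e_{\sigma}}) = \delta^{\nu}_{\sigma}$ is a constant scalar, so its covariant derivative must vanish, and the two Christoffel terms indeed cancel:

$$
\begin{align*}
\nabla_{\partial_{\mu}} (\epsilon^{\nu} (\boldsymbol{e_{\sigma}})) &= (\nabla_{\partial_{\mu}} \epsilon^{\nu}) (\boldsymbol{e_{\sigma}}) + \epsilon^{\nu} (\nabla_{\partial_{\mu}} \boldsymbol{e_{\sigma}}) \\
&= -\Gamma^{\nu}{}_{\mu \lambda} \epsilon^{\lambda} (\boldsymbol{e_{\sigma}}) + \Gamma^{\lambda}{}_{\mu \sigma} \epsilon^{\nu} (\boldsymbol{e_{\lambda}}) \\
&= -\Gamma^{\nu}{}_{\mu \lambda} \delta^{\lambda}_{\sigma} + \Gamma^{\lambda}{}_{\mu \sigma} \delta^{\nu}_{\lambda} \\
&= -\Gamma^{\nu}{}_{\mu \sigma} + \Gamma^{\nu}{}_{\mu \sigma} = 0 = \frac{\partial \delta^{\nu}_{\sigma}}{\partial x^{\mu}}.
\end{align*}
$$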

To take the covariant derivative of a tensor, we have to require the following property (the product rule for the tensor product):

$$
\nabla_{\boldsymbol{w}} (T \otimes S) = (\nabla_{\boldsymbol{w}} T) \otimes S + T \otimes (\nabla_{\boldsymbol{w}} S).
$$

Consider the covariant derivative of the metric tensor:

$$
\begin{align*}
\nabla_{\partial_{\mu}} (g) &= \nabla_{\partial_{\mu}} (g_{\sigma \nu} \epsilon^{\sigma} \otimes \epsilon^{\nu}) \\
&= \nabla_{\partial_{\mu}} (g_{\sigma \nu}) (\epsilon^{\sigma} \otimes \epsilon^{\nu}) + g_{\sigma \nu} \nabla_{\partial_{\mu}} (\epsilon^{\sigma} \otimes \epsilon^{\nu}) \\
&= g_{\sigma \nu, \mu} (\epsilon^{\sigma} \otimes \epsilon^{\nu}) + g_{\sigma \nu} \left((\nabla_{\partial_{\mu}} \epsilon^{\sigma}) \otimes \epsilon^{\nu} + \epsilon^{\sigma} \otimes (\nabla_{\partial_{\mu}} \epsilon^{\nu})\right) \\
&= g_{\sigma \nu, \mu} (\epsilon^{\sigma} \otimes \epsilon^{\nu}) + g_{\sigma \nu} \left(- \Gamma^{\sigma}{}_{\mu \lambda} \epsilon^{\lambda} \otimes \epsilon^{\nu} - \epsilon^{\sigma} \otimes \Gamma^{\nu}{}_{\mu \lambda} \epsilon^{\lambda}\right) \\
&= g_{\sigma \nu, \mu} (\epsilon^{\sigma} \otimes \epsilon^{\nu}) + g_{\sigma \nu} \left(- \Gamma^{\sigma}{}_{\mu \lambda} (\epsilon^{\lambda} \otimes \epsilon^{\nu}) - \Gamma^{\nu}{}_{\mu \lambda} (\epsilon^{\sigma} \otimes \epsilon^{\lambda})\right) \\
&= g_{\sigma \nu, \mu} (\epsilon^{\sigma} \otimes \epsilon^{\nu}) - g_{\sigma \nu} \Gamma^{\sigma}{}_{\mu \lambda} (\epsilon^{\lambda} \otimes \epsilon^{\nu}) - g_{\sigma \nu} \Gamma^{\nu}{}_{\mu \lambda} (\epsilon^{\sigma} \otimes \epsilon^{\lambda}) \\
&= g_{\sigma \nu, \mu} (\epsilon^{\sigma} \otimes \epsilon^{\nu}) - g_{\lambda \nu} \Gamma^{\lambda}{}_{\mu \sigma} (\epsilon^{\sigma} \otimes \epsilon^{\nu}) - g_{\sigma \lambda} \Gamma^{\lambda}{}_{\mu \nu} (\epsilon^{\sigma} \otimes \epsilon^{\nu}) \\
&= \left(g_{\sigma \nu, \mu} - g_{\lambda \nu} \Gamma^{\lambda}{}_{\mu \sigma} - g_{\sigma \lambda} \Gamma^{\lambda}{}_{\mu \nu}\right) (\epsilon^{\sigma} \otimes \epsilon^{\nu}).
\end{align*}
$$

We can see a pattern in the connection coefficients. For an $(m, n)$-tensor, there will be $m$ positive connection coefficient terms and $n$ negative connection coefficient terms.

Recall the metric compatibility:

$$
\begin{align*}
g_{\mu \nu, \sigma} &= \Gamma^{\lambda}{}_{\sigma \mu} g_{\lambda \nu} + \Gamma^{\lambda}{}_{\sigma \nu} g_{\lambda \mu}, \\
0 &= g_{\mu \nu, \sigma} - \Gamma^{\lambda}{}_{\sigma \mu} g_{\lambda \nu} - \Gamma^{\lambda}{}_{\sigma \nu} g_{\lambda \mu}, \\
0 &= g_{\sigma \nu, \mu} - \Gamma^{\lambda}{}_{\mu \sigma} g_{\lambda \nu} - \Gamma^{\lambda}{}_{\mu \nu} g_{\lambda \sigma}, \\
0 &= g_{\sigma \nu, \mu} - g_{\lambda \nu} \Gamma^{\lambda}{}_{\mu \sigma} - g_{\sigma \lambda} \Gamma^{\lambda}{}_{\mu \nu},
\end{align*}
$$

the right-hand side is exactly the component expression in the covariant derivative of the metric tensor, so:

$$
\nabla_{\partial_{\mu}} (g) = (0) (\epsilon^{\sigma} \otimes \epsilon^{\nu}) = 0.
$$

So another way to write metric compatibility is that the covariant derivative of the metric tensor in any direction is zero:

$$
\nabla_{\partial_{\mu}} (g) = 0.
$$
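This can be checked componentwise. The sketch below assumes the unit sphere, with metric $g = \mathrm{diag}(1, \sin^2 \theta)$ and Levi-Civita coefficients $\Gamma^{\theta}{}_{\phi \phi} = -\sin\theta\cos\theta$, $\Gamma^{\phi}{}_{\theta \phi} = \Gamma^{\phi}{}_{\phi \theta} = \cot\theta$, and evaluates $g_{\sigma \nu, \mu} - g_{\lambda \nu} \Gamma^{\lambda}{}_{\mu \sigma} - g_{\sigma \lambda} \Gamma^{\lambda}{}_{\mu \nu}$ for every index combination. All names are illustrative.

```python
import math

def metric(theta):
    """g_{s n} of the unit sphere (assumed example), index 0 is theta, 1 is phi."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def metric_derivative(theta):
    """dg[mu][s][n] = d g_{s n} / d x^mu; only g_{phi phi} depends on theta."""
    dg = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    dg[0][1][1] = 2.0 * math.sin(theta) * math.cos(theta)
    return dg

def levi_civita(theta):
    """gamma[l][m][n] = Gamma^l_{m n} of the unit sphere."""
    gamma = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    gamma[0][1][1] = -math.sin(theta) * math.cos(theta)
    gamma[1][0][1] = gamma[1][1][0] = math.cos(theta) / math.sin(theta)
    return gamma

def nabla_g(theta, gamma):
    """result[mu][s][n] = g_{s n, mu} - g_{l n} Gamma^l_{mu s} - g_{s l} Gamma^l_{mu n}."""
    g, dg = metric(theta), metric_derivative(theta)
    return [[[dg[mu][s][n]
              - sum(g[l][n] * gamma[l][mu][s] for l in range(2))
              - sum(g[s][l] * gamma[l][mu][n] for l in range(2))
              for n in range(2)]
             for s in range(2)]
            for mu in range(2)]

theta = 1.0
components = nabla_g(theta, levi_civita(theta))
largest = max(abs(c) for block in components for row in block for c in row)
print(largest)  # zero up to rounding
```

Every component vanishes up to rounding in this example, in line with the statement that the covariant derivative of the metric tensor is zero for the Levi-Civita connection.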