Deriving Einstein’s Gravity Equations From Thermodynamics
Is Gravity Just an Average of the Behavior of Unknown “Atoms” of Spacetime?
Emergent gravity is an idea in quantum gravity according to which the fabric of spacetime is not fundamental but emerges as a coarse-graining approximation of underlying (still unknown) microscopic degrees of freedom (similarly to a gas emerging from a large sampling of atoms or molecules). In the words of Huggett and Wuthrich, emergent gravity is the view that gravity arises due to the “collective action of the dynamics of more fundamental non-gravitational degrees of freedom.”
In the present article, we will investigate a 1995 proposal by the American theoretical physicist Ted Jacobson that Einstein’s gravity equations can be derived from thermodynamics, “the branch of physics that deals with the relations between heat and other forms of energy” (see link). This implies that Einstein’s equations can be viewed as an equation of state, a thermodynamic equation relating variables describing the state of matter (such as, for example, the ideal gas law).
Motivation: Entropy and Horizons in Spacetime
In the 1970s, the Mexican-born Israeli-American theoretical physicist Jacob Bekenstein and the English theoretical physicist and cosmologist Stephen Hawking showed that black holes have a thermodynamic entropy proportional to the area of their event horizon.
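For reference, the standard form of the Bekenstein-Hawking entropy is sketched below. The symbol A for the horizon area is an assumption of this sketch, and Planck's constant is written as the reduced constant ħ = h/2π, which is presumably what h stands for in the text:

```latex
S_{\mathrm{BH}} = \frac{k\,c^{3}\,A}{4\,G\,\hbar}
```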
where G, c, h, and k denote Newton’s gravitational constant, the speed of light, Planck’s constant, and Boltzmann’s constant. Note that just by examining the constants in this expression, we can infer that black holes lie at the intersection of gravity, quantum mechanics, and thermodynamics since:
- G is the gravitational constant, needed to calculate the gravitational effects in Newton’s law of universal gravitation and in Einstein’s theory of gravity.
- h is Planck's constant, the quantum of electromagnetic action relating a photon's energy to its frequency (photons are particles of light).
- k is the Boltzmann constant, the proportionality factor relating the average kinetic energy of particles in a gas with its thermodynamic temperature.
Black holes are not the only spacetime configurations carrying entropy. Two other important examples are:
- Cosmological horizons in de Sitter (dS) space
- Observer-dependent horizons in Rindler spacetime (a coordinate system representing part of Minkowski spacetime that describes uniformly accelerated observers). This type of horizon, which will be central here, has an entropy and a temperature (the Unruh temperature). The latter is proportional to the observers' acceleration, and its existence hints that spacetime itself encodes thermodynamic information.
Quoting the renowned American theoretical physicist Robert Wald, who made fundamental contributions to the study of gravitational physics, such as the discovery of the general formula for black hole entropy and the development of a rigorous formulation of quantum field theory in curved spacetime:
“I believe that the relationship between black holes and thermodynamics provides us with the deepest insights that we currently have concerning the nature of gravitation, thermodynamics, and quantum physics.”
— Robert Wald
We conclude that, since we can associate a temperature and entropy to regions of spacetime, it is not unreasonable to suppose that their properties may have some similarities to the properties of matter at the macroscopic scale. Quoting Jacobson:
“This perspective suggests that it may be no more appropriate to […] quantize the Einstein equation than it would be to quantize the wave equation for sound in air.”
Ted Jacobson and Einstein’s Equation of State
As described in the introduction, in a 1995 article, Ted Jacobson showed that Einstein’s equations can be obtained by applying the laws of thermodynamics to so-called Rindler horizons. His proposal implies that spacetime is emergent.
This idea will be described in detail below, but to fully understand it we first need to review some preliminary concepts from Einstein’s relativity. One of my past articles may be useful as a revision of this material.
It should be noted that if we derive the Einstein equations using arguments from thermodynamics, we cannot interpret the equations geometrically (as is usually done). If we assume gravity is, by nature, a thermodynamic phenomenon, the Einstein equations must be interpreted using thermodynamic concepts. In other words, gravitational dynamics must be re-expressed in terms of the thermal evolution of spacetime (see this reference).
Preliminary Concepts
Vectors and Dual Vectors
We first need to define what vectors and dual vectors are. A manifold can be loosely defined as a space that resembles Euclidean (flat) space near each of its points.
Consider now a curve γ on the manifold parameterized by λ (see Fig. 5). It can be described by the parametric equations:
Fig. 6 shows an example of a curve along the surface of a cylinder. It is given by the following parametric equations:
Now consider a function f defined along the curve γ on M. How does it vary along the curve in terms of the parameter λ?
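Writing the curve as x^μ(λ), the chain rule gives the rate of change of f along γ. The sum over the coordinate index is written out explicitly in this sketch (the index name μ is a notational assumption):

```latex
\frac{df}{d\lambda} = \sum_{\mu} \frac{dx^{\mu}}{d\lambda}\,\frac{\partial f}{\partial x^{\mu}}
```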
where we can identify the components of the tangent vector and the gradient of the function f. The gradient is referred to as a dual vector. Using the “,” notation for partial derivatives we write:
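In this notation, and assuming the symbols u^μ for the components of the tangent vector and f_{,μ} for the components of the gradient, the derivative of f along the curve takes the compact form:

```latex
u^{\mu} \equiv \frac{dx^{\mu}}{d\lambda}, \qquad
f_{,\mu} \equiv \frac{\partial f}{\partial x^{\mu}}, \qquad
\frac{df}{d\lambda} = u^{\mu} f_{,\mu}
```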
Note that in Eq. 4 Einstein’s summation convention was used to omit the summation symbol.
The figure below shows a tangent vector u to the curve γ parametrized by λ:
Vectors and dual vectors transform differently under coordinate transformations:
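The standard transformation laws under a change of coordinates x → x′ can be sketched as follows, where A denotes a generic vector and B a generic dual vector (both names are assumptions of this sketch):

```latex
\begin{align*}
A'^{\mu} &= \frac{\partial x'^{\mu}}{\partial x^{\nu}}\, A^{\nu}
  && \text{(vector)} \\
B'_{\mu} &= \frac{\partial x^{\nu}}{\partial x'^{\mu}}\, B_{\nu}
  && \text{(dual vector)}
\end{align*}
```

The two Jacobian factors are inverse to each other, so the contraction A^μ B_μ takes the same value in every coordinate system.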
Tensors
We will now consider tensors (see Dirac, for a simple explanation). Vectors are tensors of the type (1, 0). Dual vectors are tensors of the type (0, 1). Following Dirac, to obtain general tensors of higher ranks, we first build the quantity
which is a particular kind of tensor of type (2, 0). Adding several tensors like T, one gets a general tensor of type (2, 0):
Under a coordinate transformation, T transforms as:
If T has two lower (instead of upper) indices, it is said to be of type (0, 2). Tensors can also have mixed indices, such as the (1, 1) tensor below:
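The constructions of this subsection can be written out as in the sketch below. The vector names A, B, C, D are assumptions chosen for illustration; the transformation rules themselves are the standard ones, with one Jacobian factor per index:

```latex
\begin{align*}
T^{\mu\nu} &= A^{\mu} B^{\nu}
  && \text{(outer product of two vectors)} \\
T^{\mu\nu} &= A^{\mu} B^{\nu} + C^{\mu} D^{\nu} + \dots
  && \text{(general type-(2,0) tensor)} \\
T'^{\mu\nu} &= \frac{\partial x'^{\mu}}{\partial x^{\alpha}}
               \frac{\partial x'^{\nu}}{\partial x^{\beta}}\, T^{\alpha\beta}
  && \text{(transformation of a (2,0) tensor)} \\
T'^{\mu}{}_{\nu} &= \frac{\partial x'^{\mu}}{\partial x^{\alpha}}
                    \frac{\partial x^{\beta}}{\partial x'^{\nu}}\, T^{\alpha}{}_{\beta}
  && \text{(transformation of a (1,1) tensor)}
\end{align*}
```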
Lie derivatives
The Lie derivative is a concept from differential geometry, a mathematical discipline that applies calculus, linear algebra, and multilinear algebra to geometry problems. This type of differentiation, named after the Norwegian mathematician Sophus Lie, evaluates the change of a tensor field along the flow of another vector field (see Wiki).
Suppose we have a vector field A in a region of spacetime and a curve γ in the neighborhood of which A is defined. The tangent vector to γ is u = dx/dλ. We consider two points on the curve, x and x+dx.
Under the infinitesimal change dx
the vector A transforms in the following manner:
Now, the value of the original vector field in x+dx can be written as:
The Lie derivative of A along the curve γ is defined by:
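The steps just described can be reconstructed as in the sketch below. It assumes the infinitesimal displacement is generated by the tangent field u, so that x′ = x + u dλ, and it keeps only terms of first order in dλ:

```latex
\begin{align*}
x'^{\mu} &= x^{\mu} + u^{\mu}(x)\, d\lambda,
\qquad
\frac{\partial x'^{\mu}}{\partial x^{\nu}} = \delta^{\mu}_{\nu} + u^{\mu}{}_{,\nu}\, d\lambda \\
A'^{\mu}(x') &= \frac{\partial x'^{\mu}}{\partial x^{\nu}}\, A^{\nu}(x)
             = A^{\mu}(x) + u^{\mu}{}_{,\nu}\, A^{\nu}\, d\lambda \\
A^{\mu}(x') &= A^{\mu}(x) + A^{\mu}{}_{,\nu}\, u^{\nu}\, d\lambda \\
\mathcal{L}_{u} A^{\mu} &= \lim_{d\lambda \to 0}
   \frac{A^{\mu}(x') - A'^{\mu}(x')}{d\lambda}
   = A^{\mu}{}_{,\nu}\, u^{\nu} - u^{\mu}{}_{,\nu}\, A^{\nu}
\end{align*}
```

The result is the commutator of the two vector fields, so it is antisymmetric under the exchange of u and A.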
The tangent vector u is the direction in which we carry the Lie derivative (see this video for a detailed explanation of the construction of Lie derivatives).
We can better understand the Lie derivative with the following construction. Any vector field is defined by the congruence of curves for which it is the tangent field. We then draw a curve tangent to A at P (the form of the rest of the curve can be anything). We then parametrize the first curve by λ. We choose λ=1 at P and use the cross curve to fix the parameterization of the other curves to be λ=1. The rate at which λ changes along each curve is fixed by its tangent vectors. We then slide the cross curve by dλ on all curves of the congruence. Note that the cross curve through Q is rigidly tied to the curve through P. There is a vector A’ tangent to the new curve at Q. Since we have two vectors at Q, we can subtract them. The Lie derivative is then:
The construction is illustrated below:
The Lie derivative is, contrary to its appearance, a tensorial expression since we can rewrite it as:
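A sketch of why this works, assuming the usual torsion-free (symmetric) connection and writing covariant derivatives with a semicolon: replacing partial derivatives by covariant ones adds two Christoffel terms that cancel each other.

```latex
\begin{align*}
A^{\mu}{}_{;\nu}\, u^{\nu} - u^{\mu}{}_{;\nu}\, A^{\nu}
 &= \left(A^{\mu}{}_{,\nu} + \Gamma^{\mu}_{\nu\rho} A^{\rho}\right) u^{\nu}
  - \left(u^{\mu}{}_{,\nu} + \Gamma^{\mu}_{\nu\rho} u^{\rho}\right) A^{\nu} \\
 &= A^{\mu}{}_{,\nu}\, u^{\nu} - u^{\mu}{}_{,\nu}\, A^{\nu}
  + \left(\Gamma^{\mu}_{\nu\rho} - \Gamma^{\mu}_{\rho\nu}\right) A^{\rho} u^{\nu} \\
 &= \mathcal{L}_{u} A^{\mu}
\end{align*}
```

The last step uses the symmetry of the Christoffel symbols in their lower indices. Since the left-hand side is built only from tensors, so is the Lie derivative.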
Dual-vectors transform as:
Now consider a vector A which does not depend on some coordinate, say, x⁰ in some specific coordinate system:
The * indicates that the corresponding equality is valid in one specific coordinate system. Hence we have a set of curves along which x⁰ increases where A does not change. In this coordinate system:
From Eq. 19 we obtain:
We can rewrite Eq. 18 as:
These equations imply that:
This equation expresses the invariance of A in the direction of U. But the Lie derivative on the left-hand side is independent of coordinate systems. Therefore, since it is zero in one coordinate system, it is zero in all coordinate systems.
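The chain of steps in this argument can be sketched as follows. The starred equalities hold only in the adapted coordinate system, and the identification of U with the coordinate direction x⁰ is the assumption made in the text:

```latex
\begin{align*}
\frac{\partial A^{\mu}}{\partial x^{0}} &\overset{*}{=} 0
  && \text{(A does not depend on } x^{0}\text{)} \\
U^{\mu} &\overset{*}{=} \delta^{\mu}_{0}
  \;\Rightarrow\; U^{\mu}{}_{,\nu} \overset{*}{=} 0
  && \text{(U points along increasing } x^{0}\text{)} \\
A^{\mu}{}_{,\nu}\, U^{\nu} &\overset{*}{=} 0
  && \text{(the first condition rewritten)} \\
\mathcal{L}_{U} A^{\mu} &= A^{\mu}{}_{,\nu}\, U^{\nu} - U^{\mu}{}_{,\nu}\, A^{\nu} = 0
  && \text{(valid in every coordinate system)}
\end{align*}
```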
The Lie derivative of a type-(0,2) tensor is:
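For reference, the standard expressions for lower-index objects are sketched below, using B for a generic dual vector and T for a generic type-(0,2) tensor (both names are assumptions). Each lower index contributes one term with a plus sign, in contrast to the minus sign that appears for an upper index:

```latex
\begin{align*}
\mathcal{L}_{U} B_{\mu} &= B_{\mu,\nu}\, U^{\nu} + U^{\nu}{}_{,\mu}\, B_{\nu} \\
\mathcal{L}_{U} T_{\mu\nu} &= T_{\mu\nu,\rho}\, U^{\rho}
   + T_{\rho\nu}\, U^{\rho}{}_{,\mu}
   + T_{\mu\rho}\, U^{\rho}{}_{,\nu}
\end{align*}
```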
Suppose that in some coordinate system the tensor A is independent of some coordinate x⁰. This can be expressed in two different ways:
- In terms of coordinates, this is the statement in Eq. 18.
- Covariantly, this is the statement that the Lie derivative is zero when the vector field U is aligned with the direction along which x⁰ runs.
Eq. 22 expresses the invariance of the tensor A in the direction of the vector U.
This notion becomes very important if our spacetime has a symmetry. For example, if a tensor does not depend on time x⁰ or on rotations around some axis, the covariant way to express this is to say that the Lie derivative of that tensor is zero along the appropriate direction, either a time translation or a rotation around an axis. When the tensor in question is the metric, the vector field along which the symmetry holds is called a Killing vector.
Killing Vectors and Symmetries
A Killing vector is a vector field ξ such that the Lie derivative of the metric along ξ is zero:
If in a given coordinate system the metric does not depend on the coordinate σ*, the α-component of ξ is:
We can also write ξ and its α-component as:
Eq. 23 says that the metric has a symmetry in the direction along which ξ points. Symmetries of the metric are called isometries.
Using the definition of Lie derivative we obtain:
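The statements of this subsection up to this point can be sketched as below. The sketch assumes a metric-compatible, torsion-free connection, and σ* labels the coordinate on which the metric does not depend, as in the text:

```latex
\begin{align*}
\mathcal{L}_{\xi}\, g_{\mu\nu} &= 0
  && \text{(definition of a Killing vector)} \\
\xi^{\alpha} &\overset{*}{=} \delta^{\alpha}_{\sigma^{*}},
  \qquad \xi \overset{*}{=} \frac{\partial}{\partial x^{\sigma^{*}}}
  && \text{(adapted coordinates)} \\
\mathcal{L}_{\xi}\, g_{\mu\nu}
  &= g_{\mu\nu,\rho}\, \xi^{\rho}
   + g_{\rho\nu}\, \xi^{\rho}{}_{,\mu}
   + g_{\mu\rho}\, \xi^{\rho}{}_{,\nu}
   = \nabla_{\mu} \xi_{\nu} + \nabla_{\nu} \xi_{\mu} = 0
  && \text{(Killing equation)}
\end{align*}
```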
We can also use Killing vectors to obtain constants of motion along geodesics. If u is tangent to a geodesic, it is straightforward to show that, for ξ obeying Eq. 23, the following result holds:
Isometries give rise to conserved quantities along geodesics. More precisely, Killing vectors ξ generate isometries, and transformations under which g is invariant are expressed infinitesimally as motions in the direction of ξ.
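The conservation law mentioned above follows in two lines. The sketch assumes an affinely parametrized geodesic, so that u^ν ∇_ν u^μ = 0, and uses the Killing equation ∇_μ ξ_ν + ∇_ν ξ_μ = 0:

```latex
\begin{align*}
\frac{d}{d\lambda}\left(\xi_{\mu} u^{\mu}\right)
 &= u^{\nu} \nabla_{\nu}\left(\xi_{\mu} u^{\mu}\right)
  = u^{\mu} u^{\nu}\, \nabla_{\nu} \xi_{\mu}
  + \xi_{\mu}\, u^{\nu} \nabla_{\nu} u^{\mu} \\
 &= \tfrac{1}{2}\, u^{\mu} u^{\nu}
    \left(\nabla_{\nu} \xi_{\mu} + \nabla_{\mu} \xi_{\nu}\right) + 0
  = 0
\end{align*}
```

The first term vanishes because the symmetric product u^μ u^ν picks out only the symmetric part of ∇_ν ξ_μ, which is zero for a Killing vector.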
For clarity, let us now consider some simple examples of Killing vectors and their corresponding symmetries.
Example 1
Take for example the following metric in R³:
Note that the metric does not depend on x, y, or z. Therefore, the following three vectors are Killing vectors corresponding to translations:
There are other symmetries in R³. Writing Eq. 26 in spherical coordinates (illustrated below), the metric becomes:
Since the components of g are ϕ-independent,
is another Killing vector of R³. In Cartesian coordinates this becomes:
Rotations around the other two axes give us two other Killing vectors.
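Example 1 can be written out explicitly as in the sketch below. The labels attached to the Killing vectors are assumptions of this sketch; the spherical coordinates are the usual ones, with x = r sinθ cosϕ, y = r sinθ sinϕ, and z = r cosθ:

```latex
\begin{align*}
ds^{2} &= dx^{2} + dy^{2} + dz^{2} \\
\xi_{(x)} &= \partial_{x}, \qquad \xi_{(y)} = \partial_{y}, \qquad \xi_{(z)} = \partial_{z}
  && \text{(translations)} \\
ds^{2} &= dr^{2} + r^{2}\, d\theta^{2} + r^{2} \sin^{2}\theta\, d\phi^{2} \\
\partial_{\phi} &= \frac{\partial x}{\partial \phi}\, \partial_{x}
                 + \frac{\partial y}{\partial \phi}\, \partial_{y}
                 = x\, \partial_{y} - y\, \partial_{x}
  && \text{(rotation around the z-axis)} \\
& y\, \partial_{z} - z\, \partial_{y}, \qquad z\, \partial_{x} - x\, \partial_{z}
  && \text{(rotations around the x- and y-axes)}
\end{align*}
```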
Example 2
Consider now a static, spherically symmetric spacetime (such as the Schwarzschild metric):
Since the metric g has no dependence on t and no dependence on ϕ, time translations and rotations around the z-axis are examples of isometries. We have two obvious Killing vectors:
The following quantities (related to energy and angular momentum per unit mass) are constant along geodesics to which a vector u is tangent:
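For the Schwarzschild case, Example 2 can be sketched as follows. The sketch assumes units with G = c = 1, signature (−,+,+,+), a mass parameter M, proper time τ along the geodesic, and the names ξ and η for the two Killing vectors:

```latex
\begin{align*}
ds^{2} &= -\left(1 - \frac{2M}{r}\right) dt^{2}
        + \left(1 - \frac{2M}{r}\right)^{-1} dr^{2}
        + r^{2}\left(d\theta^{2} + \sin^{2}\theta\, d\phi^{2}\right) \\
\xi &= \partial_{t}, \qquad \eta = \partial_{\phi} \\
e &\equiv -\,\xi_{\mu} u^{\mu} = \left(1 - \frac{2M}{r}\right) \frac{dt}{d\tau}
  && \text{(energy per unit mass)} \\
\ell &\equiv \eta_{\mu} u^{\mu} = r^{2} \sin^{2}\theta\, \frac{d\phi}{d\tau}
  && \text{(angular momentum per unit mass)}
\end{align*}
```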
Killing Horizons
Let us consider a null hypersurface (for example, a light cone). It is, by definition, a hypersurface whose normal vector at every point is null (it has zero length with respect to the local metric tensor g). A Killing horizon Σ is a null hypersurface where the norm of a Killing vector field vanishes. Also, since a null surface cannot have two linearly independent null tangent vectors, ξ will be normal to Σ.
In Minkowski spacetime in inertial coordinates, the Killing vector that generates boosts in, for example, the x-direction (it is timelike inside the Rindler wedge) is given by:
Its norm is given by:
When the norm ξ is constant, the orbits are hyperbolas representing worldlines of uniformly accelerated observers with proper acceleration a = 1/ξ. As ξ → 0 the acceleration grows without bound, and the boost Killing vector field generates a bifurcate Killing horizon,
the so-called Rindler horizon.
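A sketch of the boost Killing vector and its horizon, assuming signature (−,+,+,+), units with c = 1, an unnormalized generator, and the symbol ρ for the constant that labels each orbit:

```latex
\begin{align*}
\xi &= x\, \partial_{t} + t\, \partial_{x} \\
\xi^{\mu} \xi_{\mu} &= -x^{2} + t^{2}
  && \text{(timelike for } |x| > |t| \text{)} \\
x^{2} - t^{2} &= \rho^{2}, \qquad a = \frac{1}{\rho}
  && \text{(orbits: hyperbolas of constant acceleration)} \\
\xi^{\mu} \xi_{\mu} &= 0 \;\Longleftrightarrow\; x = \pm t
  && \text{(bifurcate Killing horizon)}
\end{align*}
```

The two null planes x = t and x = −t intersect on the 2-plane t = x = 0, where ξ itself vanishes.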
These null surfaces are therefore Killing horizons. Since a Killing vector ξ is normal to its Killing horizon Σ, along Σ it obeys the geodesic equation (the κ on the right-hand side of the geodesic equation accounts for the possibility that the integral curves of ξ are not affinely parametrized):
where κ is called surface gravity. For a static spacetime, κ is the acceleration of a static observer close to the horizon, measured by a static observer at ∞.
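The non-affinely parametrized geodesic equation referred to above has the standard form shown in the first line below. The remaining lines are a quick check for the boost Killing vector of flat spacetime, assuming the unnormalized form ξ = x ∂_t + t ∂_x:

```latex
\begin{align*}
\xi^{\nu} \nabla_{\nu} \xi^{\mu} &= \kappa\, \xi^{\mu}
  && \text{(on the Killing horizon } \Sigma \text{)} \\
\xi^{\mu} &= (x,\, t,\, 0,\, 0), \qquad
\xi^{\nu} \partial_{\nu} \xi^{\mu} = (t,\, x,\, 0,\, 0) \\
x = t:&\quad \xi^{\nu} \partial_{\nu} \xi^{\mu} = (t,\, t,\, 0,\, 0) = \xi^{\mu}
  && \Rightarrow\; \kappa = 1
\end{align*}
```

Rescaling ξ by a constant factor rescales κ by the same factor, which is why the normalization of the Killing vector has to be fixed before κ acquires a physical meaning.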
The Rindler Wedge
Consider an arbitrary spacetime point p. Locally, the spacetime around p is flat (because of the principle of equivalence). Now choose a small patch B of a spacelike 2-surface containing p and introduce Riemann normal coordinates (RNC) centered at p. The metric in RNC is given by:
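The standard expansion of the metric in Riemann normal coordinates centered at p is sketched below. The Riemann tensor is evaluated at p, and the overall sign of the curvature term depends on the curvature convention, which is assumed here:

```latex
g_{\mu\nu}(x) = \eta_{\mu\nu}
  - \frac{1}{3}\, R_{\mu\alpha\nu\beta}(p)\, x^{\alpha} x^{\beta}
  + \mathcal{O}(x^{3})
```

The metric is flat at p and has no first-order corrections, so curvature only shows up at second order in the distance from p.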
The coordinates of the points on the patch B are:
Adding past and future light sheets, say, in the z-direction we obtain:
Eq. 40 describes a local Rindler wedge (two null three-dimensional half-planes joined by the spacelike bifurcation, the 2-plane B at t=0) illustrated below.
Consider now a sheet of hyperbolic timelike observers close to the null surface. Their coordinates, velocity, and acceleration
where a is the observer acceleration
describe approximately uniformly accelerated observers. For such observers, the light sheets of the Rindler wedge form the Rindler horizon. A causal horizon is the boundary of the spatial region consisting of points causally connected to an observer. More precisely, a causal horizon is a hypersurface that is a boundary between light rays that are directed outwards and moving outwards (dz/dt>0), and those directed outward but moving inward (dz/dt<0).
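A standard parametrization of such a hyperbolic worldline is sketched below, assuming motion along z, proper time τ, signature (−,+,+,+), and coordinates ordered as (t, x, y, z):

```latex
\begin{align*}
x^{\mu}(\tau) &= \left(\frac{1}{a} \sinh a\tau,\; 0,\; 0,\; \frac{1}{a} \cosh a\tau\right) \\
u^{\mu} &= \frac{dx^{\mu}}{d\tau} = \left(\cosh a\tau,\; 0,\; 0,\; \sinh a\tau\right),
  \qquad u^{\mu} u_{\mu} = -1 \\
a^{\mu} &= \frac{du^{\mu}}{d\tau} = a \left(\sinh a\tau,\; 0,\; 0,\; \cosh a\tau\right),
  \qquad a^{\mu} a_{\mu} = a^{2}
\end{align*}
```

The worldline satisfies z² − t² = 1/a², so it never crosses the null planes z = ±t, which act as horizons for the observer.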
A well-known result from quantum field theory in curved spacetime is that, for observers moving with uniform proper acceleration, the Minkowski vacuum state appears as a thermal bath of particles at the so-called Unruh temperature, which is proportional to the acceleration. The Hamiltonian with respect to which this state is thermal is the generator of the boost symmetry of Minkowski spacetime around the point p.
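For reference, the standard expression for the Unruh temperature is sketched below, first with all constants restored and then in units with c = k = 1. The symbol H_ξ for the boost generator and the normalization of the thermal state are assumptions of this sketch:

```latex
\begin{align*}
T &= \frac{\hbar\, a}{2\pi\, c\, k}
  \;\;\xrightarrow{\; c \,=\, k \,=\, 1 \;}\;\;
  T = \frac{\hbar\, a}{2\pi} \\
\rho &\propto \exp\!\left(-\frac{H_{\xi}}{T}\right),
  \qquad T = \frac{\hbar\, \kappa}{2\pi}
\end{align*}
```

In the second line the boost Killing vector is normalized to have surface gravity κ, which is the form of the temperature that enters Jacobson's argument.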
Deriving The Equation of State
Let us now consider in detail the geometrical construction which will be needed to follow Jacobson’s derivation. Note that, following Svesko, the following construction is not formulated over the past horizon (as in Jacobson’s paper) but over the future horizon. The results, nevertheless, are unchanged. Also, from now on, we will keep only the constants G and h from the Bekenstein-Hawking entropy (and set the others to 1).
Geometrical Construction
The full construction is in Fig. 16. Consider a point p in a spacetime M and an infinitesimal neighborhood containing p. For a small, locally flat, neighborhood, we can define a spacelike foliation parametrized by a time coordinate t (by invoking the equivalence principle).
The point p is located at time t=t₁ on a spacelike codimension-one hypersurface Σ₁ (codimension-one means that, since M has dimension d=4, the submanifold Σ₁ has dimension d=4–1=3). Now introduce a spacelike codimension-two approximately-flat patch P₁ which contains the point p (a submanifold with d=2) and construct a local inertial frame (using Riemann normal coordinates) inside P₁.
The choice of approximate flatness implies that the null congruences that are normal to the patch P₁ have expansion θ ≈ 0 and shear tensor σ ≈ 0 inside P₁. This is necessary since, in this analysis, the system is chosen to be in thermodynamic equilibrium.
We now introduce a closed spacelike codimension-two surface B₁ (a submanifold with d=2) such that P₁ ⊂ B₁, choose a future-inward null direction normal to B₁, and consider a null congruence originating at B₁. The spacelike region inside B₁ is denoted by R₁. We choose an affine parameter λ along the congruence, with tangent vector:
The congruence generates a null hypersurface H that emanates from P₁ with k tangent to the generators. At t=t₂, H intersects another spacelike hypersurface Σ₂ at a spacelike codimension-two surface B₂ containing another spacelike patch P₂. Hence, P₁ evolves into P₂. We denote the region inside B₂ by R₂.
Note that H is the local Rindler horizon for the accelerating observers in Fig. 14. Therefore, we can choose an approximate boost Killing vector field ξ to be the generator of H and define it (indirectly) as:
for an affine parameter λ where κ is given by Eq. 44 (κ is the acceleration of the orbit of the associated Killing vector ξ). The surface element for the local Rindler horizon is:
where dA is the codimension-two spacelike cross-sectional area element.
We then define the energy that flows across the horizon and assume it is all heat. The heat is interpreted as the energy flowing into macroscopically unobservable degrees of freedom. The outgoing heat flux is:
By construction, our thermodynamic system corresponds to the degrees of freedom beyond the local Rindler horizon, in the region R₁ (since the Rindler observers are out of causal contact with the region behind the light sheet, they interpret the energy as heat). The (negative) variation of the area of H as δQ crosses it is δA = A₁-A₂, where A₁ and A₂ are the initial and final areas of the codimension-two surfaces P₁ and P₂ respectively. It is given by the following integral:
where θ is the expansion of the generators of the horizon, defined as:
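For orientation, the corresponding expressions in the conventions of Jacobson's original paper are sketched below. These are formulated on the past horizon, with λ vanishing on the patch and negative to its past, so the future-horizon construction used here may differ from them by signs. The symbol 𝒜 for the cross-sectional area of a thin bundle of generators is an assumption of this sketch:

```latex
\begin{align*}
k^{\mu} &= \frac{dx^{\mu}}{d\lambda}, \qquad
\xi^{a} = -\kappa\, \lambda\, k^{a}, \qquad
d\Sigma^{a} = k^{a}\, d\lambda\, dA \\
\delta Q &= \int_{H} T_{ab}\, \xi^{a}\, d\Sigma^{b}
          = -\kappa \int_{H} \lambda\, T_{ab}\, k^{a} k^{b}\, d\lambda\, dA \\
\delta A &= \int_{H} \theta\, d\lambda\, dA, \qquad
\theta = \nabla_{a} k^{a} = \frac{1}{\mathcal{A}} \frac{d\mathcal{A}}{d\lambda}
\end{align*}
```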
Jacobson then makes two important assumptions: the validity of the Clausius relation, the fundamental relation connecting heat, entropy, and temperature
on the local causal horizons, and that the system is holographic, in the sense that the entropy S is proportional to the area A of the horizon:
where the constant α is universal (it should be pointed out that there is some discussion regarding the precise definition of entropy in this context). Thus, associated with the variation δA of a piece of the horizon there is a proportional entropy variation dS = α δA. The temperature is the Unruh temperature in Eq. 42.
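The two assumptions and the temperature can be collected as in the sketch below, in units with c = k = 1 and with the boost Killing vector normalized to surface gravity κ (the normalization is an assumption of this sketch):

```latex
\begin{align*}
\delta Q &= T\, dS
  && \text{(Clausius relation)} \\
S &= \alpha\, A \;\Rightarrow\; dS = \alpha\, \delta A
  && \text{(entropy proportional to horizon area)} \\
T &= \frac{\hbar\, \kappa}{2\pi}
  && \text{(Unruh temperature)}
\end{align*}
```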
The evolution of the congruence of null geodesics generating the horizon is described by the Raychaudhuri equation:
where σ is the shear tensor. The tensor R is the Ricci curvature tensor (the absence of the rotation term occurs because the null geodesic congruence is hypersurface orthogonal).
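The Raychaudhuri equation for a hypersurface-orthogonal null congruence in four dimensions, and its solution close to the patch, can be sketched as follows. The sketch assumes that λ = 0 on the patch, where θ and σ vanish, and keeps only the leading order in λ:

```latex
\begin{align*}
\frac{d\theta}{d\lambda} &= -\frac{1}{2}\, \theta^{2}
   - \sigma_{ab}\, \sigma^{ab} - R_{ab}\, k^{a} k^{b} \\
\theta &\approx -\lambda\, R_{ab}\, k^{a} k^{b}
  && \text{(for small } \lambda \text{)} \\
\delta A &= \int_{H} \theta\, d\lambda\, dA
  \approx -\int_{H} \lambda\, R_{ab}\, k^{a} k^{b}\, d\lambda\, dA
\end{align*}
```

The quadratic terms θ² and σ² are of higher order near the patch, which is why the equilibrium condition θ ≈ 0, σ ≈ 0 matters for the argument.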
Integrating the Raychaudhuri equation, plugging it into the integral δA and using the proportionality between S and A we obtain:
The Clausius relation then implies that:
for every null vector k. Using energy-momentum conservation and some simple tensor algebra, we finally obtain Einstein’s field equations (EFE) at p:
This choice of α is needed for consistency with EFE and the Bekenstein-Hawking entropy. Note that since p is arbitrary, this result indicates that Einstein’s field equations are obeyed throughout the whole spacetime.
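The "simple tensor algebra" of the last step can be sketched as follows, in units with c = 1. The function f and the integration constant Λ (which plays the role of a cosmological constant) are names introduced for this sketch; the argument follows the logic of Jacobson's paper:

```latex
\begin{align*}
T_{ab}\, k^{a} k^{b} &= \frac{\hbar\, \alpha}{2\pi}\, R_{ab}\, k^{a} k^{b}
  \quad \text{for all null } k^{a}
  \;\Rightarrow\;
  \frac{2\pi}{\hbar\, \alpha}\, T_{ab} = R_{ab} + f\, g_{ab} \\
\nabla^{a} T_{ab} = 0, \quad
\nabla^{a} R_{ab} &= \tfrac{1}{2}\, \nabla_{b} R
  \;\Rightarrow\;
  \nabla_{b} f = -\tfrac{1}{2}\, \nabla_{b} R
  \;\Rightarrow\;
  f = -\tfrac{1}{2}\, R + \Lambda \\
R_{ab} - \tfrac{1}{2}\, R\, g_{ab} + \Lambda\, g_{ab}
  &= \frac{2\pi}{\hbar\, \alpha}\, T_{ab} = 8\pi G\, T_{ab},
  \qquad \alpha = \frac{1}{4\, G\, \hbar}
\end{align*}
```

The term f g_ab is allowed in the first line because g_ab k^a k^b = 0 for null vectors. With α = 1/(4Għ), the entropy S = αA agrees with the Bekenstein-Hawking value of one quarter of the horizon area in Planck units.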
Thanks for reading and see you soon! As always, constructive criticism and feedback are welcome!
My Linkedin, personal website www.marcotavora.me, and Github have some other interesting content about physics and other topics such as mathematics, machine learning, deep learning, finance, and much more! Check them out!