Norrington, Tensor Field Theory.pdf


AMA303: Tensor Field Theory
Patrick Norrington
The School of Mathematics and Physics
Queen’s University Belfast
2003


CONTENTS

Contents

1 Introduction and suggested reading . . . . 13
  1.1 What are tensors? . . . . 13
  1.2 Historical aspect . . . . 14
  1.3 Recommended reading . . . . 15

2 Notation . . . . 18
  2.1 Components of a vector . . . . 18
  2.2 Summation convention . . . . 18
  2.3 Dummy index . . . . 18
  2.4 Free index . . . . 19
  2.5 Range convention . . . . 19
  2.6 Coordinate derivatives . . . . 19
  2.7 Summary of notation . . . . 19
  2.8 Greek alphabet . . . . 20
  2.9 Symmetry and skew-symmetry . . . . 20

3 Kronecker delta, permutation symbol and determinants . . . . 21
  3.1 Kronecker delta . . . . 21
  3.2 Permutation symbol . . . . 21
  3.3 Determinants . . . . 22
    3.3.1 Definition . . . . 22
    3.3.2 Properties . . . . 22
    3.3.3 Cofactors . . . . 23
    3.3.4 Results . . . . 24
    3.3.5 Worked examples . . . . 24
  3.4 Generalised Kronecker delta . . . . 25

4 Tensor algebra . . . . 27
  4.1 Vector space . . . . 27
  4.2 Transformation of coordinates . . . . 27
  4.3 Transformation of coordinate differentials . . . . 28
  4.4 Contravariant vectors . . . . 29

May 11, 2004 5:16pm


  4.5 Scalar . . . . 29
  4.6 Transformation of the gradient of a scalar field . . . . 29
  4.7 Covariant vectors . . . . 30
  4.8 Definition of a tensor . . . . 30
  4.9 Kronecker delta . . . . 31
  4.10 Tensor field . . . . 31
  4.11 Linear combination of tensors . . . . 32
  4.12 Outer product . . . . 32
  4.13 Inner product . . . . 32
  4.14 Contraction . . . . 33
  4.15 Quotient Rule . . . . 34
  4.16 Tensor equations . . . . 34
  4.17 Symmetry and skew-symmetry . . . . 35

5 Relative tensors . . . . 36
  5.1 Transformation rule . . . . 36
  5.2 Summary . . . . 36
  5.3 Transformation of determinant . . . . 37
  5.4 Transformation of permutation symbol . . . . 37

6 Riemannian space . . . . 38
  6.1 Line element . . . . 38
  6.2 Local Cartesian coordinates . . . . 38
  6.3 Spherical surface in two dimensions . . . . 39
  6.4 Raising and lowering indices . . . . 39
  6.5 Length and direction of a vector . . . . 41
  6.6 Geodesics . . . . 42
  6.7 Geodesics on the surface of a sphere . . . . 44
  6.8 Christoffel symbols from the geodesic equations . . . . 45


7 Tensor calculus . . . . 46
  7.1 Gradient of a scalar field . . . . 46
  7.2 Covariant derivative of a covariant vector . . . . 46
  7.3 Affinities . . . . 48
  7.4 Rules for parallel displacement . . . . 50
  7.5 Covariant derivative of a contravariant vector . . . . 50
  7.6 Covariant derivative of tensors . . . . 51
  7.7 Covariant derivative of fundamental tensor . . . . 51
  7.8 Product rule for covariant differentiation . . . . 51
  7.9 Curvature tensor . . . . 51
  7.10 Symmetric affinity . . . . 53

8 Metric affinity . . . . 54
  8.1 Ricci theorem . . . . 54
  8.2 Formula for the metric affinity . . . . 55
  8.3 Condition for a flat space . . . . 56

9 Constant vector fields and geodesics . . . . 57
  9.1 Constant vector fields . . . . 57
  9.2 Geodesics . . . . 58

10 Covariant derivative of relative tensors . . . . 60

11 Properties of the curvature tensor . . . . 63
  11.1 Geodesic coordinates . . . . 63
  11.2 Bianchi identities . . . . 64
  11.3 Symmetries of curvature tensor . . . . 65
  11.4 Summary . . . . 67

12 Ricci and Einstein tensors . . . . 68
  12.1 Ricci tensor . . . . 68
  12.2 Divergence and Laplacian . . . . 69
  12.3 Einstein tensor . . . . 70
  12.4 Summary . . . . 72


13 Special spaces . . . . 73
  13.1 Two-dimensional spaces . . . . 73
  13.2 Spherical surface in two-dimensions . . . . 73
  13.3 Spaces with constant curvature . . . . 75
  13.4 Summary . . . . 76

14 Examples of curved two-dimensional surfaces . . . . 77
  14.1 Paraboloid . . . . 77
    14.1.1 Paraboloidal coordinates . . . . 77
    14.1.2 Properties of a parabola . . . . 77
    14.1.3 Coordinate surface - paraboloid . . . . 77
    14.1.4 Line-element . . . . 79
    14.1.5 Determination of $R_{1212}$ and $K$ . . . . 80
    14.1.6 Curvature of paraboloid . . . . 80
  14.2 Ellipsoid . . . . 81
    14.2.1 Prolate spheroidal coordinates . . . . 81
    14.2.2 Properties of an ellipse . . . . 81
    14.2.3 Coordinate surface - ellipsoid . . . . 82
    14.2.4 Line-element . . . . 84
    14.2.5 Determination of $R_{1212}$ and $K$ . . . . 84
    14.2.6 Curvature of ellipsoid . . . . 85
  14.3 Hyperboloid . . . . 86
    14.3.1 Properties of a hyperbola . . . . 86
    14.3.2 Coordinate surface - hyperboloid . . . . 87
    14.3.3 Line-element . . . . 89
    14.3.4 Determination of $R_{1212}$ and $K$ . . . . 89
    14.3.5 Curvature of hyperboloid . . . . 90

15 Cartesian tensors . . . . 91
  15.1 Introduction . . . . 91
  15.2 Orthogonal transformations . . . . 92
    15.2.1 Orthogonality conditions . . . . 92
    15.2.2 Orthogonal matrices . . . . 92


    15.2.3 Orthogonal group . . . . 93
    15.2.4 Rotations in 2 dimensions . . . . 93
    15.2.5 Rotations in 3 dimensions . . . . 94
    15.2.6 Eigenvalues of orthogonal matrices . . . . 94
    15.2.7 Summary of properties of orthogonal matrices . . . . 95
  15.3 Tensors . . . . 96
  15.4 Tensor algebra . . . . 97
  15.5 Tensor densities . . . . 97
  15.6 Isotropic tensors . . . . 97
    15.6.1 Definition . . . . 97
    15.6.2 Isotropic vectors . . . . 98
    15.6.3 Kronecker delta . . . . 98
    15.6.4 Levi-Civita tensor density . . . . 98
    15.6.5 Isotropic tensors in $E_3$ . . . . 99
  15.7 Tensor fields and calculus . . . . 99
  15.8 Vectors in $E_3$ . . . . 100
    15.8.1 Dot product . . . . 100
    15.8.2 Cross product . . . . 100
    15.8.3 Curl . . . . 101
    15.8.4 Summary . . . . 101
    15.8.5 Vector identities . . . . 101
  15.9 Rigid body motion . . . . 102
    15.9.1 Space and body axes . . . . 102
    15.9.2 Euler’s theorem . . . . 103
    15.9.3 Rotating frames of reference . . . . 104
    15.9.4 Coriolis and centrifugal forces . . . . 105
    15.9.5 Angular momentum of rigid body . . . . 105
    15.9.6 Kinetic energy of rigid body . . . . 106


16 Special relativity . . . . 107
  16.1 Introduction . . . . 107
  16.2 Newtonian mechanics . . . . 108
  16.3 Galilean transformation . . . . 108
  16.4 Principle of special relativity . . . . 109
  16.5 Lorentz transformation . . . . 109
  16.6 Lorentz factor . . . . 111
  16.7 Vector form of Lorentz transformation . . . . 111
  16.8 Transformation of time- and space-intervals . . . . 112
  16.9 Velocity transformation . . . . 113
  16.10 Time dilation . . . . 113
  16.11 Causality . . . . 114
  16.12 Lorentz-Fitzgerald contraction . . . . 115
  16.13 Length paradox . . . . 116

17 Minkowski space-time . . . . 117
  17.1 Line-element . . . . 117
  17.2 4-position . . . . 117
  17.3 Metric tensor . . . . 117
  17.4 Coordinate transformations . . . . 118
  17.5 Transformation of tensors . . . . 118
  17.6 Raising and lowering indices . . . . 119
  17.7 Inner product . . . . 120
  17.8 Summary . . . . 120
  17.9 Lorentz transformations . . . . 120
  17.10 Boost . . . . 122
  17.11 Space-rotation . . . . 123
  17.12 Lorentz group . . . . 124
  17.13 Poincare group . . . . 124
  17.14 Boost in hyperbolic form . . . . 124
  17.15 Objective of tensor formulations . . . . 127
  17.16 Minkowski diagram . . . . 127


18 Relativistic mechanics . . . . 130
  18.1 4-velocity . . . . 130
  18.2 Transformation of 4-velocity . . . . 130
  18.3 4-acceleration . . . . 132
  18.4 Transformation of 4-acceleration . . . . 133
  18.5 4-momentum and 4-force . . . . 134
  18.6 Mass-energy relationship . . . . 135
  18.7 Energy-momentum relationship . . . . 137
  18.8 Transformation of 4-momentum . . . . 138
  18.9 Conservation of 4-momentum . . . . 138
  18.10 Photons . . . . 139
  18.11 Potential energy . . . . 139

19 Collision examples . . . . 141
  19.1 Elastic collision of two equal particles . . . . 141
  19.2 Elastic collision of two unequal particles . . . . 146
  19.3 Results . . . . 150

20 Energy-momentum tensor . . . . 155
  20.1 Volume elements . . . . 155
  20.2 4-force density . . . . 155
  20.3 General case . . . . 155
  20.4 External force . . . . 156
  20.5 Incoherent cloud . . . . 157
  20.6 Fluids . . . . 157
  20.7 Perfect fluids . . . . 158

21 Electromagnetism . . . . 159
  21.1 Maxwell equations . . . . 159
  21.2 Continuity equation . . . . 161
  21.3 Lorentz force . . . . 161
  21.4 Scalar and vector potentials . . . . 161
  21.5 Gauge transformations . . . . 162
  21.6 Lorentz gauge and wave-equations . . . . 162
  21.7 Coulomb gauge . . . . 162
  21.8 Exercises . . . . 163


22 Tensor formulation of electromagnetism . . . . 164
  22.1 4-current . . . . 164
  22.2 Continuity equation . . . . 164
  22.3 4-potential . . . . 164
  22.4 Electromagnetic tensor . . . . 165
  22.5 Maxwell equations . . . . 165
  22.6 Dual electromagnetic tensor density . . . . 166
  22.7 Invariants . . . . 167
  22.8 Transformation of fields . . . . 167
  22.9 Fields produced by a moving charge . . . . 169
  22.10 Lorentz force . . . . 176
  22.11 Electromagnetic energy tensor . . . . 176
  22.12 Exercises . . . . 178

23 Wave motion . . . . 179
  23.1 Wave-equation and solution . . . . 179
  23.2 Transformation of 4-propagation . . . . 179
  23.3 Doppler effect . . . . 181

24 Quantum theory . . . . 183
  24.1 de Broglie waves . . . . 183
  24.2 The quantum recipe . . . . 184
  24.3 Schrödinger equation . . . . 184
  24.4 Klein-Gordon equation . . . . 185
  24.5 Dirac equation . . . . 185
  24.6 Charged particles . . . . 186

25 Introduction to General Relativity . . . . 188
  25.1 Absolute space . . . . 188
  25.2 Mach’s Principle . . . . 188
  25.3 Equivalence Principle . . . . 189
  25.4 General Relativity . . . . 189
  25.5 Weak Equivalence Principle . . . . 190
  25.6 Einstein Equivalence Principle . . . . 190
  25.7 Metric theory of gravitation . . . . 190
  25.8 Tests of General Relativity . . . . 191


26 Equations of General Relativity . . . . 193
  26.1 Newtonian gravitation . . . . 193
  26.2 Vacuum field equations of General Relativity . . . . 194
  26.3 Newtonian limit . . . . 196
  26.4 Gravitational red-shift . . . . 198
  26.5 Examples of gravitational red-shift . . . . 199
  26.6 Gravitational red-shift using Einstein Equivalence Principle . . . . 200
  26.7 Energy-momentum tensor . . . . 201
  26.8 Field equations of General Relativity . . . . 202
  26.9 Alternative forms for the field equations . . . . 203
  26.10 Identification of the constant κ . . . . 203

27 Black-holes . . . . 204
  27.1 Schwarzschild solution . . . . 204
  27.2 Time-like geodesics . . . . 205
  27.3 Orbit equation . . . . 207
  27.4 Advance in the perihelion of planets . . . . 208
  27.5 Null geodesics . . . . 210
  27.6 Deflection of a light ray near the sun . . . . 211
  27.7 Black-holes . . . . 212
  27.8 Eddington form . . . . 213
  27.9 Radial motion . . . . 213

28 Cosmology . . . . 214
  28.1 Introduction . . . . 214
  28.2 A little astronomy . . . . 214
  28.3 Copernican principle . . . . 215
  28.4 Red-shift . . . . 216
  28.5 Background microwave radiation . . . . 217
  28.6 Age of Universe . . . . 217
  28.7 Robertson-Walker metric . . . . 218
  28.8 Spatial geometry . . . . 218
  28.9 Scale factor . . . . 219


  28.10 Field equations of General Relativity . . . . 219
  28.11 Energy-momentum tensor . . . . 220
  28.12 Evaluation of the Ricci tensor . . . . 221
  28.13 Conservation of energy-momentum . . . . 221
  28.14 Equation of state . . . . 222
  28.15 Friedmann equation . . . . 223
  28.16 Summary of formulae . . . . 224
  28.17 Model parameters . . . . 224
  28.18 ‘Big bang’ model . . . . 226
  28.19 Matter-dominated model . . . . 227

A 3-dimensional vectors . . . . 229
  A.1 Cartesian vectors . . . . 229
  A.2 Vector products . . . . 229
  A.3 Gradient operator . . . . 230
  A.4 Vector identities involving the gradient . . . . 232
  A.5 Vector theorems . . . . 233
  A.6 Curvilinear coordinates . . . . 234
  A.7 Orthogonal curvilinear coordinates . . . . 235
  A.8 Cylindrical polar coordinates . . . . 235
  A.9 Spherical polar coordinates . . . . 237

B Odds and ends . . . . 239

C Biography of Maxwell . . . . 246

D Biography of Einstein . . . . 251

E Simultaneity . . . . 258

F Electromagnetism using SI units . . . . 262

G Global Positioning System . . . . 267
  G.1 Einstein’s Relativity and Everyday Life by Clifford M. Will . . . . 267
  G.2 General relativity in the global positioning system by Neil Ashby . . . . 268
  G.3 Some Notes . . . . 272


LIST OF FIGURES

List of Figures

1 Parabola $y^2 = 4ax$ with $a = 2$ . . . . 78
2 Family of confocal parabolas labelled by β . . . . 79
3 Ellipse $x^2/a^2 + y^2/b^2 = 1$ . . . . 82
4 Family of confocal ellipses labelled by α . . . . 83
5 K as a function of β for various α . . . . 85
6 Hyperbola $x^2/a^2 - y^2/b^2 = 1$ . . . . 87
7 Family of confocal hyperbolas labelled by β . . . . 89
8 Hyperbolic functions . . . . 125
9 Minkowski diagram . . . . 129
10 Ratio of relativistic and non-relativistic kinetic energy (energy in units of the rest energy). . . . . 137
11 Collision between two equal particles . . . . 142
12 Collision between equal masses, $v/c = 0.8$. Dashed curves are non-relativistic, solid curves are relativistic. Quantities are plotted as a function of α in the range 0 to π. . . . . 151
13 Collision between equal masses, $v/c = 0.99$. Dashed curves are non-relativistic, solid curves are relativistic. Quantities are plotted as a function of α in the range 0 to π. . . . . 152
14 Collision between unequal masses ($m_2/m_1 = 2$), $v/c = 0.8$. Dashed curves are non-relativistic, solid curves are relativistic. Quantities are plotted as a function of α in the range 0 to π. . . . . 153
15 Collision between unequal masses ($m_2/m_1 = 2$), $v/c = 0.99$. Dashed curves are non-relativistic, solid curves are relativistic. Quantities are plotted as a function of α in the range 0 to π. . . . . 154
16 Moving charge in frame S . . . . 170
17 Field lines for a charge at rest . . . . 172
18 Field lines for a moving charge . . . . 173
19 Plot of f against w = ct for b = 1 and a range of v/c-values . . . . 174
20 Plot of g against w = ct for b = 1 and a range of v/c-values . . . . 175
21 Cylindrical polar coordinates . . . . 236
22 Spherical polar coordinates . . . . 237


1 Introduction and suggested reading

1.1 What are tensors?

In dynamics we are familiar with the notion of scalar and vector quantities. Examples of scalars are the mass $m$ of a particle, the speed $v$ and the kinetic energy $\tfrac{1}{2}mv^2$. A scalar has a single numerical value. In contrast, a vector has three numbers associated with it. Examples from dynamics are a particle’s position $\mathbf{r}$, velocity $\mathbf{v}$ and acceleration $\mathbf{f}$. Force $\mathbf{F}$ is also a vector. Newton’s 2nd law is a vector equation that relates the vector force to the vector acceleration:

$$\mathbf{F} = m\mathbf{f} \tag{1}$$

A particle’s position $\mathbf{r}$ is often given as 3 components $(x, y, z)$ referred to a right-handed rectangular set of axes. These are the well-known Cartesian[1] components. Equally well, the position can be expressed in terms of either spherical polar coordinates $(r, \theta, \phi)$ or cylindrical polar coordinates $(\rho, \phi, z)$. These are the most familiar (and useful) coordinates, but there is no limit to the possibilities. We are at liberty to choose the coordinates, and of course one should select those that make the given problem easiest.

The motion does not depend on the choice of reference axes. Thus the equation of motion, eq. (1), is independent of the coordinate system. The vector components in one coordinate system transform in a prescribed way to another coordinate system so as to ensure that this independence is preserved. The key idea is that of invariance with respect to coordinate transformations.

A tensor is a set of quantities that transform in a prescribed manner when a coordinate transformation is made. It is a generalisation of the notion of a vector. A scalar, which is unchanged by a coordinate transformation, is a tensor of rank 0. A vector is a tensor of rank 1. Tensors of higher rank also occur. An example of a tensor of rank 2 from dynamics (rigid body motion) is the inertia tensor, which can be written as a matrix with 9 components. A tensor of rank $r$ in a space of dimension $N$ has $N^r$ components.

Tensors are the appropriate objects for describing many physical phenomena, in fields such as solid and fluid mechanics, elasticity, and special and general relativity. They allow coordinate transformations of any type. A special case is to restrict the coordinates to being Cartesian, i.e. rectangular axes. The possible transformations are then rotations and/or reflections, so that the reference axes remain orthogonal. Cartesian tensors are defined by their transformation properties under orthogonal transformations. While the theory of Cartesian tensors is simpler than that of general tensors, it still has wide application in physics.

In general we consider any type of coordinate transformation. This allows us to treat curvilinear coordinates, such as spherical polar, and to analyse the properties of spaces more general than Euclidean. Tensors can be defined in quite general vector spaces, but our attention will be restricted to Riemannian[2] spaces.

[1] See Appendix B ‘Odds and ends’ note 1.
[2] See Appendix B ‘Odds and ends’ note 2.
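The invariance described in Section 1.1 can be checked numerically for the simplest case, a rotation of Cartesian axes. The following is a minimal sketch, not part of the original notes: the function names, the rotation angle and all numerical values are illustrative assumptions. It applies the rank-1 rule $v'_i = a_{ij} v_j$ and the rank-2 rule $T'_{ij} = a_{ik} a_{jl} T_{kl}$ for an orthogonal matrix $a_{ij}$, then tests whether eq. (1) and the rigid-body relation between angular momentum, inertia tensor and angular velocity keep the same form in the rotated axes.

```python
import math


def rotation_about_z(angle):
    """Orthogonal matrix a_ij for axes rotated about z by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def transform_rank1(a, v):
    """Rank-1 rule: v'_i = a_ij v_j."""
    n = len(v)
    return [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]


def transform_rank2(a, t):
    """Rank-2 rule: T'_ij = a_ik a_jl T_kl."""
    n = len(t)
    return [[sum(a[i][k] * a[j][l] * t[k][l]
                 for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]


def contract(t, v):
    """Contract a rank-2 object with a vector: (T v)_i = T_ij v_j."""
    n = len(v)
    return [sum(t[i][j] * v[j] for j in range(n)) for i in range(n)]


def close(u, w, tol=1e-12):
    """Component-wise comparison that tolerates rounding error."""
    return all(math.isclose(x, y, abs_tol=tol) for x, y in zip(u, w))


def num_components(rank, dim):
    """A tensor of rank r in a space of dimension N has N**r components."""
    return dim ** rank


# Arbitrary illustrative values (assumptions, not taken from the notes).
m = 2.0
f = [1.0, -3.0, 0.5]                 # acceleration components
F = [m * x for x in f]               # eq. (1) in the original axes
a = rotation_about_z(0.7)

# Eq. (1) should hold component by component in the rotated axes too.
f_rot = transform_rank1(a, f)
F_rot = transform_rank1(a, F)
newton_holds = close(F_rot, [m * x for x in f_rot])

# Rank-2 example: a symmetric inertia tensor and an angular velocity.
inertia = [[4.0, -1.0, 0.0], [-1.0, 3.0, 0.5], [0.0, 0.5, 5.0]]
omega = [0.2, 1.0, -0.4]
L_rot = transform_rank1(a, contract(inertia, omega))
L_from_rot = contract(transform_rank2(a, inertia), transform_rank1(a, omega))
angular_momentum_holds = close(L_rot, L_from_rot)

print(newton_holds, angular_momentum_holds, num_components(2, 3))
```

Pure-Python lists are used so that the index sums are written exactly as in the transformation rules. The check covers only orthogonal transformations, i.e. the Cartesian-tensor special case, and says nothing about general curvilinear coordinates.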


1.2 Historical aspect

The following brief historical introduction is taken from ‘Elementary Vector Analysis’ by CE Weatherburn, 1955, Bell & Sons. During the last sixty or seventy years there has appeared a broad generalisation of vector analysis under the name of Tensor Analysis, which sprang from the study of differential geometry of multidimensional space. The history of differential geometry of spaces of more than three dimensions may be said to have begun with a paper by Bernhard Riemann (1826-1866) on the hypotheses which lie at the foundation of geometry, read before the Philosophical Faculty of the University of Göttingen in 1854, but not published till 1868, after Riemann’s death. Riemannian geometry is based on the assumption that the square of the linear element is represented by a quadratic differential form, usually called a Riemannian metric. The corresponding Riemannian space, which may be of any number of dimensions, is a generalisation of Euclidean space of three dimensions. During the twenty years immediately following Riemann’s death several mathematicians contributed to the development of the subject; but the year 1887 is especially memorable for the publication by Gregorio Ricci (1853-1924) of his first short note dealing with the calculus which is now known variously as the Ricci calculus, the Absolute Differential Calculus or the Calculus of Tensors. In this calculus scalars appear as tensors of order zero; covariant and contravariant vectors, which are a generalisation of the vectors of Euclidean 3-space, are tensors of order one. The dyads and dyadics of vector analysis are examples of tensors of order two. The order of a tensor may be any positive integral number.
During the ensuing decade Ricci ‘elaborated the theory, and worked out the elegant and comprehensive notation which enables it to be easily adapted to a wide variety of questions of analysis, geometry and physics’ (quote from the text ‘The Absolute Differential Calculus’ by T Levi-Civita, 1927, Blackie & Son). Not only did this new analysis greatly simplify Riemannian geometry, but it also led to a wider extension of the field of research; and, during the thirty years from 1887, Ricci and the Italian school of mathematicians contributed very largely to this branch of mathematics. For some years Ricci’s labours and his new method attracted the attention of only a small number of mathematicians; but the publication in 1913 and 1916 of Albert Einstein’s first papers in general relativity focused the attention of mathematicians on Ricci’s calculus. Einstein assumed a Riemannian space of four dimensions as the basis of his general theory, and found in the absolute differential calculus the best instrument for formulating his ideas. Since then the tensor calculus has been used extensively by mathematicians and physicists, and has proved itself a useful and powerful instrument of research. The following quote is taken from ‘Space–Time–Matter’ by H Weyl, Dover 1952 reprint of 1922 4th edition.


The study of tensor calculus is, without doubt, attended by conceptual difficulties — over and above the apprehension inspired by indices — which must be overcome. From the formal aspect, however, the method of reckoning used is of extreme simplicity; it is much easier than, e.g., the apparatus of elementary vector calculus. There are two operations, multiplication and contraction; i.e., putting the components of two tensors with totally different indices alongside one another; the identification of an upper index with a lower one, and, finally summation (not expressed) over this index. Various attempts have been made to set up a standard terminology in this branch of mathematics involving only the vectors themselves and not their components, analogous to that of vectors in vector analysis. This is highly expedient in the latter, but very cumbersome for the much more complicated framework of the tensor calculus. In trying to avoid continual reference to the components we are obliged to adopt an endless profusion of names and symbols in addition to an intricate set of rules for carrying out calculations, so that the balance of advantage is considerably on the negative side. An emphatic protest must be entered against these orgies of formalism which are threatening the peace of even the technical scientist. Lord Kelvin, whose statue dominates the entrance to Botanic Gardens, had an interesting view on the value of vectors 3 .

1.3 Recommended reading

This module provides an introduction to tensor calculus and applies it to special and general relativity. The module was originally based on the text ‘An Introduction to Tensor Calculus, Relativity and Cosmology’ by DF Lawden (3rd edition, Wiley, 1982). The most important change is the notation used for special relativity. However this remains an excellent introduction to the subject. Our treatment of tensor calculus is an old-fashioned one that concentrates on the coordinate representation of the tensor quantities. This is a constraint imposed by lack of time and student background. When Einstein published his classic 1916 paper on general relativity, he devoted a large section to outlining tensor calculus. It is a clear and concise exposition. This was necessary because the mathematics was unfamiliar to his audience of physicists. A translation of the paper (Annalen der Physik, 49, 1916) can be found in ‘The Principle of Relativity’ by HA Lorentz, A Einstein, H Minkowski and H Weyl (Dover, 1952). This text is a reprint of the papers that founded the theory of relativity and as such represent one of mankind’s great cultural achievements. A proper test of your understanding of this module is the facility with which you can read these original papers. Einstein’s problem with his audience is reflected in all texts on relativity. Tensor calculus is described in either the introductory chapters or as an appendix. While the notation and 3

See Appendix B ‘Odds and ends’ note 3.


terminology is fairly standard, there are differences in how it is applied to relativity. The authors must establish their conventions and the readers must take proper note of them. The text ‘Tensor Calculus’ by JL Synge and A Schild (University of Toronto, 1949 and Dover, 1978), adopts the coordinate representation with applications in classical mathematical physics but avoiding relativity. The treatment of tensor calculus is particularly well done. ‘Space-Time Structure’ by Erwin Schrödinger (Cambridge University Press, 1950) concentrates on general relativity. This is quite short and very well-written. The text by Lovelock and Rund is excellent for the theory of tensors and how it generalises to manifolds and differential forms. The modern treatment of tensors is based on the differential geometry due to Cartan⁴. Tensors represent geometric entities in the study of manifolds. A global study of the manifolds is essentially coordinate independent. This introduces the notions of tangent and cotangent spaces, forms, Lie derivatives and bundles. This approach can be found in the text ‘Gravitation’ by CW Misner, KS Thorne and JA Wheeler (Freeman, 1973). This is widely regarded as the best book on relativity. It is not for the faint-hearted, extending to almost 1300 pages. But as a reference it is unsurpassed. The style is vivid with plenty of illustrations. Any difficulty with the basic concepts could be resolved by studying this book. A gentler introduction to modern differential geometry is ‘Geometrical Methods of Mathematical Physics’ by BF Schutz (Cambridge University, 1980). A complementary book is ‘A First Course in General Relativity’ by BF Schutz (Cambridge University, 1985). Other texts include ‘An Introduction to General Relativity’ by LP Hughston and KP Tod (Cambridge University, 1991) and ‘Introducing Einstein’s Relativity’ by RA d’Inverno (Clarendon, 1992).
The text ‘General Relativity’ by H Stephani (2nd edition, Cambridge University, 1990) can also be recommended. The recent texts contain a discussion of more up-to-date experimental evidence in support of relativity. Chandrasekhar’s book on black-holes is a classic but hard going. This is an advanced text suitable for postgraduates.

The concepts of relativity are difficult and background reading is important to complement the more terse presentation of the material in the lectures. There is insufficient time to go into lengthy explanations and you can get this in these texts. Certainly you should at least browse some of these books in the library.

4

See Appendix B ‘Odds and ends’ note 4.


Text | QUB library | Availability on www.amazon.co.uk
‘An Introduction to Tensor Calculus, Relativity and Cosmology’ by DF Lawden (3rd edition, Wiley, 1982) | QA433/LAWD | no
‘The Principle of Relativity’ by HA Lorentz, A Einstein, H Minkowski and H Weyl (Dover, 1952) | QC11/LORE | £5
‘Tensor Calculus’ by JL Synge and A Schild (University of Toronto, 1949 and Dover, 1978) | QA433/SYNG | £7.80
‘Space-Time Structure’ by Erwin Schrödinger (Cambridge University Press, 1950) | QC11/SCHR | £16.95
‘Tensors, Differential Forms, and Variational Principles’ by D Lovelock and H Rund (Dover, 1975) | no | £8.35
‘Gravitation’ by CW Misner, KS Thorne and JA Wheeler (Freeman, 1973) | QC178/MISN | £67.99
‘Geometrical Methods of Mathematical Physics’ by BF Schutz (Cambridge University, 1980) | QC20.7.D52/SCHU | £20.95
‘A First Course in General Relativity’ by BF Schutz (Cambridge University, 1985) | QC173.6/SCHU | £21.95
‘General Relativity’ by H Stephani (2nd edition, Cambridge University, 1990) | QC173.55/STEP | £25.95
‘An Introduction to General Relativity’ by LP Hughston and KP Tod (Cambridge University, 1991) | QC173.6/HUGH | £16.95
‘Introducing Einstein’s Relativity’ by RA d’Inverno (Clarendon, 1992) | QC173.55/DINV | £29.95
‘The Mathematical Theory of Black Holes’ by S Chandrasekhar (Clarendon, 1983) | QB843.B55/CHAN | £22.95

Note:
• Availability on AMAZON can be deceptive. It is only after you actually order the book that you learn its true availability. Having said that I have always found it good.
• These prices were checked in January 2003.

2 Notation

2.1 Components of a vector

You should be familiar with the properties of 3-dimensional vectors, see Appendix A. In an $N$-dimensional space a vector will have $N$ components. We will label these by an index. Normally one would use a subscript: vector A has components $(A_1, A_2, \ldots, A_N)$. However in general tensor theory both subscripts and superscripts are used to identify the components. So vector A also has components $(A^1, A^2, \ldots, A^N)$.

$A^i$, $(i = 1, \ldots, N)$ are the up-components, called contravariant.
$A_i$, $(i = 1, \ldots, N)$ are the down-components, called covariant.

In general $A^i \neq A_i$ but there may be a relationship between them. One problem with using superscripts as an index is that they can be confused with powers. You must be careful with this. For example, use brackets,

$$(A^1)^2 = A^1 \times A^1 \tag{2}$$

2.2 Summation convention

Tensor analysis is characterised by lots of indices and summations. To enable the easy handling of these we introduce the summation convention due to Einstein. An equation is an expression set to zero. An expression is the sum of a number of terms. In a term a repeated index, one a subscript and the other a superscript, implies a summation. For example

$$A_{ij} X^j \equiv \sum_{j=1}^{N} A_{ij} X^j \tag{3}$$

where the repeated index is j. The upper limit of the summation is the dimension of the space we are considering i.e. in an N -dimensional space j = 1, 2 · · · N . This is a very compact notation and care should be taken with it. At times it is inconvenient and then it can be suspended. This should be stated explicitly.
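The implied sum in eq.(3) can be written out explicitly in code. The following is a minimal Python sketch; the function name `contract` and the numerical values of `A` and `X` are assumptions made purely for illustration, and Python lists are indexed from 0 rather than 1.

```python
# Explicit form of the summation convention: B_i = A_ij X^j.
# The values below are arbitrary and chosen only for illustration.
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]      # A[i][j] stores A_ij
X = [1, 0, -1]       # X[j] stores X^j

def contract(A, X):
    """Return the list B_i = sum_j A_ij X^j, with the sum over j written out."""
    n = len(X)
    return [sum(A[i][j] * X[j] for j in range(n)) for i in range(n)]

B = contract(A, X)
```

The dummy index j appears only inside the `sum`, so renaming it changes nothing, while the free index i labels the entries of the result.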

2.3 Dummy index

A repeated index is known as a dummy index and in any term it can only occur twice. A dummy index (like an integration variable) can be replaced by another index provided that index does not already occur in the term.

$$A_{ij} X^j = A_{ik} X^k \tag{4}$$

2.4 Free index

Other indices are known as free indices. In tensor equations it is important to ensure that the free covariant and contravariant indices match in the various terms.

$$A^j_i = B^j_{ik} X^k + D^j_i = B^j_{im} X^m + D^j_i \tag{5}$$

Here $k$, $m$ are used as dummy indices, $i$ is a covariant free index and $j$ is a contravariant free index.

2.5 Range convention

When we write

$$B_i = A_{ij} X^j \tag{6}$$

we mean each of the $N$ equations obtained by taking $i = 1, 2, \ldots, N$.

2.6 Coordinate derivatives

Tensor quantities are often functions of the coordinates $x^i$. The coordinates are always written with a superscript. Partial derivatives with respect to $x^i$ are written as $\partial_i$. For example

$$\frac{\partial A_i}{\partial x^j} = \partial_j A_i \tag{7}$$

2.7 Summary of notation

• Tensors use both subscripts and superscripts
• A subscript is called covariant
• A superscript is called contravariant
• A repeated (dummy) index, one a subscript and the other a superscript, implies a summation
• A dummy index can only occur twice in any term
• A dummy index can be replaced by another index provided that index does not already occur in the term
• Free indices match in the terms of an equation


2.8 Greek alphabet

Greek letters are often used in relativity. You should expect to meet these in indices. Just in case . . .

Name      Lower case    Upper case
alpha     α
beta      β
gamma     γ             Γ
delta     δ             Δ
epsilon   ϵ or ε
zeta      ζ
eta       η
theta     θ or ϑ        Θ
iota      ι
kappa     κ
lambda    λ             Λ
mu        µ
nu        ν
xi        ξ             Ξ
o         o
pi        π or ϖ        Π
rho       ρ or ϱ
sigma     σ or ς        Σ
tau       τ
upsilon   υ             Υ
phi       φ or ϕ        Φ
chi       χ
psi       ψ             Ψ
omega     ω             Ω

2.9 Symmetry and skew-symmetry

A quantity is symmetric in two indices of the same type if it is unchanged when the indices are interchanged i.e. $T_{ijk}$ is symmetric in $i$ and $j$ if $T_{ijk} = T_{jik}$. By completely symmetric we mean that the quantity is symmetric in all pairs of indices. Similarly a quantity is skew-symmetric in two indices of the same type if it changes sign when the indices are interchanged i.e. $T_{ijk}$ is skew-symmetric in $i$ and $j$ if $T_{ijk} = -T_{jik}$. If $T_{ij}$ is skew-symmetric then $T_{ij} = 0$ when $i = j$. We can write any $T_{ij}$ as the sum of a symmetric part and a skew-symmetric part

$$T_{ij} = \tfrac{1}{2}(T_{ij} + T_{ji}) + \tfrac{1}{2}(T_{ij} - T_{ji}) \tag{8}$$

Symmetry and skew-symmetry are invariant properties of tensors. Invariant means that the property is preserved when any coordinate transformation is made. It is important to exploit any such properties as they reduce the number of non-zero independent components. For example if $N = 3$ a skew-symmetric quantity $T_{ij}$ has only 3 non-zero independent components, namely $T_{12}$, $T_{13}$ and $T_{23}$.
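The decomposition in eq.(8) is easy to check numerically. The sketch below is illustrative only: the function name `split` and the sample values of `T` are assumptions, not part of the notes.

```python
def split(T):
    """Return (S, K) where S_ij = (T_ij + T_ji)/2 and K_ij = (T_ij - T_ji)/2."""
    n = len(T)
    S = [[0.5 * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]
    K = [[0.5 * (T[i][j] - T[j][i]) for j in range(n)] for i in range(n)]
    return S, K

# Arbitrary sample values for T_ij with N = 3.
T = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
S, K = split(T)
```

For this sample the skew-symmetric part has a zero diagonal and is fixed by the three entries `K[0][1]`, `K[0][2]` and `K[1][2]`, which correspond to $T_{12}$, $T_{13}$ and $T_{23}$ in the 1-based labelling of the text.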

3 Kronecker delta, permutation symbol and determinants

3.1 Kronecker delta

The Kronecker⁵ delta is defined by

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \tag{9}$$

A similar definition holds for $\delta^{ij}$ and $\delta^i_j$.

• $\delta^i_i = N$
• $\delta^i_j \delta^j_k = \delta^i_k$
• $\delta^i_j A^j = A^i$
• $\delta^i_j$ is an isotropic, type (1, 1) tensor

Isotropic means that the components are unchanged by any coordinate transformation.

⁵ See Appendix B ‘Odds and ends’ note 5.

3.2 Permutation symbol

A permutation of $N$ objects is an arrangement or a rearrangement of the $N$ objects. The number of possible permutations is $N!$. Starting from arrangement A then we can obtain arrangement B by the repeated procedure of swapping two objects. If the number of swaps required is an even number then we say that B is an even permutation of A. If the number is odd then we have an odd permutation. For example, if arrangement A is 1234 then 3124 is an even permutation since it can be obtained by two swaps: 1234 → 2134 → 3124. The permutation 4123 is odd since you require three swaps: 1234 → 4231 → 4213 → 4123. The permutation symbol is defined by

$$e_{i_1 \cdots i_N} = \begin{cases} 1 & \text{if } i_1 \cdots i_N \text{ is an even permutation of } 1 \cdots N \\ -1 & \text{if } i_1 \cdots i_N \text{ is an odd permutation of } 1 \cdots N \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

A similar definition holds for $e^{i_1 \cdots i_N}$.

• It has $N$ indices
• $e_{1 \cdots N} = 1$ and $e^{1 \cdots N} = 1$
• It is completely skew-symmetric
• It vanishes if an index is repeated
• When $N = 2$, $e_{12} = -e_{21} = 1$ and $e_{11} = e_{22} = 0$
• When $N = 3$, there are 6 non-zero values $e_{123} = e_{231} = e_{312} = 1$ (i.e. indices are cyclic in 123), and $e_{132} = e_{213} = e_{321} = -1$
• $e^{i_1 \cdots i_{N-2} i_{N-1} i_N} e_{i_1 \cdots i_{N-2} j_{N-1} j_N} = (N-2)! \left[ \delta^{i_{N-1}}_{j_{N-1}} \delta^{i_N}_{j_N} - \delta^{i_{N-1}}_{j_N} \delta^{i_N}_{j_{N-1}} \right]$ i.e. $N - 2$ dummy indices
• $e^{i_1 \cdots i_{N-1} i_N} e_{i_1 \cdots i_{N-1} j_N} = (N-1)! \, \delta^{i_N}_{j_N}$ i.e. $N - 1$ dummy indices
• $e^{i_1 \cdots i_N} e_{i_1 \cdots i_N} = N!$ i.e. $N$ dummy indices
• $e^{i_1 \cdots i_N}$ is an isotropic, type $(N, 0)$ relative tensor of weight 1 (or tensor density)
• $e_{i_1 \cdots i_N}$ is an isotropic, type $(0, N)$ relative tensor of weight $-1$
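The definition in eq.(10) can be turned into a short function. The sketch below is one possible way to do it, not the method of the notes: it decides the parity by counting inversions (pairs of entries that are out of order), which has the same parity as the number of swaps. The function name `perm_symbol` is an assumption.

```python
from itertools import product
from math import factorial

def perm_symbol(indices):
    """Return e_{i1...iN} for indices taken from 1..N: +1, -1 or 0."""
    n = len(indices)
    if sorted(indices) != list(range(1, n + 1)):
        return 0  # a repeated index, so not a permutation of 1..N
    # Count pairs that are out of order; its parity is the parity of the permutation.
    inversions = sum(1 for a in range(n) for b in range(a + 1, n)
                     if indices[a] > indices[b])
    return 1 if inversions % 2 == 0 else -1

# Full contraction for N = 3: the sum of e^{ijk} e_{ijk} over all index values.
N = 3
full_contraction = sum(perm_symbol(idx) ** 2
                       for idx in product(range(1, N + 1), repeat=N))
```

The two examples in the text, 3124 and 4123, can be passed in as `perm_symbol((3, 1, 2, 4))` and `perm_symbol((4, 1, 2, 3))`.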

3.3 Determinants

3.3.1 Definition

If $(a_{ij})$ is an $N \times N$ matrix ($i$ labels the row and $j$ labels the column) then its determinant is given by

$$a = \det(a_{ij}) = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1N} \\ a_{21} & a_{22} & \cdots & a_{2N} \\ \vdots & \vdots & & \vdots \\ a_{N1} & a_{N2} & \cdots & a_{NN} \end{vmatrix} = \sum_{j_1 j_2 \cdots j_N} e^{j_1 j_2 \cdots j_N} a_{1j_1} a_{2j_2} \cdots a_{Nj_N} \tag{11}$$

The summation sign is retained for the present.

3.3.2 Properties

If $A$, $B$ are square matrices and $T$ denotes matrix transpose, then the following are properties of determinants:

• $\det(A^T) = \det(A)$
• $\det(AB) = \det(A) \det(B)$
• $\det(A^{-1}) = 1/\det(A)$
• $\det(A)$ changes sign if two rows (or columns) are interchanged
• $\det(A) = 0$ if two rows (or columns) are the same
• $\det(\lambda A) = \lambda^N \det(A)$ where $A$ is an $N \times N$ matrix
• If a row (column) is multiplied by $\lambda$ then the determinant is multiplied by $\lambda$
• The determinant is unchanged if a multiple of a row (or column) is added to another row (or column)
• We can add/subtract determinants by the decomposition of a row (or column)

3.3.3 Cofactors

Let $k$ be a fixed index with $1 \le k \le N$. Then since each term in this expansion has only one factor from row $k$, we can write the determinant as

$$a = a_{k1} A^{1k} + a_{k2} A^{2k} + \cdots + a_{kN} A^{Nk} = \sum_j a_{kj} A^{jk} \tag{12}$$

where $A^{jk}$ is the cofactor of element $a_{kj}$. The summation convention is not applied here. $A^{jk}$ is the coefficient of $a_{kj}$ in the expansion of the determinant. Since $k$ is held fixed, this formula corresponds to an expansion by row $k$ of the determinant. $A^{jk}$ is $(-1)^{k+j}$ times the $(N-1) \times (N-1)$ determinant obtained from $a$ by deleting row $k$ and column $j$. Analogous formulae are

$$a = \sum_{j_1 j_2 \cdots j_N} e^{j_1 j_2 \cdots j_N} a_{j_1 1} a_{j_2 2} \cdots a_{j_N N} = a_{1k} A^{k1} + a_{2k} A^{k2} + \cdots + a_{Nk} A^{kN} = \sum_j a_{jk} A^{kj} \tag{13}$$

corresponding to an expansion by column $k$ of the determinant. The summation convention is not applied here. Consider

$$\sum_j a_{kj} A^{jh} \tag{14}$$

when $k \neq h$. This is the determinant of a matrix in which two rows are the same. It is therefore zero. Therefore we can write

$$a \, \delta^h_k = a_{kj} A^{jh} = a_{jk} A^{hj} \tag{15}$$

using the summation convention.
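Eq.(15) can be checked numerically for a small matrix. The following Python sketch is an illustration under stated assumptions: the helper names `minor`, `det` and `cofactor` and the sample matrix are hypothetical, the determinant is computed by recursive expansion along the first row, and indices run from 0 rather than 1 (which does not change the sign factor $(-1)^{k+j}$).

```python
def minor(a, row, col):
    """Matrix obtained from a by deleting the given row and column."""
    n = len(a)
    return [[a[r][c] for c in range(n) if c != col]
            for r in range(n) if r != row]

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def cofactor(a, j, k):
    """A^{jk}: the cofactor of element a_{kj} (note the order of the indices)."""
    return (-1) ** (k + j) * det(minor(a, k, j))

# Arbitrary sample matrix with determinant 18.
a = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
n = len(a)
# rows_check[k][h] = sum_j a_kj A^{jh}, which eq.(15) says equals a * delta^h_k.
rows_check = [[sum(a[k][j] * cofactor(a, j, h) for j in range(n))
               for h in range(n)] for k in range(n)]
```

For this sample `rows_check` is 18 times the identity matrix, in agreement with eq.(15).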

3.3.4 Results

• $a = e^{j_1 j_2 \cdots j_N} a_{1j_1} a_{2j_2} \cdots a_{Nj_N} = e^{j_1 j_2 \cdots j_N} a_{j_1 1} a_{j_2 2} \cdots a_{j_N N}$
• $e_{i_1 i_2 \cdots i_N} \, a = e^{j_1 j_2 \cdots j_N} a_{i_1 j_1} a_{i_2 j_2} \cdots a_{i_N j_N} = e^{j_1 j_2 \cdots j_N} a_{j_1 i_1} a_{j_2 i_2} \cdots a_{j_N i_N}$
• $a \, \delta^h_k = a_{kj} A^{jh} = a_{jk} A^{hj}$
• $A^{jk} = \dfrac{\partial a}{\partial a_{kj}}$
• If the elements depend on coordinates $x^i$, then $\partial_i a = A^{jk} \, \partial_i a_{kj}$.
• If $a \neq 0$, then the inverse of $(a_{ij})$ is $(a^{ij})$ with $a^{ij} = A^{ij}/a$.
• If $N = 2$, then

$$(a_{ij}) = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \longrightarrow (a^{ij}) = \frac{1}{a} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \tag{16}$$

where $a = a_{11} a_{22} - a_{12} a_{21}$.
• If $(a_{ij})$ is symmetric, then the inverse $(a^{ij})$ is symmetric.
• If $a_{ij} = B_i B_j$, then $a = 0$.
• If $(a_{ij})$ is skew-symmetric and $N$ is odd, then $a = 0$.
• $\det(A^{ij}) = a^{N-1}$
• If $a_{ij}$ is a type (0, 2) tensor, then $a^{ij}$ is a type (2, 0) tensor.
• If $a_{ij}$ is a type (0, 2) tensor, then $a$ is a relative scalar of weight 2.

3.3.5 Worked examples

We will choose $N = 3$ but the results can be easily generalised to $N > 3$.

• To show that a determinant changes sign when two rows are interchanged

$$\begin{aligned} \det(A) &= e^{ijk} a_{1i} a_{2j} a_{3k} \\ &= -e^{jik} a_{1i} a_{2j} a_{3k} \\ &= -e^{jik} a_{2j} a_{1i} a_{3k} \\ &= -\det(A') \end{aligned} \tag{17}$$

where $A'$ is obtained from $A$ by interchanging rows 1 and 2.

• To show that $\det(AB) = \det(A) \det(B)$, let $C = AB$

$$\begin{aligned} \det(C) &= e^{ijk} c_{1i} c_{2j} c_{3k} \\ &= e^{ijk} (a_{1l} \delta^{lm} b_{mi})(a_{2n} \delta^{no} b_{oj})(a_{3p} \delta^{pq} b_{qk}) \\ &= (e^{ijk} b_{mi} b_{oj} b_{qk}) \, \delta^{lm} \delta^{no} \delta^{pq} a_{1l} a_{2n} a_{3p} \\ &= e_{moq} \det(B) \, \delta^{lm} \delta^{no} \delta^{pq} a_{1l} a_{2n} a_{3p} \\ &= e^{lnp} \det(B) \, a_{1l} a_{2n} a_{3p} \\ &= \det(A) \det(B) \end{aligned} \tag{18}$$

• To show that $\det(A^T) = \det(A)$

$$\begin{aligned} \det(A^T) &= e^{ijk} (A^T)_{1i} (A^T)_{2j} (A^T)_{3k} \\ &= e^{ijk} a_{i1} a_{j2} a_{k3} \\ &= \det(A) \end{aligned} \tag{19}$$
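The three results above can also be spot-checked numerically using the permutation-symbol form of the determinant. In the Python sketch below the sum runs over permutations only, since the permutation symbol vanishes whenever an index is repeated. The helper names and the sample matrices are assumptions chosen for illustration.

```python
from itertools import permutations

def sign(p):
    """+1 for an even permutation, -1 for an odd one (parity of the inversion count)."""
    inversions = sum(1 for a in range(len(p)) for b in range(a + 1, len(p))
                     if p[a] > p[b])
    return -1 if inversions % 2 else 1

def det(a):
    """a = sum over permutations (j1..jN) of e^{j1...jN} a_{1 j1} ... a_{N jN}."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for row in range(n):
            term *= a[row][p[row]]
        total += term
    return total

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Arbitrary sample matrices.
A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
B = [[1, 2, 0], [0, 1, 1], [3, 0, 1]]
A_swapped = [A[1], A[0], A[2]]   # rows 1 and 2 interchanged
```

With these values `det(A)` and `det(B)` can be compared against `det(matmul(A, B))`, `det(transpose(A))` and `det(A_swapped)`.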

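3.4 Generalised Kronecker delta

The generalised Kronecker delta is defined by

$$\delta^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_r} = \begin{vmatrix} \delta^{i_1}_{j_1} & \delta^{i_1}_{j_2} & \cdots & \delta^{i_1}_{j_r} \\ \delta^{i_2}_{j_1} & \delta^{i_2}_{j_2} & \cdots & \delta^{i_2}_{j_r} \\ \vdots & \vdots & & \vdots \\ \delta^{i_r}_{j_1} & \delta^{i_r}_{j_2} & \cdots & \delta^{i_r}_{j_r} \end{vmatrix} \tag{20}$$

• When $r = 1$ we have the Kronecker delta.
• When $r = 2$ we have

$$\delta^{i_1 i_2}_{j_1 j_2} = \delta^{i_1}_{j_1} \delta^{i_2}_{j_2} - \delta^{i_1}_{j_2} \delta^{i_2}_{j_1} \tag{21}$$

• When $r = 3$ we have

$$\begin{aligned} \delta^{i_1 i_2 i_3}_{j_1 j_2 j_3} &= \delta^{i_1}_{j_1} \delta^{i_2 i_3}_{j_2 j_3} - \delta^{i_1}_{j_2} \delta^{i_2 i_3}_{j_1 j_3} + \delta^{i_1}_{j_3} \delta^{i_2 i_3}_{j_1 j_2} \\ &= \delta^{i_1}_{j_1} \delta^{i_2}_{j_2} \delta^{i_3}_{j_3} - \delta^{i_1}_{j_1} \delta^{i_2}_{j_3} \delta^{i_3}_{j_2} + \delta^{i_1}_{j_3} \delta^{i_2}_{j_1} \delta^{i_3}_{j_2} - \delta^{i_1}_{j_2} \delta^{i_2}_{j_1} \delta^{i_3}_{j_3} + \delta^{i_1}_{j_2} \delta^{i_2}_{j_3} \delta^{i_3}_{j_1} - \delta^{i_1}_{j_3} \delta^{i_2}_{j_2} \delta^{i_3}_{j_1} \end{aligned} \tag{22}$$

• In general $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}$ is the sum of $r!$ terms, each of which is the product of $r$ Kronecker deltas.
• $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}$ is skew-symmetric under interchange of any two subscripts (or superscripts).
• $\delta^{12 \cdots r}_{12 \cdots r} = 1$
• If any two subscripts (or superscripts) are the same, then $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r} = 0$.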


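• If $r > N$, then $\delta^{i_1 i_2 \cdots i_r}_{j_1 j_2 \cdots j_r} = 0$. We assume below that $r \le N$.
• $\delta^{i_1 \cdots i_{r-1} i_r}_{j_1 \cdots j_{r-1} i_r} = (N - r + 1) \, \delta^{i_1 \cdots i_{r-1}}_{j_1 \cdots j_{r-1}}$
• $\delta^{i_1 \cdots i_s i_{s+1} \cdots i_r}_{j_1 \cdots j_s i_{s+1} \cdots i_r} = \dfrac{(N-s)!}{(N-r)!} \, \delta^{i_1 \cdots i_s}_{j_1 \cdots j_s}$
• $\delta^{i_1 \cdots i_t i_{t+1} \cdots i_r}_{i_1 \cdots i_t j_{t+1} \cdots j_r} = \dfrac{(N-r+t)!}{(N-r)!} \, \delta^{i_{t+1} \cdots i_r}_{j_{t+1} \cdots j_r}$
• Taking $t = r$ gives $\delta^{i_1 \cdots i_r}_{i_1 \cdots i_r} = \dfrac{N!}{(N-r)!}$
• $e_{i_1 \cdots i_N} = \delta^{1 \cdots N}_{i_1 \cdots i_N}$ and $e^{i_1 \cdots i_N} = \delta^{i_1 \cdots i_N}_{1 \cdots N}$
• $e^{i_1 \cdots i_N} e_{j_1 \cdots j_N} = \delta^{i_1 \cdots i_N}_{j_1 \cdots j_N}$
• $e^{i_1 \cdots i_t i_{t+1} \cdots i_N} e_{i_1 \cdots i_t j_{t+1} \cdots j_N} = t! \, \delta^{i_{t+1} \cdots i_N}_{j_{t+1} \cdots j_N}$
• $\det(a^i_j) = \dfrac{1}{N!} \, \delta^{i_1 \cdots i_N}_{j_1 \cdots j_N} \, a^{j_1}_{i_1} \cdots a^{j_N}_{i_N}$
• If $A^{i_1 \cdots i_r}$ is completely skew-symmetric, then $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r} A^{j_1 \cdots j_r} = r! \, A^{i_1 \cdots i_r}$.
• If $B_{i_1 \cdots i_r}$ is completely skew-symmetric, then $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r} B_{i_1 \cdots i_r} = r! \, B_{j_1 \cdots j_r}$.
• $\delta^{i_1 \cdots i_r}_{j_1 \cdots j_r}$ is an isotropic, type $(r, r)$ tensor.
• If $T_{i_1 \cdots i_r}$ is a type $(0, r)$ tensor field, then $\delta^{i_1 \cdots i_{r+1}}_{j_1 \cdots j_{r+1}} \, \partial_{i_{r+1}} T_{i_1 \cdots i_r}$ is a type $(0, r+1)$ tensor field. It is skew-symmetric.
• When $r = 1$

$$\begin{aligned} S_{j_1 j_2} &= \delta^{i_1 i_2}_{j_1 j_2} \, \partial_{i_2} T_{i_1} \\ &= (\delta^{i_1}_{j_1} \delta^{i_2}_{j_2} - \delta^{i_1}_{j_2} \delta^{i_2}_{j_1}) \, \partial_{i_2} T_{i_1} \\ &= \partial_{j_2} T_{j_1} - \partial_{j_1} T_{j_2} \end{aligned} \tag{23}$$

• When $r = 2$

$$\begin{aligned} S_{j_1 j_2 j_3} &= \delta^{i_1 i_2 i_3}_{j_1 j_2 j_3} \, \partial_{i_3} T_{i_1 i_2} \\ &= \left[ \delta^{i_1}_{j_1} \delta^{i_2}_{j_2} \delta^{i_3}_{j_3} - \delta^{i_1}_{j_1} \delta^{i_2}_{j_3} \delta^{i_3}_{j_2} + \delta^{i_1}_{j_3} \delta^{i_2}_{j_1} \delta^{i_3}_{j_2} - \delta^{i_1}_{j_2} \delta^{i_2}_{j_1} \delta^{i_3}_{j_3} + \delta^{i_1}_{j_2} \delta^{i_2}_{j_3} \delta^{i_3}_{j_1} - \delta^{i_1}_{j_3} \delta^{i_2}_{j_2} \delta^{i_3}_{j_1} \right] \partial_{i_3} T_{i_1 i_2} \\ &= \partial_{j_3} T_{j_1 j_2} - \partial_{j_2} T_{j_1 j_3} + \partial_{j_1} T_{j_2 j_3} - \partial_{j_3} T_{j_2 j_1} + \partial_{j_2} T_{j_3 j_1} - \partial_{j_1} T_{j_3 j_2} \\ &= \partial_{j_3} (T_{j_1 j_2} - T_{j_2 j_1}) + \partial_{j_1} (T_{j_2 j_3} - T_{j_3 j_2}) + \partial_{j_2} (T_{j_3 j_1} - T_{j_1 j_3}) \end{aligned} \tag{24}$$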
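The determinant definition in eq.(20) translates directly into code, which gives a quick way to test some of the listed properties for small $N$ and $r$. The sketch below is illustrative; the names `kdelta` and `gen_delta` are assumptions, and the determinant is evaluated as a signed sum over permutations of the subscripts.

```python
from itertools import permutations, product
from math import factorial

def kdelta(i, j):
    """Ordinary Kronecker delta."""
    return 1 if i == j else 0

def gen_delta(upper, lower):
    """delta^{i1...ir}_{j1...jr}: determinant of the r x r matrix of delta^{i_a}_{j_b}."""
    r = len(upper)
    total = 0
    for p in permutations(range(r)):
        inversions = sum(1 for a in range(r) for b in range(a + 1, r) if p[a] > p[b])
        term = -1 if inversions % 2 else 1
        for a in range(r):
            term *= kdelta(upper[a], lower[p[a]])
        total += term
    return total

# Full contraction delta^{i1 i2}_{i1 i2} for N = 3, r = 2, summed over all index values.
N, r = 3, 2
full_contraction = sum(gen_delta(idx, idx)
                       for idx in product(range(1, N + 1), repeat=r))
```

For $N = 3$ and $r = 2$ the full contraction should equal $N!/(N-r)! = 6$.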

4 Tensor algebra

4.1 Vector space

A point is $N$ ordered real numbers (an $N$-tuple) written with the $N$ coordinates as $(x^1, x^2, \ldots, x^N)$ or simply $x^i$. The coordinates are independent so that

$$\frac{\partial x^i}{\partial x^j} = \delta^i_j \tag{25}$$

The totality of all points, in specified ranges, constitutes a space of $N$ dimensions, denoted $V_N$. A curve is the totality of points given by:

$$x^i = f^i(u) \tag{26}$$

where $u$ is a parameter and $f^i$ are $N$ functions. A subspace $V_M$ is given by

$$x^i = f^i(u^1, u^2, \ldots, u^M) \tag{27}$$

where $u^j$ are $M$ parameters and $M < N$. $V_{N-1}$ is a surface. A surface divides the adjacent space into two parts.

4.2 Transformation of coordinates

Consider a general transformation of coordinates

$$\bar{x}^i = f^i(x^1, x^2, \ldots, x^N) \tag{28}$$

The functions $f^i$ are single-valued, continuous and differentiable. It is more convenient to replace $f^i$ by $\bar{x}^i$ so that

$$\bar{x}^i = \bar{x}^i(x^1, x^2, \ldots, x^N) \tag{29}$$

For the transformation to be useful it is important that the inverse exists

$$x^i = x^i(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N) \tag{30}$$

A point in the space can be uniquely represented either by $(x^1, x^2, \ldots, x^N)$ or $(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N)$. The condition for this is that the Jacobian⁶ is non-zero. If we define

$$J^i_j = \frac{\partial \bar{x}^i}{\partial x^j} \tag{31}$$

then the Jacobian is

$$J = \det(J^i_j) = \begin{vmatrix} \dfrac{\partial \bar{x}^1}{\partial x^1} & \cdots & \dfrac{\partial \bar{x}^1}{\partial x^N} \\ \vdots & & \vdots \\ \dfrac{\partial \bar{x}^N}{\partial x^1} & \cdots & \dfrac{\partial \bar{x}^N}{\partial x^N} \end{vmatrix} \tag{32}$$

The Jacobian for the inverse transformation is

$$\bar{J} = \det(\bar{J}^i_j) \tag{33}$$

where

$$\bar{J}^i_j = \frac{\partial x^i}{\partial \bar{x}^j} \tag{34}$$

Consider

$$\begin{aligned} J^i_a \bar{J}^a_j &= \frac{\partial \bar{x}^i}{\partial x^a} \frac{\partial x^a}{\partial \bar{x}^j} \\ &= \frac{\partial \bar{x}^i}{\partial \bar{x}^j} \quad \text{by the chain rule} \\ &= \delta^i_j \end{aligned} \tag{35}$$

Therefore the matrix $(J^i_j)$ is the inverse of the matrix $(\bar{J}^i_j)$ and taking determinants $J \bar{J} = 1$. We consider only transformations for which $J \neq 0$ and $J \neq \infty$. For a given transformation the Jacobian will be a function of the coordinates. Because of the continuity (smoothness) of the coordinates and the condition that the Jacobian is non-zero, it follows that the Jacobian will either be always positive or always negative.

⁶ See Appendix B ‘Odds and ends’ note 6.
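As a concrete illustration of eq.(35), take $N = 2$ with Cartesian coordinates $(x, y)$ as the $x^i$ and plane polar coordinates $(r, \theta)$ as the $\bar{x}^i$. This choice of example, the function names and the sample point are assumptions made for the sketch; the partial derivatives are written out by hand.

```python
import math

def jacobian_to_polar(x, y):
    """J[i][j] = d(xbar^i)/d(x^j) with xbar = (r, theta) and x = (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def jacobian_to_cartesian(r, theta):
    """Jbar[i][j] = d(x^i)/d(xbar^j) for x = r cos(theta), y = r sin(theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Sample point, away from the origin where the polar coordinates break down.
x, y = 3.0, 4.0
r, theta = math.hypot(x, y), math.atan2(y, x)
J = jacobian_to_polar(x, y)
Jbar = jacobian_to_cartesian(r, theta)
product = matmul2(J, Jbar)   # J^i_a Jbar^a_j, expected to be close to delta^i_j
```

Here $J = 1/r$ and $\bar{J} = r$, so the product of the two determinants is 1 up to rounding error. At the origin $r = 0$ and the transformation is excluded by the condition $J \neq \infty$.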

4.3 Transformation of coordinate differentials

The coordinate differentials transform as

$$d\bar{x}^i = \frac{\partial \bar{x}^i}{\partial x^a} \, dx^a \tag{36}$$

The coefficients are the elements $J^i_a$ of the Jacobian matrix. The transformation is affine i.e. it is linear and homogeneous. Also it is transitive. By this we mean that the transformation from $x^i$ to $y^i$ followed by the transformation from $y^i$ to $z^i$ gives the same result as the transformation from $x^i$ to $z^i$. Given

$$dy^i = \frac{\partial y^i}{\partial x^a} \, dx^a \quad \text{and} \quad dz^i = \frac{\partial z^i}{\partial y^a} \, dy^a \tag{37}$$

then substitute for $dy^a$

$$\begin{aligned} dz^i &= \frac{\partial z^i}{\partial y^a} \, dy^a \\ &= \frac{\partial z^i}{\partial y^a} \frac{\partial y^a}{\partial x^b} \, dx^b \\ &= \frac{\partial z^i}{\partial x^b} \, dx^b \end{aligned} \tag{38}$$

giving the transformation from $x^i$ to $z^i$.

4.4 Contravariant vectors

A contravariant vector is a set of $N$ quantities that transforms like the coordinate differentials. If

$$\bar{T}^i = \frac{\partial \bar{x}^i}{\partial x^a} \, T^a \tag{39}$$

then $T^i$ is a contravariant vector. We use a superscript for the index. The coefficients $\dfrac{\partial \bar{x}^i}{\partial x^j}$ in general depend on the coordinates $x^i$.

The inverse transformation is

$$T^i = \frac{\partial x^i}{\partial \bar{x}^a} \, \bar{T}^a \tag{40}$$

While the coordinate differentials form a contravariant vector, the coordinates $x^i$ do not in general. The tangent vector $\dfrac{dx^i}{du}$ to a curve with parameter $u$ is contravariant.

4.5 Scalar

A scalar quantity $T$ remains invariant under a coordinate transformation.

$$\bar{T}(\bar{x}^1, \bar{x}^2, \ldots, \bar{x}^N) = T(x^1, x^2, \ldots, x^N) \tag{41}$$

The functional forms of $\bar{T}$ and $T$ are different. It is the numerical value at a given point in the space that is unchanged. A function of position is called a field.

4.6 Transformation of the gradient of a scalar field

The gradient of a scalar field is

$$\frac{\partial T}{\partial x^i} \tag{42}$$

This transforms as follows (again using the chain rule)

$$\frac{\partial \bar{T}}{\partial \bar{x}^i} = \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial T}{\partial x^a} \tag{43}$$

4.7 Covariant vectors

A covariant vector is a set of $N$ quantities that transforms like the gradient of a scalar field. If

$$\bar{T}_i = \frac{\partial x^a}{\partial \bar{x}^i} \, T_a \tag{44}$$

then $T_i$ is a covariant vector. We use a subscript for the index. The inverse transformation is

$$T_i = \frac{\partial \bar{x}^a}{\partial x^i} \, \bar{T}_a \tag{45}$$

4.8 Definition of a tensor

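A tensor of type $(p, q)$ in a space of dimension $N$ has rank $t = p + q$ and $N^t$ components. It is written as

$$T^{i_1 i_2 \cdots i_p}_{j_1 j_2 \cdots j_q} \tag{46}$$

The components are labelled by $t$ indices. There are $p$ contravariant and $q$ covariant indices. The general transformation law is

$$\bar{T}^{i_1 i_2 \cdots i_p}_{j_1 j_2 \cdots j_q} = \frac{\partial \bar{x}^{i_1}}{\partial x^{a_1}} \frac{\partial \bar{x}^{i_2}}{\partial x^{a_2}} \cdots \frac{\partial \bar{x}^{i_p}}{\partial x^{a_p}} \, \frac{\partial x^{b_1}}{\partial \bar{x}^{j_1}} \frac{\partial x^{b_2}}{\partial \bar{x}^{j_2}} \cdots \frac{\partial x^{b_q}}{\partial \bar{x}^{j_q}} \, T^{a_1 a_2 \cdots a_p}_{b_1 b_2 \cdots b_q} \tag{47}$$

The position of the index indicates how the tensor transforms with respect to that index. A subscript implies a covariant index and the transformation is similar to a covariant vector. A superscript implies a contravariant index and the transformation is similar to a contravariant vector. Tensors may be covariant in all indices, contravariant in all indices or mixed.

• If $T_{ab}$ is a covariant (0, 2) tensor of rank 2 then it transforms as

$$\bar{T}_{ij} = \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \, T_{ab} \tag{48}$$

• If $T_{abc}$ is a covariant (0, 3) tensor of rank 3 then it transforms as

$$\bar{T}_{ijk} = \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \frac{\partial x^c}{\partial \bar{x}^k} \, T_{abc} \tag{49}$$

• If $T^{ab}$ is a contravariant (2, 0) tensor of rank 2 then it transforms as

$$\bar{T}^{ij} = \frac{\partial \bar{x}^i}{\partial x^a} \frac{\partial \bar{x}^j}{\partial x^b} \, T^{ab} \tag{50}$$

• If $T^{abc}$ is a contravariant (3, 0) tensor of rank 3 then it transforms as

$$\bar{T}^{ijk} = \frac{\partial \bar{x}^i}{\partial x^a} \frac{\partial \bar{x}^j}{\partial x^b} \frac{\partial \bar{x}^k}{\partial x^c} \, T^{abc} \tag{51}$$

• If $T^c_{ab}$ is a mixed (1, 2) tensor of rank 3 then it transforms as

$$\bar{T}^k_{ij} = \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \frac{\partial \bar{x}^k}{\partial x^c} \, T^c_{ab} \tag{52}$$

A zero rank tensor, known as a scalar, is invariant $\bar{T} = T$. A vector is a first rank tensor and may be covariant (0, 1) or contravariant (1, 0).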
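The transformation law eq.(48) can be tried out on a simple case. The sketch below assumes, purely for illustration, a covariant (0, 2) tensor whose components in Cartesian coordinates $(x, y)$ are $T_{ab} = 1$ for $a = b$ and 0 otherwise, and transforms it to plane polar coordinates $(r, \theta)$. The function name and sample point are hypothetical.

```python
import math

def transform_covariant_rank2(Jbar, T):
    """Tbar_ij = (dx^a/dxbar^i)(dx^b/dxbar^j) T_ab, where Jbar[a][i] = dx^a/dxbar^i."""
    n = len(T)
    return [[sum(Jbar[a][i] * Jbar[b][j] * T[a][b]
                 for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]

# Sample point in polar coordinates.
r, theta = 2.0, 0.7
# dx^a/dxbar^i for x = r cos(theta), y = r sin(theta).
Jbar = [[math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)]]
T = [[1.0, 0.0],
     [0.0, 1.0]]            # assumed Cartesian components T_ab
T_bar = transform_covariant_rank2(Jbar, T)
```

The transformed components come out as $\bar{T}_{11} = 1$, $\bar{T}_{22} = r^2$ and zero off the diagonal, so the components of a covariant tensor of rank 2 generally change under the transformation even when they are very simple in one system.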

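4.9 Kronecker delta

An isotropic tensor is one whose components are the same in all coordinate systems. The Kronecker delta $\delta^j_i$ is a mixed (1, 1), isotropic tensor of rank 2. It is known as the fundamental mixed tensor. To show this suppose $T^j_i$ is a mixed tensor of rank 2 and $T^j_i = \delta^j_i$ in coordinate system $x^i$. Then in any other coordinate system

$$\begin{aligned} \bar{T}^j_i &= \frac{\partial \bar{x}^j}{\partial x^b} \frac{\partial x^a}{\partial \bar{x}^i} \, T^b_a \\ &= \frac{\partial \bar{x}^j}{\partial x^b} \frac{\partial x^a}{\partial \bar{x}^i} \, \delta^b_a \\ &= \frac{\partial \bar{x}^j}{\partial x^a} \frac{\partial x^a}{\partial \bar{x}^i} \\ &= \frac{\partial \bar{x}^j}{\partial \bar{x}^i} \\ &= \delta^j_i \end{aligned} \tag{53}$$

We can still use $\delta_{ij}$ as the Kronecker delta but its definition only holds in the particular coordinate system being considered. Once we transform to another coordinate system then the components change i.e. it is not isotropic.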

4.10 Tensor field

A tensor is a set of quantities defined at a point in the space. As you vary the point, the components of the tensor vary. This gives rise to a tensor field. When we combine tensors, through algebra, all the tensor quantities refer to the same point in the space.

4.11 Linear combination of tensors

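The linear combination of tensors of the same rank and type is a tensor of that rank and type. Rank is the number of indices, type is how the indices are divided between contravariant and covariant. This follows from the affine nature of the transformation. If $R_{ij}$ and $S_{ij}$ are covariant tensors of rank 2 then $T_{ij} = \alpha R_{ij} + \beta S_{ij}$, where $\alpha$ and $\beta$ are real numbers, is a covariant tensor of rank 2. To see this consider how $T_{ij}$ transforms

$$\begin{aligned} \bar{T}_{ij} &= \alpha \bar{R}_{ij} + \beta \bar{S}_{ij} \\ &= \alpha \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} R_{ab} + \beta \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} S_{ab} \\ &= \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \, (\alpha R_{ab} + \beta S_{ab}) \\ &= \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \, T_{ab} \end{aligned} \tag{54}$$

which is the correct transformation rule.

4.12 Outer product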

If $T$ is a tensor of rank $t$ and $S$ is a tensor of rank $s$ then the outer product $TS$ is a tensor of rank $t + s$. The outer product of two covariant vectors $A_i$ and $B_i$ is $T_{ij} = A_i B_j$, a covariant tensor of rank 2. You can see this from the transformation rule.

$$\begin{aligned} \bar{T}_{ij} &= \bar{A}_i \bar{B}_j \\ &= \frac{\partial x^a}{\partial \bar{x}^i} A_a \, \frac{\partial x^b}{\partial \bar{x}^j} B_b \\ &= \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \, (A_a B_b) \\ &= \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \, T_{ab} \end{aligned} \tag{55}$$

which is the correct transformation rule. Similarly $T^j_i = A_i B^j$ is a mixed tensor of rank 2 and $T^{ij} = A^i B^j$ is a contravariant tensor of rank 2. The outer product allows the construction of tensors of higher rank from vectors. However, in general, given a tensor of higher rank it is not possible to write it as the outer product of vectors.

4.13 Inner product

If $T$ is a tensor of rank $t$ and $S$ is a tensor of rank $s$ then the inner product $T \cdot S$ is obtained by first taking the outer product and then setting a contravariant index of either $S$ or $T$ equal to a covariant index of the other tensor. There is then a summation over that index. This gives a tensor of rank $t + s - 2$.

The inner product of two vectors $A_i$ and $B^i$ is a scalar i.e. $T = A_i B^i$. You can see this from the transformation rule.

$$\begin{aligned} \bar{T} &= \bar{A}_i \bar{B}^i \\ &= \frac{\partial x^a}{\partial \bar{x}^i} A_a \, \frac{\partial \bar{x}^i}{\partial x^b} B^b \\ &= \frac{\partial x^a}{\partial x^b} \, A_a B^b \\ &= \delta^a_b \, A_a B^b \\ &= A_a B^a \\ &= T \end{aligned} \tag{56}$$

which is the correct transformation rule. Notice that the contraction removes two coefficients by the chain rule. If $A_{ij}$ is a second rank covariant tensor and $B^i$ is a contravariant vector then $C_i = A_{ij} B^j$, the inner product, is a covariant vector.
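The invariance shown in eq.(56) can be checked numerically. The sketch below again uses the change from Cartesian $(x, y)$ to plane polar $(r, \theta)$ coordinates as an assumed example; the function names, the sample point and the component values are hypothetical.

```python
import math

def polar_jacobians(x, y):
    """Return (J, Jbar) with J[i][a] = dxbar^i/dx^a and Jbar[a][i] = dx^a/dxbar^i."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    theta = math.atan2(y, x)
    J = [[x / r, y / r],
         [-y / r2, x / r2]]
    Jbar = [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]
    return J, Jbar

def transform_contravariant(J, B):
    """Bbar^i = (dxbar^i/dx^a) B^a."""
    return [sum(J[i][a] * B[a] for a in range(2)) for i in range(2)]

def transform_covariant(Jbar, A):
    """Abar_i = (dx^a/dxbar^i) A_a."""
    return [sum(Jbar[a][i] * A[a] for a in range(2)) for i in range(2)]

def inner(A, B):
    """A_i B^i with the sum written out."""
    return sum(A[i] * B[i] for i in range(2))

J, Jbar = polar_jacobians(3.0, 4.0)
A = [1.0, 2.0]     # covariant components A_a (arbitrary values)
B = [3.0, -1.0]    # contravariant components B^a (arbitrary values)
A_bar = transform_covariant(Jbar, A)
B_bar = transform_contravariant(J, B)
```

The individual components of `A_bar` and `B_bar` differ from those of `A` and `B`, but `inner(A_bar, B_bar)` agrees with `inner(A, B)` up to rounding error.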

4.14

Contraction

If T is a mixed tensor of rank t > 1 then the process of equating a covariant index and a contravariant index (summing over them) is called contraction. This gives a tensor of rank t − 2.

If $T^i_j$ is a mixed tensor of rank 2 then $T^i_i$, the trace, is a scalar. You can see this from the transformation rule.

$$
\begin{aligned}
\bar{T}^i_i &= \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial \bar{x}^i}{\partial x^b} T^b_a \\
&= \frac{\partial x^a}{\partial x^b} T^b_a \\
&= \delta^a_b T^b_a \\
&= T^a_a
\end{aligned}
\tag{57}
$$

which is the correct transformation rule. Contraction of the fundamental mixed tensor gives the dimension N of the space.

$$
\delta^i_i = \delta^1_1 + \cdots + \delta^N_N = N \tag{58}
$$

Starting from $T^{ijk}_l$ we can form 3 different contravariant tensors of rank 2:

$$
T^{ljk}_l \quad \text{and} \quad T^{ilk}_l \quad \text{and} \quad T^{ijl}_l \tag{59}
$$

4.15

Quotient Rule

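Suppose that you have a set of quantities whose inner product with an arbitrary tensor produces a tensor. Then the set of quantities forms a tensor of the appropriate rank, i.e. if

$$
S = T \cdot X \tag{60}
$$

where S is a tensor and X is an arbitrary tensor, then T must be a tensor. Suppose $S_i = T_{ij} X^j$ where $S_i$ is a covariant vector and $X^i$ is an arbitrary contravariant vector. Then by the quotient rule $T_{ij}$ should be a covariant tensor of rank 2. To confirm this consider how these quantities transform

$$
\begin{aligned}
\bar{S}_i &= \bar{T}_{ij} \bar{X}^j \\
\frac{\partial x^a}{\partial \bar{x}^i} S_a &= \bar{T}_{ij} \frac{\partial \bar{x}^j}{\partial x^b} X^b \\
\frac{\partial x^a}{\partial \bar{x}^i} T_{ab} X^b &= \frac{\partial \bar{x}^j}{\partial x^b} \bar{T}_{ij} X^b
\end{aligned}
\tag{61}
$$

Therefore

$$
\left[ \frac{\partial x^a}{\partial \bar{x}^i} T_{ab} - \frac{\partial \bar{x}^j}{\partial x^b} \bar{T}_{ij} \right] X^b = 0 \tag{62}
$$

Now because the $X^b$ are arbitrary the expression in the square brackets must be zero.

$$
\frac{\partial x^a}{\partial \bar{x}^i} T_{ab} - \frac{\partial \bar{x}^j}{\partial x^b} \bar{T}_{ij} = 0 \tag{63}
$$

Now take the inner product with $\partial x^b / \partial \bar{x}^k$

$$
\begin{aligned}
\frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^k} T_{ab} &= \frac{\partial \bar{x}^j}{\partial x^b} \frac{\partial x^b}{\partial \bar{x}^k} \bar{T}_{ij} \\
&= \frac{\partial \bar{x}^j}{\partial \bar{x}^k} \bar{T}_{ij} \\
&= \delta^j_k \bar{T}_{ij} \\
&= \bar{T}_{ik}
\end{aligned}
\tag{64}
$$

which gives the required transformation formula.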

4.16

Tensor equations

Due to the affine (linear and homogeneous) nature of the transformation, a tensor whose components are zero in one coordinate system has zero components in all coordinate systems. It is a numerically invariant tensor. Any tensor equation can be expressed as

$$
T = 0 \tag{65}
$$

where the left-hand side would in general be a linear combination of outer or inner products of tensors. Since the right-hand side is numerically invariant it follows that the left-hand side is numerically invariant. Therefore the tensor equation is independent of the coordinate system.

4.17

Symmetry and skew-symmetry

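Symmetry properties are invariant only if they exist between indices of the same type. If $T_{ij}$ is symmetric (i.e. $T_{ij} = T_{ji}$) then $\bar{T}_{ij}$ is symmetric. If $T_{ij}$ is skew-symmetric (i.e. $T_{ij} = -T_{ji}$) then $\bar{T}_{ij}$ is skew-symmetric. To verify this you can consider the transformation rule

$$
\begin{aligned}
\bar{T}_{ij} &= \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} T_{ab} \\
&= \pm \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} T_{ba} \\
&= \pm \frac{\partial x^b}{\partial \bar{x}^j} \frac{\partial x^a}{\partial \bar{x}^i} T_{ba} \\
&= \pm \bar{T}_{ji}
\end{aligned}
\tag{66}
$$

where +/− applies for symmetry/skew-symmetry.

A tensor $T_{ij}$ can always be written as the sum of symmetric and skew-symmetric parts

$$
T_{ij} = \tfrac{1}{2} \left( T_{ij} + T_{ji} \right) + \tfrac{1}{2} \left( T_{ij} - T_{ji} \right) \tag{67}
$$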
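The decomposition in eq.(67) is easy to try on numbers. The sketch below is illustrative only: the function name `split_symmetric_skew` and the sample components are assumptions, not part of the notes.

```python
def split_symmetric_skew(T):
    """Return (sym, skew) with sym_ij = (T_ij + T_ji)/2, skew_ij = (T_ij - T_ji)/2."""
    n = len(T)
    sym = [[0.5 * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]
    skew = [[0.5 * (T[i][j] - T[j][i]) for j in range(n)] for i in range(n)]
    return sym, skew


# Hypothetical components of a rank-2 covariant tensor in 3 dimensions.
T = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

sym, skew = split_symmetric_skew(T)

# Adding the two parts recovers the original components.
recombined = [[sym[i][j] + skew[i][j] for j in range(3)] for i in range(3)]
```

The skew-symmetric part has zeros on its diagonal, so in 3 dimensions it carries 3 independent components and the symmetric part carries 6.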


5

Relative tensors

5.1

Transformation rule

A relative tensor transforms in the same way as a tensor except for an outside factor $J^W$ where J is the Jacobian

$$
J = \det \left( \frac{\partial x^i}{\partial \bar{x}^j} \right) \tag{68}
$$

and W is known as the weight. A tensor, sometimes known as an absolute tensor, is a relative tensor of weight 0. If $T^c_{ab}$ is a mixed relative tensor of weight 2 and rank 3, with 2 covariant indices and 1 contravariant index, then it transforms as

$$
\bar{T}^k_{ij} = J^2 \, \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \frac{\partial \bar{x}^k}{\partial x^c} \, T^c_{ab} \tag{69}
$$

A relative tensor of weight 1 is known as a tensor density.

5.2

Summary

We characterise tensor quantities by their behaviour under coordinate transformations. In general they have rank, type and weight. Relative tensors represent the most general form we shall consider. Let

$$
\bar{J}^i_j = \frac{\partial \bar{x}^i}{\partial x^j} \quad \text{and} \quad J^i_j = \frac{\partial x^i}{\partial \bar{x}^j} \tag{70}
$$

Rank is the number of coefficients $J^i_j$ or $\bar{J}^i_j$ which occur in the transformation rule.

Type refers to the coefficient type, where $J^i_j$ indicates covariant and $\bar{J}^i_j$ indicates contravariant. The overall type can be covariant (all coefficients are covariant), contravariant or mixed.

Weight refers to W in a factor $J^W$ in the transformation rule.

For any coordinate transformation such that $J \neq 0$ and $J \neq \infty$:

1. a relative tensor has rank, type and weight.
2. a tensor density has rank and type with W = 1.
3. a tensor has rank and type with W = 0.

It is only possible to add and subtract relative tensors of the same rank, type and weight.

tensor                 rank            covariant indices   contravariant indices   weight
T                      r1              c1                  d1                      w1
S                      r2              c2                  d2                      w2
outer product T S      r1 + r2         c1 + c2             d1 + d2                 w1 + w2
inner product T · S    r1 + r2 − 2     c1 + c2 − 1         d1 + d2 − 1             w1 + w2
contracted T           r1 − 2          c1 − 1              d1 − 1                  w1

5.3

Transformation of determinant

Consider the transformation of $g_{ij}$, a covariant tensor of rank 2,

$$
\bar{g}_{ij} = \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} g_{ab} = J^a_i J^b_j \, g_{ab} \tag{71}
$$

Writing this in terms of matrices $(\bar{g})$, $(J)$ and $(g)$ gives

$$
(\bar{g}) = (J)^T (g) (J) \tag{72}
$$

where $(J)^T$ is the transpose of the matrix $(J)$. Taking determinants we obtain

$$
\bar{g} = J^2 g \tag{73}
$$

Then g is a relative scalar of weight 2. Clearly the sign of g is invariant.
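Equations (72) and (73) can be verified for a concrete matrix. This is a small sketch in 2 dimensions; the entries of `g` and `Jm` are hypothetical, and `Jm[a][i]` is assumed to hold the coefficient $J^a_i$.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


# Hypothetical metric components g_ab and transformation coefficients J^a_i.
g = [[2.0, 0.5], [0.5, 1.0]]
Jm = [[1.0, 2.0], [3.0, 5.0]]

# Eq.(72): (g_bar) = (J)^T (g) (J)
g_bar = matmul(matmul(transpose(Jm), g), Jm)

J = det2(Jm)             # the Jacobian, here negative
det_g = det2(g)
det_g_bar = det2(g_bar)  # should equal J**2 * det_g, eq.(73)
```

Here the Jacobian is negative, yet the determinant of the transformed metric keeps its sign because only $J^2$ enters.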

5.4

Transformation of permutation symbol

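In a 3-dimensional space (the result generalises easily), let $u_{ijk}$ be a covariant relative tensor of weight −1 and rank 3. In coordinate system $x^i$ we have $u_{ijk} = e_{ijk}$. Then in any other coordinate system

$$
\begin{aligned}
\bar{u}_{ijk} &= J^{-1} \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \frac{\partial x^c}{\partial \bar{x}^k} u_{abc} \\
&= J^{-1} \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^j} \frac{\partial x^c}{\partial \bar{x}^k} e_{abc} \\
&= J^{-1} J \, e_{ijk} \\
&= e_{ijk}
\end{aligned}
\tag{74}
$$

Therefore $e_{ijk}$ is an isotropic covariant relative tensor of weight −1 and rank 3.

If g > 0 and J > 0, it follows from eq.(73) that $\sqrt{g}$ is a scalar density. Therefore

$$
\varepsilon_{ijk} = \sqrt{g} \, e_{ijk} \tag{75}
$$

is a covariant tensor of rank 3. Similarly we can show that

$$
\varepsilon^{ijk} = \frac{e^{ijk}}{\sqrt{g}} \tag{76}
$$

is a contravariant tensor of rank 3.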
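The key step in eq.(74) is that contracting $e_{abc}$ with three copies of the transformation matrix reproduces $J \, e_{ijk}$. A minimal sketch of that check follows, assuming a hypothetical integer matrix `M` with `M[a][i]` standing for $\partial x^a / \partial \bar{x}^i$.

```python
from itertools import permutations, product


def permutation_symbol():
    """Build e_ijk as a dict: +1 for even, -1 for odd permutations, else 0."""
    e = {idx: 0 for idx in product(range(3), repeat=3)}
    for p in permutations(range(3)):
        inversions = sum(1 for a in range(3) for b in range(a + 1, 3)
                         if p[a] > p[b])
        e[p] = -1 if inversions % 2 else 1
    return e


def det3(M, e):
    """Determinant written with the permutation symbol."""
    return sum(e[(a, b, c)] * M[0][a] * M[1][b] * M[2][c]
               for a, b, c in product(range(3), repeat=3))


e = permutation_symbol()

# Hypothetical transformation coefficients M[a][i] = dx^a / dxbar^i.
M = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
J = det3(M, e)

# Weight -1 transformation rule of eq.(74), starting from u_abc = e_abc.
u_bar = {}
for i, j, k in product(range(3), repeat=3):
    total = sum(e[(a, b, c)] * M[a][i] * M[b][j] * M[c][k]
                for a, b, c in product(range(3), repeat=3))
    u_bar[(i, j, k)] = total / J
```

For this matrix the transformed components coincide with the permutation symbol itself, as the isotropy statement requires.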


6

Riemannian space

6.1

Line element

In an N-dimensional space any point is specified by the N coordinates $(x^1, x^2, \ldots, x^N)$. If the line element ds between two neighbouring points, $x^i$ and $x^i + dx^i$, satisfies the quadratic form

$$
(ds)^2 = g_{ij} \, dx^i dx^j \tag{77}
$$

where the $g_{ij}$ are functions of the $x^i$, then we have a Riemannian space, denoted $R_N$. Eq.(77) is known as the metric of the space and $g_{ij}$ is called the metric tensor. The metric tensor is positive definite, i.e. $g_{ij} x^i x^j > 0$ for all non-zero $x^i$. This is consistent with an interpretation of ds as a distance. A Euclidean space $E_N$ is a special type of Riemannian space for which $g_{ij} = \delta_{ij}$, the Kronecker delta.

$$
(ds)^2 = \delta_{ij} \, dx^i dx^j \tag{78}
$$

In 3 dimensions ds is the familiar distance between the points. The line element is an invariant. We can use s to parameterise a curve in the space.

The tensor nature of $g_{ij}$ can be seen by applying the quotient rule to eq.(77). $(ds)^2$ is an invariant and $dx^i$ is an arbitrary contravariant vector. However $dx^i dx^j$ is not arbitrary due to its symmetry in i and j. Provided we take $g_{ij}$ to be symmetric, $g_{ij}$ transforms like a covariant tensor of rank 2.

The distance between two points $s_1$ and $s_2$ on a curve is

$$
\int_{s_1}^{s_2} ds = \int_{u_1}^{u_2} \sqrt{ g_{ij} \frac{dx^i}{du} \frac{dx^j}{du} } \; du \tag{79}
$$

where u parameterises the curve. In general, it is not possible to obtain a transformation to a new coordinate system such that eq.(77) transforms into eq.(78) over the whole space. If this can be done the space is Euclidean. The geometry of a Euclidean space is described as flat. Otherwise the geometry is said to be curved.

6.2

Local Cartesian coordinates

Now consider the metric eq.(77) at a particular point. The metric is a positive definite quadratic form. The theorem of inertia due to Sylvester (footnote 7) states that a linear transformation exists that will transform the metric to Euclidean form. Note that it can only be done locally. In the neighbourhood of each point we can obtain local Cartesian coordinates.

Footnote 7: See Appendix B ‘Odds and ends’ note 7.

6.3

Spherical surface in two dimensions

Our starting point is the Cartesian metric in $E_3$

$$
(ds)^2 = (dx)^2 + (dy)^2 + (dz)^2 \tag{80}
$$

Transform this to spherical polar coordinates using

$$
\begin{aligned}
x &= r \sin\theta \cos\phi \\
y &= r \sin\theta \sin\phi \\
z &= r \cos\theta
\end{aligned}
\tag{81}
$$

The differentials are

$$
\begin{aligned}
dx &= (\sin\theta \cos\phi) \, dr + (r \cos\theta \cos\phi) \, d\theta - (r \sin\theta \sin\phi) \, d\phi \\
dy &= (\sin\theta \sin\phi) \, dr + (r \cos\theta \sin\phi) \, d\theta + (r \sin\theta \cos\phi) \, d\phi \\
dz &= (\cos\theta) \, dr - (r \sin\theta) \, d\theta
\end{aligned}
\tag{82}
$$

Substitute into eq.(80) to give

$$
(ds)^2 = (dr)^2 + r^2 \left[ (d\theta)^2 + \sin^2\theta \, (d\phi)^2 \right] \tag{83}
$$

Now let r = a, a constant, to obtain

$$
(ds)^2 = a^2 \left[ (d\theta)^2 + \sin^2\theta \, (d\phi)^2 \right] \tag{84}
$$

Since we cannot express this as $(ds)^2 = (dX)^2 + (dY)^2$, the spherical surface is not a Euclidean space of 2 dimensions. Euclidean spaces are flat. This result just expresses our understanding that a spherical surface is curved. This is an example of a curved Riemannian space $R_2$ embedded in a Euclidean space $E_3$ of higher dimension.
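The substitution leading to the spherical polar metric can be checked numerically by building the metric from the coefficients of the differentials in eq.(82). The following is a sketch only; the function names and the sample point are assumptions made for illustration.

```python
import math


def jacobian(r, theta, phi):
    """Rows are (x, y, z); columns are derivatives with respect to (r, theta, phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [st * cp, r * ct * cp, -r * st * sp],
        [st * sp, r * ct * sp, r * st * cp],
        [ct, -r * st, 0.0],
    ]


def induced_metric(r, theta, phi):
    """g_ij = sum over Cartesian k of (dx^k/dq^i)(dx^k/dq^j), from eq.(80)."""
    Jm = jacobian(r, theta, phi)
    return [[sum(Jm[k][i] * Jm[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Hypothetical sample point (r, theta, phi).
r, theta, phi = 2.0, 0.7, 1.3
g = induced_metric(r, theta, phi)

# Expected diagonal entries: 1, r^2 and r^2 sin^2(theta); off-diagonals vanish.
expected_diag = [1.0, r ** 2, r ** 2 * math.sin(theta) ** 2]
```

Fixing r = a and dropping the first row and column leaves the 2-dimensional metric of the spherical surface.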

6.4

Raising and lowering indices

In a Riemannian space $g_{ij}$ is the metric tensor, a symmetric covariant tensor of rank 2. The tensor is positive definite, so $g = \det(g_{ij}) > 0$ and the inverse exists. Let $g^{ij}$ be the inverse matrix of $g_{ij}$, then

$$
g^{ij} g_{jk} = g_{kj} g^{ji} = \delta^i_k \tag{85}
$$

Since $g_{ij}$ is symmetric it follows that $g^{ij}$ is also symmetric. If $A^k$ is an arbitrary contravariant vector then $B_j = g_{jk} A^k$ is an arbitrary covariant vector. Then

$$
\begin{aligned}
g^{ij} B_j &= g^{ij} g_{jk} A^k \\
&= \delta^i_k A^k \\
&= A^i
\end{aligned}
\tag{86}
$$

By the quotient rule it follows that $g^{ij}$ is a contravariant tensor of rank 2. From the isotropy of $\delta^i_k$ we see that $g^{ij}$ is the inverse of $g_{ij}$ in all coordinate systems. $g^{ij}$ is said to be conjugate to $g_{ij}$. By taking the inner product of a tensor with $g_{ij}$ and $g^{ij}$ we can lower and raise indices respectively. Consider $X_i$, a covariant vector. The associated contravariant vector $X^i$ is

$$
X^i = g^{ij} X_j \tag{87}
$$

We use the same symbol X for the vectors because they refer to the same object and we have a rule for relating the covariant and contravariant components. The rule is consistent because we can lower the index of $X^i$ to give

$$
\begin{aligned}
g_{ij} X^j &= g_{ij} g^{jk} X_k \\
&= \delta_i^k X_k \\
&= X_i
\end{aligned}
\tag{88}
$$

which gives the original covariant vector as it should. When raising and lowering indices it is important to remember the position of the index being moved. This is sometimes done by putting a ‘.’ in the old position of the index. Alternatively the indices are offset. Consider $A^{ij}{}_{kl}$.

$$
\begin{aligned}
A_i{}^j{}_{kl} &= g_{ia} A^{aj}{}_{kl} && \text{lower index } i \\
A^i{}_{jkl} &= g_{ja} A^{ia}{}_{kl} && \text{lower index } j \\
A^{ijk}{}_l &= g^{ka} A^{ij}{}_{al} && \text{raise index } k \\
A^{ijkl} &= g^{ka} g^{lb} A^{ij}{}_{ab} && \text{raise indices } k \text{ and } l
\end{aligned}
$$

$A_i{}^j{}_{kl}$, $A^i{}_{jkl}$, $A^{ijk}{}_l$ and $A^{ijkl}$ are all said to be tensors associated with $A^{ij}{}_{kl}$. Suppose $R_{ij}$ is a symmetric covariant tensor. Then the associated mixed tensor is given by

$$
\begin{aligned}
R^i{}_j &= g^{ia} R_{aj} \\
&= g^{ia} R_{ja} \\
&= R_j{}^i \\
&= R^i_j
\end{aligned}
\tag{89}
$$

i.e. we do not need to worry about the position of the raised index. This depends on the symmetry of the tensor. However even though the covariant tensor is symmetric it does not imply symmetry in the mixed tensor, e.g. in 2 dimensions

$$
\begin{aligned}
R^1_2 &= g^{1a} R_{a2} = g^{11} R_{12} + g^{12} R_{22} \\
R^2_1 &= g^{2a} R_{a1} = g^{21} R_{11} + g^{22} R_{21}
\end{aligned}
\tag{90}
$$

which are clearly not the same. The associated contravariant tensor is symmetric since

$$
\begin{aligned}
R^{ij} &= g^{ia} g^{jb} R_{ab} \\
&= g^{ia} g^{jb} R_{ba} \\
&= R^{ji}
\end{aligned}
\tag{91}
$$
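Raising and lowering can be tried on a small example. This sketch assumes a hypothetical 2-dimensional metric `g` and symmetric tensor `R`; it checks the round trip of eq.(87) and eq.(88), the asymmetry of the mixed components in eq.(90), and the symmetry in eq.(91).

```python
def inverse2(g):
    """Inverse of a 2x2 matrix, giving the conjugate tensor g^ij."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]


def contract(matrix, vector):
    """Inner product matrix_ij vector_j (used for both raising and lowering)."""
    n = len(vector)
    return [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]


# Hypothetical symmetric, positive definite metric and its conjugate.
g = [[2.0, 1.0], [1.0, 3.0]]
g_inv = inverse2(g)

# Raise then lower the index of a covariant vector, eq.(87) and eq.(88).
X_down = [1.0, -2.0]
X_up = contract(g_inv, X_down)
X_round_trip = contract(g, X_up)

# Hypothetical symmetric covariant tensor R_ij.
R = [[1.0, 2.0], [2.0, 5.0]]

# Mixed components R^i_j = g^{ia} R_aj, eq.(89).
R_mixed = [[sum(g_inv[i][a] * R[a][j] for a in range(2)) for j in range(2)]
           for i in range(2)]

# Contravariant components R^{ij} = g^{ia} g^{jb} R_ab, eq.(91).
R_up = [[sum(g_inv[i][a] * g_inv[j][b] * R[a][b]
             for a in range(2) for b in range(2))
         for j in range(2)] for i in range(2)]
```

With these values the mixed components `R_mixed[0][1]` and `R_mixed[1][0]` differ, while `R_up` is symmetric.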

6.5

Length and direction of a vector

The length squared of a contravariant vector $A^i$ is given by

$$
\begin{aligned}
a^2 &= g_{ij} A^i A^j && (92) \\
&= g_{ij} A^i g^{jk} A_k \\
&= \delta_i^k A^i A_k \\
&= A^i A_i && (93) \quad \text{lowering index} \\
&= g^{ij} A_i A_j && (94)
\end{aligned}
$$

If $g_{ij}$ is positive definite then $a^2 > 0$ for non-zero vectors and we can take the square root to obtain a real length. The angle θ between two vectors $A^i$ and $B^i$ of lengths a and b respectively is given by the scalar product

$$
\begin{aligned}
ab \cos\theta &= g_{ij} A^i B^j && (95) \\
&= A^i B_i && (96) \\
&= A_i B^i && (97) \\
&= g^{ij} A_i B_j && (98)
\end{aligned}
$$

We require $|\cos\theta| \le 1$ so that the concept of angle (direction) is meaningful. Let $X^i = \frac{1}{a} A^i$ and $Y^i = \frac{1}{b} B^i$ so that they are unit vectors. Then

$$
\cos\theta = g_{ij} X^i Y^j \tag{99}
$$

Consider $Z^i = X^i + k Y^i$ where k is a real number.

$$
\begin{aligned}
g_{ij} Z^i Z^j &= g_{ij} (X^i + k Y^i)(X^j + k Y^j) \\
&= 1 + k^2 + 2k \, g_{ij} X^i Y^j \\
&= \left( k + g_{ij} X^i Y^j \right)^2 + 1 - \left( g_{ij} X^i Y^j \right)^2 \\
&\ge 0 \qquad \text{due to positive definiteness}
\end{aligned}
\tag{100}
$$

This holds for any k. Let $k = -g_{ij} X^i Y^j$

$$
1 - \left( g_{ij} X^i Y^j \right)^2 \ge 0 \tag{101}
$$

giving

$$
\left| g_{ij} X^i Y^j \right| \le 1 \quad \longrightarrow \quad |\cos\theta| \le 1 \tag{102}
$$
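The length and angle formulas can be illustrated with a non-diagonal metric. This is a minimal sketch; the metric `g` and the vectors are hypothetical values, and `inner` simply evaluates $g_{ij} A^i B^j$.

```python
import math


def inner(g, A, B):
    """Scalar product g_ij A^i B^j of two contravariant vectors."""
    n = len(A)
    return sum(g[i][j] * A[i] * B[j] for i in range(n) for j in range(n))


def length(g, A):
    """Length a with a^2 = g_ij A^i A^j, eq.(92)."""
    return math.sqrt(inner(g, A, A))


def cos_angle(g, A, B):
    """cos(theta) from ab cos(theta) = g_ij A^i B^j, eq.(95)."""
    return inner(g, A, B) / (length(g, A) * length(g, B))


# Hypothetical positive definite metric at some point.
g = [[2.0, 1.0], [1.0, 3.0]]

A = [1.0, 0.0]
B = [0.0, 1.0]
c = cos_angle(g, A, B)   # equals g_12 / sqrt(g_11 g_22) for these vectors

# A vector orthogonal to A with respect to this metric.
C = [1.0, -2.0]
orthogonality = inner(g, A, C)
```

Note that the coordinate basis vectors `A` and `B` are not orthogonal here, because the metric has a non-zero off-diagonal component.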

Two vectors $A^i$ and $B^i$ are orthogonal if $g_{ij} A^i B^j = 0$. If we parameterise a curve by the distance s then the tangent vector $\frac{dx^i}{ds}$ is a unit vector.

$$
(ds)^2 = g_{ij} \, dx^i dx^j \quad \longrightarrow \quad 1 = g_{ij} \frac{dx^i}{ds} \frac{dx^j}{ds} \tag{103}
$$

6.6

Geodesics

A geodesic is a generalisation of the straight line in a Euclidean space. A straight line is the shortest distance between two points and this suggests a variational definition of geodesic through the Euler-Lagrange equations. We need to define distance of course and this is done through the metric in a Riemannian space. A curve in $R_N$ can be represented by the equation

$$
x^i = x^i(u) \tag{104}
$$

and we define $\dot{x}^i = \frac{dx^i}{du}$.

From the calculus of variations we find that for fixed endpoints $u_1$ and $u_2$

$$
\Delta \int_{u_1}^{u_2} L(x^i, \dot{x}^i, u) \, du = 0 \tag{105}
$$

when the Euler-Lagrange equations are satisfied

$$
\frac{d}{du} \left( \frac{\partial L}{\partial \dot{x}^i} \right) - \frac{\partial L}{\partial x^i} = 0 \tag{106}
$$

L is the Lagrangian function. Δ denotes weak variations in the integral so that we are requiring the integral to be stationary for small variations in the curve. The distance along a curve joining two points is given by

$$
I = \int_{u_1}^{u_2} \frac{ds}{du} \, du \tag{107}
$$

This curve is a geodesic if I is stationary for small variations in the curve. Thus the geodesic is a solution of the Euler-Lagrange equations with

$$
L = \frac{ds}{du} = \sqrt{ g_{ij} \dot{x}^i \dot{x}^j } \tag{108}
$$

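We can now apply the Euler-Lagrange equations eq.(106). However the algebra is complicated by the presence of the square root. Due to the positive definite nature of the metric we get the same geodesic curves, with the parameter u then proportional to the distance s, by considering

$$
L = g_{ij} \dot{x}^i \dot{x}^j \tag{109}
$$

We need the following

$$
\frac{\partial L}{\partial x^k} = \frac{\partial g_{ij}}{\partial x^k} \dot{x}^i \dot{x}^j \quad \text{and} \quad \frac{\partial L}{\partial \dot{x}^k} = 2 g_{ik} \dot{x}^i \tag{110}
$$

Therefore

$$
\begin{aligned}
\frac{d}{du} \left( \frac{\partial L}{\partial \dot{x}^k} \right) &= \frac{d}{du} \left( 2 g_{ik} \dot{x}^i \right) \\
&= 2 \left( \frac{\partial g_{ik}}{\partial x^j} \dot{x}^j \dot{x}^i + g_{ik} \ddot{x}^i \right) \\
&= \frac{\partial g_{ik}}{\partial x^j} \dot{x}^j \dot{x}^i + \frac{\partial g_{jk}}{\partial x^i} \dot{x}^i \dot{x}^j + 2 g_{ik} \ddot{x}^i
\end{aligned}
\tag{111}
$$

where we have used (remember i and j are dummy indices)

$$
\frac{\partial g_{ik}}{\partial x^j} \dot{x}^j \dot{x}^i = \frac{\partial g_{jk}}{\partial x^i} \dot{x}^i \dot{x}^j \tag{112}
$$

Eq.(106) gives

$$
2 g_{ik} \ddot{x}^i + \frac{\partial g_{ik}}{\partial x^j} \dot{x}^j \dot{x}^i + \frac{\partial g_{jk}}{\partial x^i} \dot{x}^i \dot{x}^j = \frac{\partial g_{ij}}{\partial x^k} \dot{x}^i \dot{x}^j \tag{113}
$$

Rearrange as

$$
g_{ik} \ddot{x}^i + \frac{1}{2} \left( \frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k} \right) \dot{x}^i \dot{x}^j = 0 \tag{114}
$$

Now take the inner product with $g^{mk}$ to get

$$
\ddot{x}^m + \frac{1}{2} g^{mk} \left( \frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k} \right) \dot{x}^i \dot{x}^j = 0 \tag{115}
$$

The Christoffel (footnote 8) symbol of the first kind is defined to be

$$
[ij, k] = \frac{1}{2} \left( \frac{\partial g_{ik}}{\partial x^j} + \frac{\partial g_{jk}}{\partial x^i} - \frac{\partial g_{ij}}{\partial x^k} \right) \tag{116}
$$

The Christoffel symbol of the second kind is defined to be

$$
\left\{ {m \atop i\,j} \right\} = g^{mk} [ij, k] \tag{117}
$$

Both of the symbols are symmetric in the indices i and j. Using these symbols eq.(115) can be written as

$$
\ddot{x}^m + \left\{ {m \atop i\,j} \right\} \dot{x}^i \dot{x}^j = 0 \tag{118}
$$

Footnote 8: See Appendix B ‘Odds and ends’ note 8.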

6.7

Geodesics on the surface of a sphere

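From eq.(84) we have

$$
(ds)^2 = a^2 \left[ (d\theta)^2 + \sin^2\theta \, (d\phi)^2 \right] \tag{119}
$$

Let $x^1 = \theta$ and $x^2 = \phi$ and then we obtain

$$
g_{ij} = \begin{pmatrix} a^2 & 0 \\ 0 & a^2 \sin^2\theta \end{pmatrix} \tag{120}
$$

We will obtain the geodesic equations directly from the Euler-Lagrange equations eq.(106). The Lagrangian is

$$
L = \left( \frac{ds}{du} \right)^2 = a^2 \left( \dot{\theta}^2 + \sin^2\theta \, \dot{\phi}^2 \right) \tag{121}
$$

We need the following partial derivatives

$$
\frac{\partial L}{\partial \dot{\theta}} = 2 a^2 \dot{\theta} \quad \text{and} \quad \frac{\partial L}{\partial \theta} = 2 a^2 \sin\theta \cos\theta \, \dot{\phi}^2 \tag{122}
$$

and

$$
\frac{\partial L}{\partial \dot{\phi}} = 2 a^2 \sin^2\theta \, \dot{\phi} \quad \text{and} \quad \frac{\partial L}{\partial \phi} = 0 \tag{123}
$$

The Euler-Lagrange equation for θ is

$$
\begin{aligned}
\frac{d}{du} \left( \frac{\partial L}{\partial \dot{\theta}} \right) &= \frac{\partial L}{\partial \theta} \\
\frac{d}{du} \left( 2 a^2 \dot{\theta} \right) &= 2 a^2 \sin\theta \cos\theta \, \dot{\phi}^2
\end{aligned}
\tag{124}
$$

This gives

$$
\ddot{\theta} = \sin\theta \cos\theta \, \dot{\phi}^2 \tag{125}
$$

The Euler-Lagrange equation for φ is

$$
\begin{aligned}
\frac{d}{du} \left( \frac{\partial L}{\partial \dot{\phi}} \right) &= \frac{\partial L}{\partial \phi} \\
\frac{d}{du} \left( 2 a^2 \sin^2\theta \, \dot{\phi} \right) &= 0
\end{aligned}
\tag{126}
$$

This gives

$$
\sin^2\theta \, \dot{\phi} = C \quad \text{where } C \text{ is a constant} \tag{127}
$$

A solution of this would be to choose $\phi = \phi_0$, a constant. Then $\dot{\phi} = 0$. Eq.(125) becomes

$$
\ddot{\theta} = 0 \tag{128}
$$

with solution θ = αu + β where α and β are constants. This solution corresponds to a great circle passing through the poles. An alternative method to obtain the geodesic equations is to evaluate the Christoffel symbols directly from eq.(116) and eq.(117) and use eq.(118).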
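The geodesic equations for the sphere can also be integrated numerically. The sketch below uses eq.(125) together with the equation obtained by differentiating eq.(127), namely $\ddot{\phi} = -2 \cot\theta \, \dot{\theta} \dot{\phi}$, and steps them with a classical fourth-order Runge-Kutta scheme. The integrator, step size and initial data are assumptions chosen for illustration and are not part of the notes.

```python
import math


def rhs(state):
    """First-order form of the geodesic equations on the sphere."""
    theta, phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h in the parameter u."""
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))

    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def conserved(state):
    """The constant C = sin^2(theta) * dphi/du of eq.(127)."""
    theta, _phi, _dtheta, dphi = state
    return math.sin(theta) ** 2 * dphi


def integrate(state, h, steps):
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


# Hypothetical initial data (theta, phi, dtheta/du, dphi/du).
start = (1.0, 0.0, 0.3, 0.8)
end = integrate(start, 0.001, 1000)
c_start, c_end = conserved(start), conserved(end)

# A meridian: dphi/du = 0 initially, so theta should grow linearly in u.
meridian_end = integrate((1.0, 0.5, 0.3, 0.0), 0.001, 1000)
```

Along the first curve the quantity in eq.(127) stays constant to within the integration error, and the second curve reproduces the solution θ = αu + β with φ fixed.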


6.8

Christoffel symbols from the geodesic equations

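You can evaluate the Christoffel symbols directly from eq.(116) and eq.(117) for a given metric. An alternative procedure is to derive the Euler-Lagrange equations starting from eq.(106) and then by comparison with eq.(118) you can write down the required symbols. As an example consider eq.(125) and eq.(127). For θ we have

$$
\left\{ {\theta \atop \theta\,\theta} \right\} = 0 \qquad \left\{ {\theta \atop \theta\,\phi} \right\} = 0 \qquad \left\{ {\theta \atop \phi\,\phi} \right\} = -\sin\theta \cos\theta \tag{129}
$$

For φ we must differentiate eq.(127)

$$
\sin^2\theta \, \ddot{\phi} + 2 \sin\theta \cos\theta \, \dot{\theta} \dot{\phi} = 0 \tag{130}
$$

and after dividing across by $\sin^2\theta$ obtain (beware double counting in eq.(118))

$$
\left\{ {\phi \atop \theta\,\theta} \right\} = 0 \qquad \left\{ {\phi \atop \theta\,\phi} \right\} = \frac{\cos\theta}{\sin\theta} \qquad \left\{ {\phi \atop \phi\,\phi} \right\} = 0 \tag{131}
$$

To obtain the symbols of the first kind use

$$
[ij, k] = g_{mk} \left\{ {m \atop i\,j} \right\} \tag{132}
$$

In our example the metric is diagonal which simplifies the calculation.

$$
\begin{aligned}
[\theta\theta, \theta] &= g_{\theta\theta} \left\{ {\theta \atop \theta\,\theta} \right\} = 0 \\
[\theta\phi, \theta] &= g_{\theta\theta} \left\{ {\theta \atop \theta\,\phi} \right\} = 0 \\
[\phi\phi, \theta] &= g_{\theta\theta} \left\{ {\theta \atop \phi\,\phi} \right\} = -a^2 \sin\theta \cos\theta \\
[\theta\theta, \phi] &= g_{\phi\phi} \left\{ {\phi \atop \theta\,\theta} \right\} = 0 \\
[\theta\phi, \phi] &= g_{\phi\phi} \left\{ {\phi \atop \theta\,\phi} \right\} = a^2 \sin\theta \cos\theta \\
[\phi\phi, \phi] &= g_{\phi\phi} \left\{ {\phi \atop \phi\,\phi} \right\} = 0
\end{aligned}
\tag{133}
$$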
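The direct route through eq.(116) and eq.(117) can be sketched numerically and compared with the symbols read off above. In this sketch the derivatives of the metric are approximated by central differences; the radius `A_RADIUS`, the step `h` and the sample point are hypothetical choices, and index 0 stands for θ and index 1 for φ.

```python
import math

A_RADIUS = 1.5  # hypothetical sphere radius a


def metric(x):
    """Metric of eq.(120) at the point x = (theta, phi)."""
    theta = x[0]
    return [[A_RADIUS ** 2, 0.0],
            [0.0, A_RADIUS ** 2 * math.sin(theta) ** 2]]


def d_metric(x, k, h=1e-5):
    """Central-difference estimate of dg_ij/dx^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]


def christoffel_first(x):
    """first[i][j][k] = [ij, k] from eq.(116)."""
    dg = [d_metric(x, k) for k in range(2)]  # dg[k][i][j] = dg_ij/dx^k
    return [[[0.5 * (dg[j][i][k] + dg[i][j][k] - dg[k][i][j])
              for k in range(2)] for j in range(2)] for i in range(2)]


def christoffel_second(x):
    """second[m][i][j] = g^{mk} [ij, k] from eq.(117)."""
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    first = christoffel_first(x)
    return [[[sum(g_inv[m][k] * first[i][j][k] for k in range(2))
              for j in range(2)] for i in range(2)] for m in range(2)]


theta0 = 0.9
point = (theta0, 0.4)
first = christoffel_first(point)
second = christoffel_second(point)
sc = math.sin(theta0) * math.cos(theta0)
```

At the sample point the non-zero entries agree with eq.(129), eq.(131) and eq.(133) to within the finite-difference error.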


7

Tensor calculus

So far we have considered a tensor defined only at a single point of the space. Comparing a tensor at neighbouring points of a space introduces the idea of differentiation.

7.1

Gradient of a scalar field

If $T = T(x^1, x^2, \ldots, x^N)$ is a scalar field then

$$
\partial_k T = \frac{\partial T}{\partial x^k} \tag{134}
$$

is a covariant vector. This was our prototype covariant vector. It is known as the gradient of T .

7.2

Covariant derivative of a covariant vector

Let $T_i$ be a covariant vector field; then $\partial_k T_i$ is not a tensor.

$$
\begin{aligned}
\bar{\partial}_k \bar{T}_i &= \frac{\partial \bar{T}_i}{\partial \bar{x}^k} \\
&= \frac{\partial}{\partial \bar{x}^k} \left( \frac{\partial x^a}{\partial \bar{x}^i} T_a \right) \\
&= \frac{\partial^2 x^a}{\partial \bar{x}^k \partial \bar{x}^i} T_a + \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^k} \partial_b T_a
\end{aligned}
\tag{135}
$$

Due to the first term on the rhs $\partial_k T_i$ is not a tensor. This term involves a second derivative. In general this is non-zero. However if we only allowed linear coordinate transformations it would be zero. This is the case for the theory of Cartesian tensors. Ordinary differentiation does not yield a tensor.

Let the covariant vectors $T_i$ and $T_i + dT_i$ be associated with neighbouring points $x^a$ and $x^a + dx^a$ such that

$$
T_i(x^a + dx^a) = T_i(x^a) + dT_i(x^a) \tag{136}
$$

or

$$
dT_i(x^a) = T_i(x^a + dx^a) - T_i(x^a) \tag{137}
$$

But $T_i(x^a + dx^a)$ and $T_i(x^a)$ transform in different ways since the coefficients of transformation depend on position. Therefore $dT_i$ is not a vector and since

$$
dT_i = (\partial_k T_i) \, dx^k \tag{138}
$$

it follows that $\partial_k T_i$ is not a tensor. We seek an operator that does yield a tensor. It will be denoted by $\nabla_k$ and be such that

$$
\bar{\nabla}_k \bar{T}_i = \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^k} \nabla_b T_a \tag{139}
$$

This implies introducing a new differential such that

$$
DT_i = (\nabla_k T_i) \, dx^k \tag{140}
$$

To obtain a tensor we introduce the concept of parallel displacement (footnote 9). We wish to generalise the idea that in a Euclidean space a vector $T_i$ can be displaced to a neighbouring point without changing its magnitude or direction. The components of the displaced vector are denoted by $T_i + \delta T_i$ such that

$$
{}^{*}T_i(x^a + dx^a) = T_i(x^a) + \delta T_i(x^a) \tag{141}
$$

With Cartesian coordinates $\delta T_i = 0$. However if we are using curvilinear coordinates, the $\delta T_i$ do not vanish since the curvilinear axes change direction. Consider

$$
\begin{aligned}
T_i(x^a + dx^a) - {}^{*}T_i(x^a + dx^a) &= (T_i + dT_i) - (T_i + \delta T_i) \\
&= dT_i - \delta T_i \\
&= DT_i \\
&= (\nabla_k T_i) \, dx^k
\end{aligned}
\tag{142}
$$

where all quantities on the rhs are evaluated at $x^a$. The lhs is a vector since $T_i$ and ${}^{*}T_i$ are both vectors defined at the same point $x^a + dx^a$. By the quotient rule, since $dx^k$ is arbitrary, $\nabla_k T_i$ is a covariant tensor of rank 2. It is the covariant derivative of $T_i$. We still need to specify the parallel displacement of a vector. Let

$$
\delta T_i = \Gamma^a_{ik} T_a \, dx^k \tag{143}
$$

This defines a 1-1 mapping of the vectors at neighbouring points onto each other. A linear transformation of coordinates in geometry is called an affine transformation. The set of quantities $\Gamma^a_{ik}$ is known as an affinity (footnote 10) and defines an affine connection. The affinity depends on position $x^a$. A space for which an affinity is defined is known as an affinely connected space. It follows that

$$
\begin{aligned}
(\nabla_k T_i) \, dx^k &= DT_i \\
&= dT_i - \delta T_i \\
&= (\partial_k T_i) \, dx^k - \Gamma^a_{ik} T_a \, dx^k \\
&= (\partial_k T_i - \Gamma^a_{ik} T_a) \, dx^k
\end{aligned}
\tag{144}
$$

Since $dx^k$ is arbitrary

$$
\nabla_k T_i = \partial_k T_i - \Gamma^a_{ik} T_a \tag{145}
$$

Footnote 9: Also known as vector transplantation, parallel transport, parallel transfer. See Appendix B ‘Odds and ends’ note 9.

Footnote 10: Affinitas is a Latin word meaning neighbourhood or relationship by marriage.

7.3

Affinities

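We would not expect an affinity to transform like a tensor since it compensates for the non-tensorial transformation of $\partial_k T_i$. This can be shown by considering the transformation of the covariant derivative. From eq.(145)

$$
\bar{\nabla}_k \bar{T}_i = \bar{\partial}_k \bar{T}_i - \bar{\Gamma}^a_{ik} \bar{T}_a \tag{146}
$$

The lhs is, from eq.(139),

$$
\begin{aligned}
\bar{\nabla}_k \bar{T}_i &= \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^k} \nabla_b T_a \\
&= \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^k} \left[ \partial_b T_a - \Gamma^c_{ab} T_c \right]
\end{aligned}
\tag{147}
$$

The rhs is, from eq.(135),

$$
\bar{\partial}_k \bar{T}_i - \bar{\Gamma}^a_{ik} \bar{T}_a = \frac{\partial^2 x^c}{\partial \bar{x}^k \partial \bar{x}^i} T_c + \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^k} \partial_b T_a - \bar{\Gamma}^a_{ik} \frac{\partial x^c}{\partial \bar{x}^a} T_c \tag{148}
$$

Equating the two expressions and using the fact that $T_c$ is an arbitrary vector gives

$$
\bar{\Gamma}^a_{ik} \frac{\partial x^c}{\partial \bar{x}^a} = \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^k} \Gamma^c_{ab} + \frac{\partial^2 x^c}{\partial \bar{x}^k \partial \bar{x}^i} \tag{149}
$$

which is an expression for the second derivative. Take the inner product with $\partial \bar{x}^j / \partial x^c$ to obtain

$$
\bar{\Gamma}^j_{ik} = \frac{\partial \bar{x}^j}{\partial x^c} \frac{\partial x^a}{\partial \bar{x}^i} \frac{\partial x^b}{\partial \bar{x}^k} \Gamma^c_{ab} + \frac{\partial \bar{x}^j}{\partial x^c} \frac{\partial^2 x^c}{\partial \bar{x}^k \partial \bar{x}^i} \tag{150}
$$

Interchange indices to obtain

$$
\bar{\Gamma}^i_{jk} = \frac{\partial \bar{x}^i}{\partial x^a} \frac{\partial x^b}{\partial \bar{x}^j} \frac{\partial x^c}{\partial \bar{x}^k} \Gamma^a_{bc} + \frac{\partial \bar{x}^i}{\partial x^c} \frac{\partial^2 x^c}{\partial \bar{x}^k \partial \bar{x}^j} \tag{151}
$$

Starting from

$$
\begin{aligned}
0 &= \frac{\partial}{\partial \bar{x}^k} \delta^i_j \\
&= \frac{\partial}{\partial \bar{x}^k} \left( \frac{\partial \bar{x}^i}{\partial \bar{x}^j} \right) \\
&= \frac{\partial}{\partial \bar{x}^k} \left( \frac{\partial \bar{x}^i}{\partial x^c} \frac{\partial x^c}{\partial \bar{x}^j} \right)
\end{aligned}
\tag{152}
$$

you can show that

$$
\frac{\partial \bar{x}^i}{\partial x^c} \frac{\partial^2 x^c}{\partial \bar{x}^k \partial \bar{x}^j} + \frac{\partial^2 \bar{x}^i}{\partial x^a \partial x^b} \frac{\partial x^a}{\partial \bar{x}^k} \frac{\partial x^b}{\partial \bar{x}^j} = 0 \tag{153}
$$

so that the transformation rule for an affinity takes the form

$$
\bar{\Gamma}^i_{jk} = \frac{\partial \bar{x}^i}{\partial x^a} \frac{\partial x^b}{\partial \bar{x}^j} \frac{\partial x^c}{\partial \bar{x}^k} \Gamma^a_{bc} - \frac{\partial^2 \bar{x}^i}{\partial x^a \partial x^b} \frac{\partial x^a}{\partial \bar{x}^k} \frac{\partial x^b}{\partial \bar{x}^j} \tag{154}
$$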

Point 1 An affinity is any set of functions that satisfy the transformation law eq.(151).

Point 2 If $\Gamma^i_{jk}$ is an affinity then $\hat{\Gamma}^i_{jk} = \Gamma^i_{kj}$ is also an affinity. This follows from the symmetry of the second derivative in eq.(151).

Point 3 If $\Gamma^i_{jk}$ and $\hat{\Gamma}^i_{jk}$ are affinities, then $\Gamma^i_{jk} - \hat{\Gamma}^i_{jk}$ is a type (1, 2) tensor. This follows from the cancellation of the second derivative terms in the transformation law eq.(151).

Point 3 The torsion tensor

$$
S^i_{jk} = \Gamma^i_{jk} - \Gamma^i_{kj} \tag{155}
$$

is a skew-symmetric type (1, 2) tensor. It is twice the skew-symmetric part of the affinity, i.e.

$$
\Gamma^i_{jk} = \tfrac{1}{2} \left( \Gamma^i_{jk} + \Gamma^i_{kj} \right) + \tfrac{1}{2} \left( \Gamma^i_{jk} - \Gamma^i_{kj} \right) \tag{156}
$$

In a torsionless space we have $S^i_{jk} = 0$ and the affinity is symmetric.

Point 4 If an affinity is symmetric (with respect to its subscripts) then it is symmetric in all coordinate systems. This can be shown directly using the transformation law eq.(151). Alternatively if $\Gamma^i_{jk}$ is symmetric then $S^i_{jk} = 0$. Since $S^i_{jk}$ is a tensor, it follows that $\bar{S}^i_{jk} = 0$ and hence that $\bar{\Gamma}^i_{jk}$ is symmetric.

Point 5 If the affinity is zero everywhere for coordinates $x^i$, then for another set of coordinates $\bar{x}^i$ we have

$$
\bar{\Gamma}^i_{jk} = \frac{\partial \bar{x}^i}{\partial x^c} \frac{\partial^2 x^c}{\partial \bar{x}^k \partial \bar{x}^j} \tag{157}
$$

The affinity is clearly symmetric in j and k. If the coordinate transformation is linear then the second derivative vanishes and $\bar{\Gamma}^i_{jk} = 0$.

Point 6 If $a_{ij}$ is a symmetric (0, 2) tensor that is everywhere invertible then the inverse $a^{ij}$ is a symmetric (2, 0) tensor. The Christoffel symbol of the second kind with respect to $a_{ij}$ is a symmetric affinity for the space.

$$
\Gamma^i_{jk} = \tfrac{1}{2} a^{im} \left( \partial_j a_{mk} + \partial_k a_{jm} - \partial_m a_{jk} \right) \tag{158}
$$

Using this affinity you find that

$$
\nabla_k a_{ij} = 0 \quad \text{and} \quad \nabla_k a^{ij} = 0 \tag{159}
$$

In a Riemannian space the natural choice of affinity is the Christoffel symbol of the second kind with respect to $g_{ij}$, the metric tensor. We shall mainly consider symmetric affinities (footnote 11).

Footnote 11: The more general idea of unsymmetric affinities was introduced into General Relativity in an effort to unify gravitation and electromagnetism. It introduces more degrees of freedom into the space.

7.4

Rules for parallel displacement

The covariant derivative of tensors can be obtained by applying the following rules for parallel displacement:

Rule 1 The parallel displacement of a scalar field is zero.

Rule 2 The parallel displacement of a product of tensors obeys the product rule of ordinary differentials, i.e.

$$
\delta(ABC \ldots) = (\delta A) BC \ldots + A (\delta B) C \ldots + AB (\delta C) \ldots \tag{160}
$$

By Rule 1 if T is a scalar field then δT = 0. It follows that

$$
\begin{aligned}
(\nabla_k T) \, dx^k &= DT \\
&= dT - \delta T \\
&= dT \\
&= (\partial_k T) \, dx^k
\end{aligned}
\tag{161}
$$

and the covariant derivative agrees with the ordinary derivative which we know already is a covariant vector.

7.5

Covariant derivative of a contravariant vector

We can apply these rules to determine the covariant derivative of a contravariant vector $T^i$. Form the scalar $T^i S_i$ where $S_i$ is an arbitrary covariant vector. Then

$$
\begin{aligned}
\delta(T^i S_i) &= 0 && \text{Rule 1} \\
&= (\delta T^i) S_i + T^i (\delta S_i) && \text{Rule 2} \\
&= (\delta T^i) S_i + T^i \Gamma^a_{ik} \, dx^k S_a && \text{from eq.(143)} \\
&= (\delta T^i) S_i + T^a \Gamma^i_{ak} \, dx^k S_i && \text{swap dummy indices } i \text{ and } a \\
&= (\delta T^i + T^a \Gamma^i_{ak} \, dx^k) S_i
\end{aligned}
\tag{162}
$$

Since $S_i$ is arbitrary it follows that

$$
\delta T^i = -\Gamma^i_{ak} T^a \, dx^k \tag{163}
$$

The covariant derivative of $T^i$ is defined as

$$
\begin{aligned}
(\nabla_k T^i) \, dx^k &= DT^i \\
&= dT^i - \delta T^i \\
&= (\partial_k T^i + \Gamma^i_{ak} T^a) \, dx^k
\end{aligned}
\tag{164}
$$

and since $dx^k$ is arbitrary it follows that

$$
\nabla_k T^i = \partial_k T^i + \Gamma^i_{ak} T^a \tag{165}
$$

7.6

Covariant derivative of tensors

The covariant derivative of tensors of higher rank can be defined using the rules for parallel displacement. First of all we form a scalar by inner products with a number of arbitrary vectors. Then the product rule is applied and terms are gathered together. Some results are

$$
\nabla_k T_{ij} = \partial_k T_{ij} - \Gamma^a_{ik} T_{aj} - \Gamma^a_{jk} T_{ia} \tag{166}
$$

$$
\nabla_k T^i_j = \partial_k T^i_j + \Gamma^i_{ak} T^a_j - \Gamma^a_{jk} T^i_a \tag{167}
$$

$$
\nabla_k T^{ij} = \partial_k T^{ij} + \Gamma^i_{ak} T^{aj} + \Gamma^j_{ak} T^{ia} \tag{168}
$$

If the tensor has rank t then the covariant derivative consists of the ordinary derivative plus t terms, one for each index. If the index i is covariant then the term is the coefficient $-\Gamma^a_{ik}$ multiplied into the tensor with index i replaced by a, i.e. a summation over this index. If the index i is contravariant then the term is the coefficient $\Gamma^i_{ak}$ multiplied into the tensor with index i replaced by a, i.e. a summation over this index. The index k in these coefficients is the index of differentiation.
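The rule for a covariant tensor of rank 2, eq.(166), can be written as a short function. As an illustration it is applied here to the metric of the unit sphere using the Christoffel symbols of eq.(129) and eq.(131) as the affinity; this particular choice of tensor and affinity, and all names in the sketch, are assumptions made for the example. Index 0 stands for θ and index 1 for φ.

```python
import math


def cov_deriv_rank2_covariant(T, dT, Gamma):
    """Return nabla[k][i][j] = d_k T_ij - Gamma^a_ik T_aj - Gamma^a_jk T_ia.

    dT[k][i][j] holds the partial derivative d_k T_ij and
    Gamma[a][i][k] holds the affinity component Gamma^a_ik.
    """
    n = len(T)
    return [[[dT[k][i][j]
              - sum(Gamma[a][i][k] * T[a][j] for a in range(n))
              - sum(Gamma[a][j][k] * T[i][a] for a in range(n))
              for j in range(n)] for i in range(n)] for k in range(n)]


theta = 0.9
s, c = math.sin(theta), math.cos(theta)

# Metric of the unit sphere and its partial derivatives.
g = [[1.0, 0.0], [0.0, s * s]]
dg = [[[0.0, 0.0], [0.0, 2.0 * s * c]],   # derivatives with respect to theta
      [[0.0, 0.0], [0.0, 0.0]]]           # derivatives with respect to phi

# Affinity components Gamma[a][i][k], taken from eq.(129) and eq.(131).
Gamma = [[[0.0, 0.0], [0.0, -s * c]],
         [[0.0, c / s], [c / s, 0.0]]]

nabla_g = cov_deriv_rank2_covariant(g, dg, Gamma)
largest = max(abs(nabla_g[k][i][j])
              for k in range(2) for i in range(2) for j in range(2))
```

Every component of the result vanishes to rounding error, so for this affinity the metric behaves as a constant under covariant differentiation.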

7.7

Covariant derivative of fundamental tensor

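The covariant derivative of the fundamental tensor $\delta^i_j$ is zero. From eq.(167)

$$
\begin{aligned}
\nabla_k \delta^i_j &= \partial_k \delta^i_j + \Gamma^i_{ak} \delta^a_j - \Gamma^a_{jk} \delta^i_a \\
&= \Gamma^i_{jk} - \Gamma^i_{jk} \\
&= 0
\end{aligned}
\tag{169}
$$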

7.8

Product rule for covariant differentiation

Since the product rule applies to parallel displacement it follows that the product rule also applies to covariant differentiation.

$$
\nabla_k \left( S_{ji} T^i \right) = T^i \, \nabla_k S_{ji} + S_{ji} \, \nabla_k T^i \tag{170}
$$

7.9

Curvature tensor

Covariant differentiation is not commutative, i.e.

$$
\nabla_k \nabla_j T^i \neq \nabla_j \nabla_k T^i \tag{171}
$$

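Since $\nabla_j T^i$ is a mixed tensor of rank 2 its covariant derivative is given by eq.(167)

$$
\begin{aligned}
\nabla_k \nabla_j T^i &= \partial_k \nabla_j T^i + \Gamma^i_{ak} \nabla_j T^a - \Gamma^a_{jk} \nabla_a T^i \\
&= \partial_k \left( \partial_j T^i + \Gamma^i_{aj} T^a \right) + \Gamma^i_{ak} \left( \partial_j T^a + \Gamma^a_{bj} T^b \right) - \Gamma^a_{jk} \nabla_a T^i \\
&= \partial_k \partial_j T^i + (\partial_k \Gamma^i_{aj}) T^a + \Gamma^i_{aj} \partial_k T^a + \Gamma^i_{ak} \partial_j T^a + \Gamma^i_{ak} \Gamma^a_{bj} T^b - \Gamma^a_{jk} \nabla_a T^i
\end{aligned}
\tag{172}
$$

Now interchange j and k to obtain

$$
\nabla_j \nabla_k T^i = \partial_j \partial_k T^i + (\partial_j \Gamma^i_{ak}) T^a + \Gamma^i_{ak} \partial_j T^a + \Gamma^i_{aj} \partial_k T^a + \Gamma^i_{aj} \Gamma^a_{bk} T^b - \Gamma^a_{kj} \nabla_a T^i \tag{173}
$$

It follows on subtraction, since partial differentiation commutes and the 3rd and 4th terms on the rhs cancel, that

$$
\nabla_k \nabla_j T^i - \nabla_j \nabla_k T^i = R^i{}_{bkj} T^b - S^a_{jk} \nabla_a T^i \tag{174}
$$

where

$$
R^i{}_{bkj} = \partial_k \Gamma^i_{bj} - \partial_j \Gamma^i_{bk} + \Gamma^a_{bj} \Gamma^i_{ak} - \Gamma^a_{bk} \Gamma^i_{aj} \tag{175}
$$

and

$$
S^i_{jk} = \Gamma^i_{jk} - \Gamma^i_{kj} \tag{176}
$$

is the torsion tensor. Just rearranging the indices in eq.(175) gives

$$
R^i{}_{jkl} = \partial_k \Gamma^i_{jl} - \partial_l \Gamma^i_{jk} + \Gamma^a_{jl} \Gamma^i_{ak} - \Gamma^a_{jk} \Gamma^i_{al} \tag{177}
$$

The non-commutative nature of covariant differentiation depends on the tensor $R^i{}_{jkl}$ which is known as the curvature tensor. It is a mixed tensor of rank 4. This follows from the quotient rule since $T^i$ is an arbitrary vector. It is clear from the definition eq.(177) that the curvature tensor $R^i{}_{jkl}$ is skew-symmetric in the indices k and l. If the curvature tensor $R^i{}_{jkl}$ is contracted with respect to the indices i and l we obtain the Ricci tensor (footnote 12) given by

$$
\begin{aligned}
R_{jk} &= R^i{}_{jki} \\
&= \partial_k \Gamma^i_{ji} - \partial_i \Gamma^i_{jk} + \Gamma^a_{ji} \Gamma^i_{ak} - \Gamma^a_{jk} \Gamma^i_{ai}
\end{aligned}
\tag{178}
$$

We can also contract with respect to the indices i and j to obtain

$$
R^i{}_{ikl} = \partial_k \Gamma^i_{il} - \partial_l \Gamma^i_{ik} \tag{179}
$$

which is a skew-symmetric tensor.

Footnote 12: See Appendix B ‘Odds and ends’ note 10.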

7.10

Symmetric affinity

When the affinity is symmetric we obtain

$$
\nabla_k \nabla_j T^i - \nabla_j \nabla_k T^i = R^i{}_{bkj} T^b \tag{180}
$$

Even if the affinity is symmetric the Ricci tensor is not necessarily symmetric, because of the term $\partial_k \Gamma^i_{ji}$. If we can write $\Gamma^i_{ji}$ as the derivative of a function of the coordinates, i.e.

$$
\Gamma^i_{ji} = \frac{\partial f}{\partial x^j} \tag{181}
$$

then the term is symmetric in j and k. It follows that the Ricci tensor is symmetric. Also

$$
R^i{}_{ikl} = 0 \tag{182}
$$

This can be done for the metric affinity of a Riemannian space.

8

Metric affinity

8.1

Ricci theorem

In a Riemannian space we impose the condition that the scalar product of two vectors is invariant under parallel displacement. It follows that the length and direction of a vector will remain unchanged. In this way the affinity is determined by the metric.

Consider the arbitrary vector fields $A^i$ and $B^i$ which when parallel displaced give the fields $({}^{*}A^i)$ and $({}^{*}B^i)$ respectively. Let

$$
V = g_{ij} \, ({}^{*}A^i) \, ({}^{*}B^j) \tag{183}
$$

This is a scalar quantity. At the point $x^a + dx^a$ we have

$$
V(x^a + dx^a) = g_{ij}(x^a + dx^a) \, ({}^{*}A^i)(x^a + dx^a) \, ({}^{*}B^j)(x^a + dx^a) \tag{184}
$$

Using

$$
g_{ij}(x^a + dx^a) = g_{ij}(x^a) + dg_{ij}(x^a) \tag{185}
$$

$$
{}^{*}A^i(x^a + dx^a) = A^i(x^a) + \delta A^i(x^a) \tag{186}
$$

$$
{}^{*}B^i(x^a + dx^a) = B^i(x^a) + \delta B^i(x^a) \tag{187}
$$

gives

$$
\begin{aligned}
V(x^a + dx^a) &= (g_{ij} + dg_{ij}) (A^i + \delta A^i) (B^j + \delta B^j) \\
&= V + dg_{ij} A^i B^j + g_{ij} (\delta A^i) B^j + g_{ij} A^i (\delta B^j)
\end{aligned}
\tag{188}
$$

to first order in the differentials. All quantities on the rhs are evaluated at $x^a$. Our condition then gives that

$$
\begin{aligned}
dV &= V(x^a + dx^a) - V(x^a) \\
&= dg_{ij} A^i B^j + g_{ij} (\delta A^i) B^j + g_{ij} A^i (\delta B^j) \\
&= 0
\end{aligned}
\tag{189}
$$

Also since V is a scalar field it follows that

$$
\begin{aligned}
\delta V &= \delta g_{ij} A^i B^j + g_{ij} (\delta A^i) B^j + g_{ij} A^i (\delta B^j) \\
&= 0
\end{aligned}
\tag{190}
$$

and since the vectors are arbitrary we must have $dg_{ij} = \delta g_{ij}$ which implies that the covariant derivative is zero. Thus in a Riemannian space the affinity is chosen so that

$$
\nabla_k g_{ij} = 0 \tag{191}
$$

This is known as Ricci’s theorem. Also

$$
\begin{aligned}
\nabla_k \delta^i_j &= \nabla_k (g^{ia} g_{ja}) \\
&= (\nabla_k g^{ia}) \, g_{ja} \\
&= 0
\end{aligned}
\tag{192}
$$

and it follows by contraction with $g^{jb}$ that $\nabla_k g^{ib} = 0$. An important consequence of this is that when the covariant derivative of any expression is taken, $g_{ij}$ or $g^{ij}$ can be treated as a constant, e.g.

$$
\nabla_k A_i = \nabla_k (g_{ia} A^a) = g_{ia} \nabla_k A^a \tag{193}
$$

so that the lowering of an index can take place either before or after the covariant differentiation. In addition we can define $\nabla^i = g^{ij} \nabla_j$ without ambiguity. The divergence of a vector field can be written in two ways, as $\nabla_i A^i$ or $\nabla^i A_i$.

8.2

Formula for the metric affinity

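We can obtain an expression for the affinity by considering

$$
\begin{aligned}
\nabla_k g_{ij} &= \partial_k g_{ij} - \Gamma^a_{ik} g_{aj} - \Gamma^a_{jk} g_{ia} = 0 \\
\nabla_i g_{jk} &= \partial_i g_{jk} - \Gamma^a_{ji} g_{ak} - \Gamma^a_{ki} g_{ja} = 0 \\
\nabla_j g_{ki} &= \partial_j g_{ki} - \Gamma^a_{kj} g_{ai} - \Gamma^a_{ij} g_{ka} = 0
\end{aligned}
\tag{194}
$$

and adding the first two equations and subtracting the third gives the following, where we take the affinity to be symmetric and remember that $g_{ij}$ is also symmetric:

$$
\begin{aligned}
2 \Gamma^a_{ki} g_{aj} &= \partial_k g_{ij} + \partial_i g_{jk} - \partial_j g_{ki} \\
&= 2 [ki, j]
\end{aligned}
\tag{195}
$$

Now take the inner product with $g^{jm}$

$$
\begin{aligned}
\Gamma^a_{ki} g_{aj} g^{jm} &= g^{jm} [ki, j] \\
\Gamma^a_{ki} \delta^m_a &= g^{jm} [ki, j] \\
\Gamma^m_{ki} &= g^{jm} [ki, j] \\
\Gamma^m_{ki} &= \left\{ {m \atop k\,i} \right\}
\end{aligned}
\tag{196}
$$

The metric affinity is given by the Christoffel symbol of the second kind.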

8.3

Condition for a flat space

If the metric is constant throughout the space then we have a Euclidean space and since $\partial_k g_{ij} = 0$ we have $\Gamma^i_{jk} = 0$ and $R^i{}_{jkl} = 0$. It can be shown that if $R^i{}_{jkl} = 0$ we can find a coordinate system in which the components of the metric are constant throughout the space. Thus $R^i{}_{jkl} = 0$ is a necessary and sufficient condition for a Euclidean (flat) space.
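The flatness condition can be explored numerically with the definition of the curvature tensor in eq.(177). The sketch below compares two symmetric affinities: the one for the unit sphere from eq.(129) and eq.(131), and the standard one for plane polar coordinates $(r, \phi)$ with $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{r\phi} = 1/r$. The plane polar example, the finite-difference derivatives and all function names are assumptions added for illustration.

```python
import math


def gamma_sphere(x):
    """G[i][j][k] = Gamma^i_jk on the unit sphere, x = (theta, phi)."""
    s, c = math.sin(x[0]), math.cos(x[0])
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -s * c
    G[1][0][1] = G[1][1][0] = c / s
    return G


def gamma_plane_polar(x):
    """G[i][j][k] = Gamma^i_jk for plane polar coordinates, x = (r, phi)."""
    r = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G


def d_gamma(gamma_fn, x, k, h=1e-5):
    """Central-difference estimate of d_k Gamma^i_jl."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    Gp, Gm = gamma_fn(xp), gamma_fn(xm)
    return [[[(Gp[i][j][l] - Gm[i][j][l]) / (2 * h) for l in range(2)]
             for j in range(2)] for i in range(2)]


def riemann(gamma_fn, x):
    """R[i][j][k][l] = R^i_jkl following eq.(177)."""
    n = 2
    G = gamma_fn(x)
    dG = [d_gamma(gamma_fn, x, k) for k in range(n)]  # dG[k][i][j][l]
    return [[[[dG[k][i][j][l] - dG[l][i][j][k]
               + sum(G[a][j][l] * G[i][a][k] - G[a][j][k] * G[i][a][l]
                     for a in range(n))
               for l in range(n)] for k in range(n)]
             for j in range(n)] for i in range(n)]


theta0 = 0.8
R_sphere = riemann(gamma_sphere, (theta0, 0.3))
R_plane = riemann(gamma_plane_polar, (2.0, 0.5))
plane_max = max(abs(R_plane[i][j][k][l]) for i in range(2) for j in range(2)
                for k in range(2) for l in range(2))
```

For the plane every component vanishes to within the finite-difference error even though the affinity itself is non-zero, whereas for the sphere the component $R^\theta{}_{\phi\theta\phi}$ evaluates to $\sin^2\theta$, so no coordinate system with constant metric components exists there.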

9

Constant vector fields and geodesics

9.1

Constant vector fields

The notion of parallel displacement can be introduced by considering what is meant by a constant vector field. In a Euclidean space using Cartesian coordinates two vectors at two different points are said to be equal if their components are equal. A vector field is constant if the vector components are constant throughout the space. This definition of a constant vector does not hold in a Riemannian space. The length of a vector $T^i$ depends on the metric

$$\sqrt{g_{ij}\, T^i T^j} \tag{197}$$

and the metric tensor depends on position in the space. Therefore the requirement that the length of a vector be constant amounts to requiring that the metric is constant throughout the space. Such a metric can be brought to Euclidean form by a coordinate transformation. However not all Riemannian spaces are Euclidean. From the transformation rule

$$\bar{T}^i = \frac{\partial \bar{x}^i}{\partial x^j}\, T^j \tag{198}$$

and since the transformation coefficients are arbitrary, it follows that constancy of components in one coordinate system does not imply constancy in another. Thus constancy of components is not coordinate invariant and in general is not a useful concept. Suppose $T^i$ is a vector field with constant components in some coordinate system. Then consider a curve in the space which is parameterised by a scalar $u$. Then $\frac{dT^i}{du} = 0$. Next consider how this transforms.

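$$\begin{aligned}
\frac{d\bar{T}^i}{du} &= \frac{d}{du}\left( \frac{\partial \bar{x}^i}{\partial x^b}\, T^b \right) \\
&= \frac{d}{du}\left( \frac{\partial \bar{x}^i}{\partial x^b} \right) T^b \\
&= \left( \frac{\partial^2 \bar{x}^i}{\partial x^a \partial x^b}\, \frac{dx^a}{du} \right) T^b \\
&= \left( \frac{\partial^2 \bar{x}^i}{\partial x^a \partial x^b}\, \frac{dx^a}{du} \right) \left( \frac{\partial x^b}{\partial \bar{x}^j}\, \bar{T}^j \right) \\
&= \left( \frac{\partial^2 \bar{x}^i}{\partial x^a \partial x^b}\, \frac{\partial x^a}{\partial \bar{x}^k}\, \frac{d\bar{x}^k}{du} \right) \left( \frac{\partial x^b}{\partial \bar{x}^j}\, \bar{T}^j \right) \\
&= \left( \frac{\partial^2 \bar{x}^i}{\partial x^a \partial x^b}\, \frac{\partial x^a}{\partial \bar{x}^k}\, \frac{\partial x^b}{\partial \bar{x}^j} \right) \bar{T}^j\, \frac{d\bar{x}^k}{du} \\
&= -\bar{\Gamma}^i_{jk}\, \bar{T}^j\, \frac{d\bar{x}^k}{du}
\end{aligned} \tag{199}$$

where

$$\bar{\Gamma}^i_{jk} = -\frac{\partial^2 \bar{x}^i}{\partial x^a \partial x^b}\, \frac{\partial x^a}{\partial \bar{x}^k}\, \frac{\partial x^b}{\partial \bar{x}^j} \tag{200}$$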

We have written this as an affinity because it agrees with the transformation law eq.(154) when the affinity is zero in the unbarred coordinates. This affinity is symmetric in $j$ and $k$. This expression suggests how constancy of components can be carried over to the general space. We define parallel displacement of a vector by

$$\delta T^i = -\Gamma^i_{jk}\, T^j\, dx^k \tag{201}$$

This law allows us to define a vector field $({}^{*}T^i)$ by parallel displacement. Thus starting with $T^i$ at $x^a$ we have

$$({}^{*}T^i)(x^a + dx^a) = T^i + \delta T^i = T^i - \Gamma^i_{jk}\, T^j\, dx^k \tag{202}$$

where quantities on the rhs are evaluated at $x^a$. We can set up a differential equation for the vector field $({}^{*}T^i)$ in the parameter $u$. It is given by

$$\frac{d({}^{*}T^i)}{du} = -\Gamma^i_{jk}\, ({}^{*}T^j)\, \frac{dx^k}{du} \tag{203}$$

This equation allows us to compute the parallel displaced vector along a given curve from some known starting point. In general the result of parallel displacement depends on the path taken.
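The differential equation for parallel displacement can be integrated numerically. The sketch below is an illustration under stated assumptions and not a method from the notes: it takes the unit sphere with its metric affinity ($\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$, $\Gamma^\phi_{\theta\phi} = \cos\theta/\sin\theta$), carries a vector once around the latitude circle $\theta = \theta_0$, $\phi = u$, and uses a classical fourth-order Runge-Kutta step. The function names are hypothetical.

```python
import math

def rhs(theta0, T):
    """dT^i/du = -Gamma^i_{jk} T^j dx^k/du on the circle theta = theta0, phi = u."""
    t_theta, t_phi = T
    s, c = math.sin(theta0), math.cos(theta0)
    # Gamma^theta_{phi phi} = -s*c, Gamma^phi_{theta phi} = c/s, dx^k/du = (0, 1)
    return (s * c * t_phi, -(c / s) * t_theta)

def transport_once_around(theta0, T0, steps=2000):
    """Integrate the transport equation over u in [0, 2*pi] with RK4."""
    h = 2.0 * math.pi / steps
    T = T0
    for _ in range(steps):
        k1 = rhs(theta0, T)
        k2 = rhs(theta0, (T[0] + 0.5 * h * k1[0], T[1] + 0.5 * h * k1[1]))
        k3 = rhs(theta0, (T[0] + 0.5 * h * k2[0], T[1] + 0.5 * h * k2[1]))
        k4 = rhs(theta0, (T[0] + h * k3[0], T[1] + h * k3[1]))
        T = (T[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             T[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return T

def length(theta0, T):
    """sqrt(g_ij T^i T^j) with g = diag(1, sin^2 theta) on the unit sphere."""
    return math.sqrt(T[0] ** 2 + (math.sin(theta0) * T[1]) ** 2)

theta0 = math.pi / 3            # latitude circle with cos(theta0) = 1/2
T_start = (1.0, 0.0)            # initially along the theta direction
T_end = transport_once_around(theta0, T_start)
T_equator = transport_once_around(math.pi / 2, T_start)
print(T_end, length(theta0, T_end))
print(T_equator)
```

On this particular circle the transported vector comes back reversed while keeping its length, whereas on the equator it returns unchanged, which illustrates the path dependence noted above.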

9.2

Geodesics

A geodesic is a generalisation of the straight line in a Euclidean space. A straight line is the shortest distance between two points and this suggests a variational definition of a geodesic through the Euler-Lagrange equations. We need to define distance and this is done through the metric in a Riemannian space. An alternative approach is to note that in a Euclidean space the tangent vector to a straight line remains parallel to itself when displaced along the line, i.e. the tangent vector to the line is the same at all points on the line. We will generalise this to a geodesic and require that the tangent vector to a curve, e.g. $\frac{dx^k}{du}$, is the same as the parallel displaced tangent vector $({}^{*}\frac{dx^k}{du})$. This requires that the tangent vector satisfies the differential equation eq.(203) and we obtain

$$\frac{d^2 x^i}{du^2} + \Gamma^i_{jk}\, \frac{dx^j}{du}\, \frac{dx^k}{du} = 0 \tag{204}$$

A curve which satisfies this equation is a geodesic. A parameter $u$ for which the form of this equation is unchanged is called an affine parameter. The relationship between affine parameters is linear.


This result agrees with the Euler-Lagrange equations eq.(118) provided we choose the affinity in a Riemannian space to be the Christoffel symbol of the second kind

$$\Gamma^i_{jk} = \begin{Bmatrix} i \\ j\; k \end{Bmatrix} \tag{205}$$
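The geodesic equation (204) can also be integrated directly. The following sketch assumes the unit sphere with its metric affinity and a classical Runge-Kutta integrator; the helper names and the starting values are illustrative only. Starting on the equator with a tilted tangent vector, the curve should trace a great circle and close after an affine-parameter interval of $2\pi$ divided by the (constant) speed.

```python
import math

def geodesic_rhs(state):
    """state = (theta, phi, dtheta/du, dphi/du); eq.(204) on the unit sphere."""
    theta, phi, vt, vp = state
    s, c = math.sin(theta), math.cos(theta)
    # d2theta/du2 = -Gamma^theta_{phi phi} vp^2      = s*c*vp^2
    # d2phi/du2   = -2 Gamma^phi_{theta phi} vt vp   = -2*(c/s)*vt*vp
    return (vt, vp, s * c * vp * vp, -2.0 * (c / s) * vt * vp)

def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, f):
        return tuple(x + f * kx for x, kx in zip(state, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(k3, h))
    return tuple(x + h * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed(state):
    """sqrt(g_ij x'^i x'^j), constant along a geodesic with affine parameter."""
    theta, _, vt, vp = state
    return math.sqrt(vt * vt + (math.sin(theta) * vp) ** 2)

start = (math.pi / 2, 0.0, 0.3, 1.0)        # on the equator, tilted tangent
u_total = 2.0 * math.pi / speed(start)      # parameter length of one great circle
steps = 4000
state = start
for _ in range(steps):
    state = rk4_step(state, u_total / steps)
print(state, speed(state))
```

The final state should reproduce the initial one apart from $\phi$ having advanced by $2\pi$, and the speed should be unchanged.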

10

Covariant derivative of relative tensors

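We wish to define the covariant derivative of relative tensors. As a first step consider the partial derivative of a relative scalar $T$ of weight $W$. Since

$$\bar{T} = J^W\, T \tag{206}$$

then the partial derivative of $T$ transforms as

$$\begin{aligned}
\bar{\partial}_k \bar{T} &= (\bar{\partial}_k J^W)\, T + J^W\, \bar{\partial}_k T \\
&= (\bar{\partial}_k J^W)\, T + J^W\, \frac{\partial x^a}{\partial \bar{x}^k}\, \partial_a T
\end{aligned} \tag{207}$$

The first term on the rhs involves

$$\bar{\partial}_k J^W = W J^{W-1}\, \bar{\partial}_k J \tag{208}$$

To continue we need to evaluate the derivative of the Jacobian $J$. Since $J = \det(J^i_j)$ where $J^i_j = \partial x^i / \partial \bar{x}^j$, then if $C^j_i$ is the cofactor of $J^i_j$ we have

$$J^i_j\, C^j_l = J\, \delta^i_l \tag{209}$$

Also

$$\frac{\partial x^i}{\partial \bar{x}^j}\, \frac{\partial \bar{x}^j}{\partial x^l} = \delta^i_l \tag{210}$$

so that

$$C^j_l = J\, \frac{\partial \bar{x}^j}{\partial x^l} \tag{211}$$

which is just the relation between the cofactor and the elements of the matrix inverse. Now

$$\bar{\partial}_k J = C^j_i\, \bar{\partial}_k J^i_j = J\, \frac{\partial \bar{x}^j}{\partial x^i}\, \frac{\partial^2 x^i}{\partial \bar{x}^k \partial \bar{x}^j} \tag{212}$$

Therefore the transformation of the partial derivative of $T$ becomes

$$\bar{\partial}_k \bar{T} = W J^W\, \frac{\partial \bar{x}^j}{\partial x^i}\, \frac{\partial^2 x^i}{\partial \bar{x}^k \partial \bar{x}^j}\, T + J^W\, \frac{\partial x^a}{\partial \bar{x}^k}\, \partial_a T \tag{213}$$

The first term on the rhs contains a second derivative. This second derivative can be replaced by affinities. The transformation law for an affinity is

$$\bar{\Gamma}^i_{jk} = \frac{\partial \bar{x}^i}{\partial x^a}\, \frac{\partial^2 x^a}{\partial \bar{x}^k \partial \bar{x}^j} + \frac{\partial \bar{x}^i}{\partial x^a}\, \frac{\partial x^b}{\partial \bar{x}^j}\, \frac{\partial x^c}{\partial \bar{x}^k}\, \Gamma^a_{bc} \tag{214}$$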


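Contracting gives

$$\begin{aligned}
\bar{\Gamma}^i_{ik} &= \frac{\partial \bar{x}^i}{\partial x^a}\, \frac{\partial^2 x^a}{\partial \bar{x}^k \partial \bar{x}^i} + \frac{\partial \bar{x}^i}{\partial x^a}\, \frac{\partial x^b}{\partial \bar{x}^i}\, \frac{\partial x^c}{\partial \bar{x}^k}\, \Gamma^a_{bc} \\
&= \frac{\partial \bar{x}^i}{\partial x^a}\, \frac{\partial^2 x^a}{\partial \bar{x}^k \partial \bar{x}^i} + \delta^b_a\, \frac{\partial x^c}{\partial \bar{x}^k}\, \Gamma^a_{bc} \\
&= \frac{\partial \bar{x}^i}{\partial x^a}\, \frac{\partial^2 x^a}{\partial \bar{x}^k \partial \bar{x}^i} + \frac{\partial x^c}{\partial \bar{x}^k}\, \Gamma^a_{ac}
\end{aligned} \tag{215}$$

Therefore

$$\frac{\partial \bar{x}^i}{\partial x^a}\, \frac{\partial^2 x^a}{\partial \bar{x}^k \partial \bar{x}^i} = \bar{\Gamma}^i_{ik} - \frac{\partial x^c}{\partial \bar{x}^k}\, \Gamma^a_{ac} \tag{216}$$

and the transformation of the derivative of $T$ becomes

$$\bar{\partial}_k \bar{T} = W J^W \left[ \bar{\Gamma}^i_{ik} - \frac{\partial x^c}{\partial \bar{x}^k}\, \Gamma^a_{ac} \right] T + J^W\, \frac{\partial x^a}{\partial \bar{x}^k}\, \partial_a T \tag{217}$$

Rearrange as

$$\bar{\partial}_k \bar{T} - W J^W\, \bar{\Gamma}^i_{ik}\, T = J^W\, \frac{\partial x^a}{\partial \bar{x}^k}\, \partial_a T - W J^W\, \frac{\partial x^c}{\partial \bar{x}^k}\, \Gamma^a_{ac}\, T \tag{218}$$

or

$$\bar{\partial}_k \bar{T} - W\, \bar{\Gamma}^i_{ik}\, \bar{T} = J^W\, \frac{\partial x^a}{\partial \bar{x}^k} \left[ \partial_a T - W\, \Gamma^b_{ba}\, T \right] \tag{219}$$

We define the covariant derivative of the relative scalar of weight $W$ to be

$$\nabla_k T = \partial_k T - W\, \Gamma^i_{ik}\, T \tag{220}$$

$\nabla_k T$ is a relative covariant vector of weight $W$.

This can be generalised to any relative tensor $T^{i_1 \cdots}_{j_1 \cdots}$ of weight $W$. The covariant derivative is the covariant derivative of the corresponding absolute tensor plus a term involving $W$ and the contracted affinity, i.e.

$$\nabla_k T^{i_1 i_2 \cdots}_{j_1 j_2 \cdots} = \partial_k T^{i_1 i_2 \cdots}_{j_1 j_2 \cdots} + \Gamma^{i_1}_{ak}\, T^{a\, i_2 \cdots}_{j_1 j_2 \cdots} + \cdots - \Gamma^a_{j_1 k}\, T^{i_1 i_2 \cdots}_{a\, j_2 \cdots} - \cdots - W\, \Gamma^a_{ak}\, T^{i_1 i_2 \cdots}_{j_1 j_2 \cdots} \tag{221}$$

If $T^i$ is a vector density (i.e. $W = 1$) then

$$\nabla_k T^i = \partial_k T^i + \Gamma^i_{ak}\, T^a - \Gamma^a_{ak}\, T^i \tag{222}$$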


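The divergence of $T^i$ is

$$\begin{aligned}
\nabla_i T^i &= \partial_i T^i + \Gamma^i_{ai}\, T^a - \Gamma^a_{ai}\, T^i \\
&= \partial_i T^i + \Gamma^a_{ia}\, T^i - \Gamma^a_{ai}\, T^i \\
&= \partial_i T^i + \left[ \Gamma^a_{ia} - \Gamma^a_{ai} \right] T^i
\end{aligned} \tag{223}$$

If the affinity is symmetric then

$$\nabla_i T^i = \partial_i T^i \tag{224}$$

If $T^{ij}$ is a skew-symmetric tensor density (i.e. $W = 1$) then

$$\nabla_k T^{ij} = \partial_k T^{ij} + \Gamma^i_{ak}\, T^{aj} + \Gamma^j_{ak}\, T^{ia} - \Gamma^a_{ak}\, T^{ij} \tag{225}$$

The divergence of $T^{ij}$ is

$$\begin{aligned}
\nabla_j T^{ij} &= \partial_j T^{ij} + \Gamma^i_{aj}\, T^{aj} + \Gamma^j_{aj}\, T^{ia} - \Gamma^a_{aj}\, T^{ij} \\
&= \partial_j T^{ij} + \Gamma^i_{aj}\, T^{aj} + \Gamma^a_{ja}\, T^{ij} - \Gamma^a_{aj}\, T^{ij} \\
&= \partial_j T^{ij} + \Gamma^i_{aj}\, T^{aj} + \left[ \Gamma^a_{ja} - \Gamma^a_{aj} \right] T^{ij}
\end{aligned} \tag{226}$$

If the affinity is symmetric then the bracket vanishes, and $\Gamma^i_{aj}\, T^{aj} = 0$ because a quantity symmetric in $a$ and $j$ is contracted with one that is skew-symmetric in $a$ and $j$, so that

$$\nabla_j T^{ij} = \partial_j T^{ij} \tag{227}$$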

11

Properties of the curvature tensor

11.1

Geodesic coordinates

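When the affinity is symmetric it is possible to find a coordinate system in which the affinity is zero at a particular point. This greatly facilitates the proof of tensor identities. These must hold in all coordinate systems and we are free to choose the most convenient for the proof. Let P be any point. We choose the coordinates such that P is the origin, i.e. at P $x^i = 0$. Consider the transformation

$$x^i = \bar{x}^i + \tfrac{1}{2}\, \lambda^i_{jk}\, \bar{x}^j \bar{x}^k \tag{228}$$

where we choose the constants $\lambda^i_{jk}$ to be symmetric in $j$ and $k$. P corresponds to $\bar{x}^i = 0$. Then we have

$$\frac{\partial x^i}{\partial \bar{x}^j} = \delta^i_j + \lambda^i_{jk}\, \bar{x}^k \qquad\text{and}\qquad \frac{\partial^2 x^i}{\partial \bar{x}^j \partial \bar{x}^k} = \lambda^i_{jk} \tag{229}$$

At the point P these become

$$\frac{\partial x^i}{\partial \bar{x}^j} = \delta^i_j \qquad\text{and}\qquad \frac{\partial^2 x^i}{\partial \bar{x}^j \partial \bar{x}^k} = \lambda^i_{jk} \tag{230}$$

Also at P, since

$$\frac{\partial \bar{x}^i}{\partial x^j}\, \frac{\partial x^j}{\partial \bar{x}^k} = \delta^i_k \quad\longrightarrow\quad \frac{\partial \bar{x}^i}{\partial x^k} = \delta^i_k \tag{231}$$

We now use these formulae in the transformation rule for an affinity eq.(151)

$$\bar{\Gamma}^i_{jk} = \frac{\partial \bar{x}^i}{\partial x^a}\, \frac{\partial^2 x^a}{\partial \bar{x}^j \partial \bar{x}^k} + \frac{\partial \bar{x}^i}{\partial x^a}\, \frac{\partial x^b}{\partial \bar{x}^j}\, \frac{\partial x^c}{\partial \bar{x}^k}\, \Gamma^a_{bc} \tag{232}$$

At the point P we have

$$\bar{\Gamma}^i_{jk} = \delta^i_a\, \lambda^a_{jk} + \delta^i_a\, \delta^b_j\, \delta^c_k\, \Gamma^a_{bc} \tag{233}$$

which reduces to

$$\bar{\Gamma}^i_{jk} = \lambda^i_{jk} + \Gamma^i_{jk} \tag{234}$$

Then if we choose

$$\lambda^i_{jk} = -\Gamma^i_{jk} \tag{235}$$

we have

$$\bar{\Gamma}^i_{jk} = 0 \tag{236}$$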


at the point P. Thus we can find a coordinate system, known as the geodesic coordinate system, in which the affinity is zero at the point P. The affinity must be symmetric since $\lambda^i_{jk}$ is symmetric. While the affinity is zero at P, the derivative of the affinity at P is in general not zero. In general any neighbouring point will have a non-zero affinity and the derivative is obtained by comparing neighbouring points. Consider the metric affinity used in a Riemannian space. Since by Ricci's theorem

$$\nabla_k g_{ij} = \partial_k g_{ij} - \Gamma^a_{ik}\, g_{aj} - \Gamma^a_{jk}\, g_{ia} = 0 \tag{237}$$

it follows that $\partial_k g_{ij} = 0$ at P when we use geodesic coordinates. By a similar argument $\partial_k g^{ij} = 0$. This implies that the metric is constant to first order about P, so that geodesic coordinates correspond to local Euclidean coordinates. However it may not be true that the second derivatives of the metric tensor are zero.

11.2

Bianchi identities

For a symmetric affinity we can use geodesic coordinates to prove the Bianchi identities (see Appendix B 'Odds and ends' note 11). The Bianchi identities involve covariant derivatives of the curvature tensor. They are given by

$$\nabla_m R^i{}_{jkl} + \nabla_k R^i{}_{jlm} + \nabla_l R^i{}_{jmk} = 0 \tag{238}$$

Note the cyclic change in the indices $klm$. The curvature tensor is

$$R^i{}_{jkl} = \partial_k \Gamma^i_{jl} - \partial_l \Gamma^i_{jk} + \Gamma^a_{jl}\, \Gamma^i_{ak} - \Gamma^a_{jk}\, \Gamma^i_{al} \tag{239}$$

When we take the covariant derivative of this there will be five terms in the result. The first is a partial derivative and the rest involve inner products of the tensor with the affinity. When we use geodesic coordinates terms involving the affinity will vanish but we cannot neglect partial derivatives of the affinity. Thus we obtain

$$\nabla_m R^i{}_{jkl} = \partial_m \partial_k \Gamma^i_{jl} - \partial_m \partial_l \Gamma^i_{jk} \tag{240}$$

Then

$$\begin{aligned}
\nabla_m R^i{}_{jkl} + \nabla_k R^i{}_{jlm} + \nabla_l R^i{}_{jmk} = {} & \partial_m \partial_k \Gamma^i_{jl} - \partial_m \partial_l \Gamma^i_{jk} \\
& + \partial_k \partial_l \Gamma^i_{jm} - \partial_k \partial_m \Gamma^i_{jl} \\
& + \partial_l \partial_m \Gamma^i_{jk} - \partial_l \partial_k \Gamma^i_{jm} \\
= {} & 0
\end{aligned} \tag{241}$$

since partial differentiation is commutative. We have proved these identities using particular coordinates, the geodesic coordinates. However as these are tensor identities they must hold in all coordinate systems at the particular point at which the geodesic coordinates are used. There is nothing special about this point, i.e. we can find geodesic coordinates for any point. Therefore the identities hold everywhere.


11.3

Symmetries of curvature tensor

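The curvature tensor is

$$R^i{}_{jkl} = \partial_k \Gamma^i_{jl} - \partial_l \Gamma^i_{jk} + \Gamma^a_{jl}\, \Gamma^i_{ak} - \Gamma^a_{jk}\, \Gamma^i_{al} \tag{242}$$

It is skew-symmetric in the indices $k$ and $l$. If the affinity is symmetric then we can use geodesic coordinates. It follows that

$$R^i{}_{jkl} = \partial_k \Gamma^i_{jl} - \partial_l \Gamma^i_{jk} \tag{243}$$

and

$$\begin{aligned}
R^i{}_{jkl} + R^i{}_{klj} + R^i{}_{ljk} = {} & \partial_k \Gamma^i_{jl} - \partial_l \Gamma^i_{jk} \\
& + \partial_l \Gamma^i_{kj} - \partial_j \Gamma^i_{kl} \\
& + \partial_j \Gamma^i_{lk} - \partial_k \Gamma^i_{lj} \\
= {} & 0
\end{aligned} \tag{244}$$

Note the cyclic change in the indices $jkl$.

Now choosing the metric affinity it is convenient to express the rest of the symmetry properties in terms of the fully covariant curvature tensor. This is obtained by lowering the contravariant index into the first covariant position.

$$R_{ijkl} = g_{ia}\, R^a{}_{jkl} \tag{245}$$

The metric affinity is symmetric and we can choose geodesic coordinates. Using these we obtain

$$R_{ijkl} = g_{ia} \left( \partial_k \Gamma^a_{jl} - \partial_l \Gamma^a_{jk} \right) \tag{246}$$

Since $\partial_k g_{ij} = 0$ and $\partial_k g^{ij} = 0$ when we use geodesic coordinates

$$\begin{aligned}
\partial_m \Gamma^i_{jk} &= \tfrac{1}{2}\, \partial_m \left[ g^{ia} \left( \partial_k g_{ja} + \partial_j g_{ak} - \partial_a g_{jk} \right) \right] \\
&= \tfrac{1}{2}\, g^{ia} \left( \partial_m \partial_k g_{ja} + \partial_m \partial_j g_{ak} - \partial_m \partial_a g_{jk} \right)
\end{aligned} \tag{247}$$

Then

$$\begin{aligned}
R_{ijkl} &= \tfrac{1}{2}\, g_{ia}\, g^{ab} \left( \partial_k \partial_l g_{jb} + \partial_k \partial_j g_{bl} - \partial_k \partial_b g_{jl} - \partial_l \partial_k g_{jb} - \partial_l \partial_j g_{bk} + \partial_l \partial_b g_{jk} \right) \\
&= \tfrac{1}{2}\, \delta^b_i \left( \partial_k \partial_l g_{jb} + \partial_k \partial_j g_{bl} - \partial_k \partial_b g_{jl} - \partial_l \partial_k g_{jb} - \partial_l \partial_j g_{bk} + \partial_l \partial_b g_{jk} \right) \\
&= \tfrac{1}{2} \left( \partial_k \partial_l g_{ji} + \partial_k \partial_j g_{il} - \partial_k \partial_i g_{jl} - \partial_l \partial_k g_{ji} - \partial_l \partial_j g_{ik} + \partial_l \partial_i g_{jk} \right) \\
&= \tfrac{1}{2} \left( \partial_k \partial_j g_{il} - \partial_k \partial_i g_{jl} - \partial_l \partial_j g_{ik} + \partial_l \partial_i g_{jk} \right)
\end{aligned} \tag{248}$$

The following symmetry properties are easily seen from this expression.

$$\begin{aligned}
R_{ijkl} &= -R_{jikl} && \text{skew-symmetric in } ij \\
&= -R_{ijlk} && \text{skew-symmetric in } kl \\
&= R_{klij} && \text{unchanged by swap of } ij \text{ and } kl
\end{aligned} \tag{249}$$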


These symmetry properties considerably reduce the number of independent components in the curvature tensor. By a careful count we find in $R_N$ there are $\tfrac{1}{12} N^2 (N^2 - 1)$ independent components.

• In $R_2$ there is only one independent component, i.e. $R_{1212}$. Remembering the symmetry relations, skew-symmetric in $i$ and $j$, and skew-symmetric in $k$ and $l$, we will require $i < j$ and $k < l$. There is only one possibility. The remaining symmetry relations add nothing further.

• In $R_3$ there are six non-zero independent components of the curvature tensor $R_{ijkl}$. Remembering the symmetry relations, skew-symmetric in $i$ and $j$, and skew-symmetric in $k$ and $l$, we will require $i < j$ and $k < l$. Also we have $R_{ijkl} = R_{klij}$. So to avoid duplication we will require $i \le k$ and $j \le l$. This gives

$$R_{1212} \qquad R_{1213} \qquad R_{1223} \qquad R_{1313} \qquad R_{1323} \qquad R_{2323} \tag{250}$$

We have not used the remaining symmetry property

$$R_{ijkl} + R_{iklj} + R_{iljk} = 0 \tag{251}$$

but this adds nothing further in this case. This is because the components listed above all have at least one coordinate that is repeated.

• In $R_4$ there are twenty independent components out of a possible $4^4 = 256$.
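The counts quoted above can be reproduced with a few lines of arithmetic. The sketch below evaluates $\tfrac{1}{12} N^2 (N^2 - 1)$ and, as a cross-check, an equivalent bookkeeping that is assumed here rather than taken from the notes: count the symmetric arrangements of skew index pairs, then subtract one cyclic relation for every set of four distinct index values. The function names are illustrative.

```python
from math import comb

def riemann_components_formula(n):
    """Independent components of R_ijkl in n dimensions: n^2 (n^2 - 1) / 12."""
    return n * n * (n * n - 1) // 12

def riemann_components_by_counting(n):
    """Cross-check by counting symmetries (assumed bookkeeping, see lead-in)."""
    pairs = n * (n - 1) // 2                        # skew index pairs with i < j
    symmetric_blocks = pairs * (pairs + 1) // 2     # R_ijkl = R_klij
    cyclic_constraints = comb(n, 4)                 # cyclic identity, 4 distinct indices
    return symmetric_blocks - cyclic_constraints

for n in (2, 3, 4):
    print(n, riemann_components_formula(n), riemann_components_by_counting(n))
```

For $N = 2, 3, 4$ both functions give 1, 6 and 20. The cyclic identity only starts to remove components at $N = 4$, where four distinct index values first become available.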


11.4

Summary

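Curvature tensor

$$R^i{}_{jkl} = \partial_k \Gamma^i_{jl} - \partial_l \Gamma^i_{jk} + \Gamma^a_{jl}\, \Gamma^i_{ak} - \Gamma^a_{jk}\, \Gamma^i_{al} \tag{252}$$

$$R_{ijkl} = g_{ia}\, R^a{}_{jkl} \tag{253}$$

$$\nabla_m R^i{}_{jkl} + \nabla_k R^i{}_{jlm} + \nabla_l R^i{}_{jmk} = 0 \tag{254}$$

$$R^i{}_{jkl} + R^i{}_{klj} + R^i{}_{ljk} = 0 \tag{255}$$

$$R_{ijkl} = -R_{ijlk} = -R_{jikl} = R_{klij} \tag{256}$$

In $R_N$ there are $\tfrac{1}{12} N^2 (N^2 - 1)$ independent components.

In $R_2$ there is only one independent component, i.e. $R_{1212}$. In $R_3$ there are six independent components, i.e. $R_{1212}$, $R_{1213}$, $R_{1223}$, $R_{1313}$, $R_{1323}$, $R_{2323}$.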

12

Ricci and Einstein tensors

12.1

Ricci tensor

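We will show that the Ricci tensor is symmetric in a Riemannian space.

First of all we will prove a result for the contracted metric affinity:

$$\Gamma^i_{ij} = \partial_j (\ln \sqrt{g}) \tag{257}$$

where $g = \det(g_{ij})$. Then

$$g\, \delta^l_i = G^{kl}\, g_{ik} \tag{258}$$

where $G^{kl}$ is the cofactor of element $g_{lk}$. Hence

$$\partial_j g = \frac{\partial g}{\partial g_{ik}}\, \partial_j g_{ik} = G^{ki}\, \partial_j g_{ik} \tag{259}$$

We know that $(g^{ik})$ is the inverse matrix of $(g_{ik})$ and therefore

$$G^{ki} = g\, g^{ki} \tag{260}$$

giving

$$\partial_j g = g\, g^{ki}\, \partial_j g_{ik} \tag{261}$$

By Ricci's Theorem

$$\nabla_j g_{ik} = \partial_j g_{ik} - \Gamma^a_{ij}\, g_{ak} - \Gamma^a_{kj}\, g_{ia} = 0 \tag{262}$$

giving

$$\begin{aligned}
\partial_j g &= g\, g^{ki} \left( \Gamma^a_{ij}\, g_{ak} + \Gamma^a_{kj}\, g_{ia} \right) \\
&= g \left( \Gamma^a_{ij}\, \delta^i_a + \Gamma^a_{kj}\, \delta^k_a \right) \\
&= g \left( \Gamma^i_{ij} + \Gamma^k_{kj} \right) \\
&= 2 g\, \Gamma^i_{ij}
\end{aligned} \tag{263}$$

It is neater to write this in terms of $\sqrt{g}$.

$$\partial_j (\sqrt{g}) = \frac{1}{2 \sqrt{g}}\, \partial_j g = \sqrt{g}\, \Gamma^i_{ij} \tag{264}$$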


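Rearrange as

$$\Gamma^i_{ij} = \frac{1}{\sqrt{g}}\, \partial_j (\sqrt{g}) = \partial_j (\ln \sqrt{g}) \tag{265}$$

Therefore

$$\partial_k \Gamma^i_{ij} = \partial_k \partial_j (\ln \sqrt{g}) \tag{266}$$

and it follows that $\partial_k \Gamma^i_{ij}$ is symmetric in $j$ and $k$.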
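As a quick check of the contracted-affinity formula, consider the spherical surface of radius $a$ with $x^1 = \theta$, $x^2 = \phi$, whose metric affinities are listed in eq.(296). The worked lines below are an illustration added for concreteness; they use only $g = a^4 \sin^2\theta$ and the affinities $\Gamma^2_{21} = \cos\theta / \sin\theta$, $\Gamma^1_{11} = \Gamma^1_{12} = \Gamma^2_{22} = 0$.

```latex
\begin{aligned}
\sqrt{g} &= a^2 \sin\theta, \qquad \ln\sqrt{g} = \ln a^2 + \ln \sin\theta, \\
\partial_1 (\ln\sqrt{g}) &= \frac{\cos\theta}{\sin\theta} = \Gamma^1_{11} + \Gamma^2_{21} = \Gamma^i_{i1}, \\
\partial_2 (\ln\sqrt{g}) &= 0 = \Gamma^1_{12} + \Gamma^2_{22} = \Gamma^i_{i2}.
\end{aligned}
```

Both components agree with the contracted affinity, as expected.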

$\sqrt{g}$ is a scalar density. The covariant derivative of a scalar density, $V$, is given by

$$\nabla_j V = \partial_j V - V\, \Gamma^i_{ij} \tag{267}$$

It follows that $\nabla_j (\sqrt{g}) = 0$.

The Ricci tensor is given by

$$\begin{aligned}
R_{jk} &= R^i{}_{jki} \\
&= \partial_k \Gamma^i_{ji} - \partial_i \Gamma^i_{jk} + \Gamma^i_{ak}\, \Gamma^a_{ji} - \Gamma^i_{ai}\, \Gamma^a_{jk}
\end{aligned} \tag{268}$$

When the affinity is symmetric then all terms on the right are symmetric in $j$ and $k$ except the first. For the metric affinity we have shown that this term is also symmetric. It follows that the Ricci tensor is symmetric in a Riemannian space. In $R_4$ the Ricci tensor will have ten independent components.

12.2

Divergence and Laplacian

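The divergence of a tensor field is obtained by taking the covariant derivative and then contracting with respect to the index of differentiation and any of the contravariant indices. Suppose $A^i$ is a contravariant vector then

$$\begin{aligned}
\nabla_i A^i &= \partial_i A^i + \Gamma^i_{ji}\, A^j \\
&= \partial_i A^i + \frac{1}{\sqrt{g}}\, \partial_j (\sqrt{g})\, A^j \\
&= \frac{1}{\sqrt{g}}\, \partial_i (\sqrt{g}\, A^i)
\end{aligned} \tag{269}$$

If $V$ is a scalar field then $A_i = \partial_i V$ is the gradient of $V$, a covariant vector. Then the associate contravariant vector is $A^i = g^{ij} A_j = g^{ij}\, \partial_j V$. The Laplacian of $V$ is then

$$\text{div grad}\, V = \frac{1}{\sqrt{g}}\, \partial_i (\sqrt{g}\, g^{ij}\, \partial_j V) \tag{270}$$

For a diagonal metric in 3 dimensions this formula reduces to

$$\nabla^2 V = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial x^1} \left( \frac{h_2 h_3}{h_1}\, \frac{\partial V}{\partial x^1} \right) + \frac{\partial}{\partial x^2} \left( \frac{h_3 h_1}{h_2}\, \frac{\partial V}{\partial x^2} \right) + \frac{\partial}{\partial x^3} \left( \frac{h_1 h_2}{h_3}\, \frac{\partial V}{\partial x^3} \right) \right] \tag{271}$$

where $h_1 = \sqrt{g_{11}}$ etc.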
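The diagonal-metric Laplacian (271) lends itself to a numerical spot check. The sketch below is a finite-difference illustration only: it assumes spherical polar coordinates $(r, \theta, \phi)$ with scale factors $h_1 = 1$, $h_2 = r$, $h_3 = r\sin\theta$, and tests two functions whose Cartesian Laplacians are known, $V = r^2$ (Laplacian 6) and $V = x^2 - y^2 = r^2 \sin^2\theta \cos 2\phi$ (Laplacian 0). The function names, sample point and step `h` are arbitrary choices.

```python
import math

def spherical_scale_factors(x):
    """(h1, h2, h3) for x = (r, theta, phi)."""
    r, theta, _ = x
    return (1.0, r, r * math.sin(theta))

def laplacian_orthogonal(V, scale, x, h=1e-3):
    """Finite-difference version of eq.(271) for a diagonal 3-d metric."""
    def flux(i, shift):
        # (h1 h2 h3 / h_i^2) * dV/dx^i, evaluated with x^i displaced by `shift`
        y = list(x)
        y[i] += shift
        hs = scale(y)
        yp, ym = list(y), list(y)
        yp[i] += 0.5 * h
        ym[i] -= 0.5 * h
        dV = (V(yp) - V(ym)) / h
        return hs[0] * hs[1] * hs[2] / hs[i] ** 2 * dV

    total = sum((flux(i, 0.5 * h) - flux(i, -0.5 * h)) / h for i in range(3))
    h1, h2, h3 = scale(x)
    return total / (h1 * h2 * h3)

def V_quadratic(x):
    """V = r^2, whose Laplacian is 6."""
    return x[0] ** 2

def V_harmonic(x):
    """V = r^2 sin^2(theta) cos(2 phi) = x^2 - y^2, whose Laplacian is 0."""
    r, theta, phi = x
    return (r * math.sin(theta)) ** 2 * math.cos(2.0 * phi)

point = [2.0, 1.0, 0.5]
lap_quadratic = laplacian_orthogonal(V_quadratic, spherical_scale_factors, point)
lap_harmonic = laplacian_orthogonal(V_harmonic, spherical_scale_factors, point)
print(lap_quadratic, lap_harmonic)
```

The two printed values should be close to 6 and 0 respectively, up to the discretisation error of the central differences.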

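As an example consider spherical polar coordinates $(r, \theta, \phi)$ with $h_1 = 1$, $h_2 = r$ and $h_3 = r \sin\theta$. Then

$$\begin{aligned}
\nabla^2 V &= \frac{1}{r^2 \sin\theta} \left[ \frac{\partial}{\partial r} \left( r^2 \sin\theta\, \frac{\partial V}{\partial r} \right) + \frac{\partial}{\partial \theta} \left( \sin\theta\, \frac{\partial V}{\partial \theta} \right) + \frac{\partial}{\partial \phi} \left( \frac{1}{\sin\theta}\, \frac{\partial V}{\partial \phi} \right) \right] \\
&= \frac{1}{r^2}\, \frac{\partial}{\partial r} \left( r^2\, \frac{\partial V}{\partial r} \right) + \frac{1}{r^2 \sin\theta}\, \frac{\partial}{\partial \theta} \left( \sin\theta\, \frac{\partial V}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta}\, \frac{\partial^2 V}{\partial \phi^2}
\end{aligned} \tag{272}$$

12.3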

Einstein tensor

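The mixed Ricci tensor is given by

$$R^m_k = g^{mj}\, R_{jk} \tag{273}$$

Since the covariant Ricci tensor is symmetric we do not need to offset the indices in the mixed tensor. The contraction of the mixed Ricci tensor is known as the Ricci scalar.

$$R = R^m_m = g^{mj}\, R_{jm} = g^{mj}\, R_{mj} \tag{274}$$

Consider the divergence of the mixed Ricci tensor

$$\begin{aligned}
\nabla_m R^m_k &= g^{mj}\, \nabla_m R_{jk} \\
&= g^{mj}\, \nabla_m R^l{}_{jkl} \\
&= g^{mj}\, g^{li}\, \nabla_m R_{ijkl} \\
&= g^{mj}\, g^{li} \left( -\nabla_k R_{ijlm} - \nabla_l R_{ijmk} \right) && \text{Bianchi} \\
&= g^{mj}\, g^{li} \left( -\nabla_k R_{lmij} - \nabla_l R_{mkij} \right) && \text{symmetry} \\
&= g^{mj}\, g^{li} \left( \nabla_k R_{mlij} - \nabla_l R_{mkij} \right) && \text{symmetry} \\
&= g^{li} \left( \nabla_k R^j{}_{lij} - \nabla_l R^j{}_{kij} \right) \\
&= g^{li} \left( \nabla_k R_{li} - \nabla_l R_{ki} \right) \\
&= \nabla_k R - \nabla_l R^l_k \\
&= \partial_k R - \nabla_m R^m_k && \text{dummy}
\end{aligned} \tag{275}$$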


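Therefore

$$\nabla_m R^m_k = \tfrac{1}{2}\, \partial_k R \tag{276}$$

The Einstein tensor is defined to be

$$G^i_j = R^i_j - \tfrac{1}{2}\, \delta^i_j\, R \tag{277}$$

The covariant Einstein tensor is symmetric. The divergence of the Einstein tensor is zero.

$$\begin{aligned}
\nabla_i G^i_j &= \nabla_i R^i_j - \tfrac{1}{2}\, \delta^i_j\, \partial_i R \\
&= \nabla_i R^i_j - \tfrac{1}{2}\, \partial_j R \\
&= 0
\end{aligned} \tag{278}$$

Also

$$\begin{aligned}
G = G^i_i &= R^i_i - \tfrac{1}{2}\, \delta^i_i\, R \\
&= -\tfrac{1}{2} (N - 2)\, R
\end{aligned} \tag{279}$$

where $N$ is the dimension of the space.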


12.4

Summary

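Ricci tensor

$$R_{jk} = R^i{}_{jki} = \partial_k \Gamma^i_{ji} - \partial_i \Gamma^i_{jk} + \Gamma^i_{ak}\, \Gamma^a_{ji} - \Gamma^i_{ai}\, \Gamma^a_{jk} \tag{280}$$

It is symmetric in a Riemannian space. This follows from

$$\Gamma^i_{ij} = \partial_j (\ln \sqrt{g}) \tag{281}$$

Divergence and Laplacian

$$\nabla_i A^i = \frac{1}{\sqrt{g}}\, \partial_i (\sqrt{g}\, A^i) \tag{282}$$

$$\text{div grad}\, V = \frac{1}{\sqrt{g}}\, \partial_i (\sqrt{g}\, g^{ij}\, \partial_j V) \tag{283}$$

Einstein tensor

$$R^i_j = g^{ai}\, R_{aj} \tag{284}$$

$$R = R^i_i \tag{285}$$

$$\nabla_i R^i_j = \tfrac{1}{2}\, \partial_j R \tag{286}$$

$$G^i_j = R^i_j - \tfrac{1}{2}\, \delta^i_j\, R \tag{287}$$

$$\nabla_i G^i_j = 0 \tag{288}$$

$$G = G^i_i = -\tfrac{1}{2} (N - 2)\, R \tag{289}$$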

13

Special spaces

13.1

Two-dimensional spaces

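The curvature tensor for a two-dimensional Riemannian space can always be written as

$$R_{ijkl} = K \left( g_{ik}\, g_{jl} - g_{il}\, g_{jk} \right) \tag{290}$$

where $K$ is the Riemann scalar. In general $K$ depends on the coordinates. The only independent component is $R_{1212}$ so that

$$R_{1212} = K \left( g_{11}\, g_{22} - g_{12}\, g_{21} \right) = K g \tag{291}$$

It follows that the Ricci tensor is given by

$$\begin{aligned}
R_{jk} &= R^a{}_{jka} \\
&= g^{ab}\, R_{bjka} \\
&= g^{ab}\, K \left( g_{bk}\, g_{ja} - g_{ba}\, g_{jk} \right) \\
&= K \left( \delta^a_k\, g_{ja} - \delta^a_a\, g_{jk} \right) \\
&= K \left( g_{jk} - 2\, g_{jk} \right) \\
&= -K\, g_{jk}
\end{aligned} \tag{292}$$

and the Ricci scalar by

$$\begin{aligned}
R^k_k &= g^{kj}\, R_{jk} \\
&= -K\, g^{kj}\, g_{jk} \\
&= -K\, \delta^k_k \\
&= -2K
\end{aligned} \tag{293}$$

The Einstein tensor is zero in a two-dimensional Riemannian space

$$\begin{aligned}
G_{ij} &= R_{ij} - \tfrac{1}{2}\, g_{ij}\, R \\
&= -K\, g_{ij} - \tfrac{1}{2}\, g_{ij}\, (-2K) \\
&= 0
\end{aligned} \tag{294}$$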

If K is constant throughout the space then the space has constant curvature.

13.2

Spherical surface in two-dimensions

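The metric for a two-dimensional spherical surface of radius $a$ is

$$(ds)^2 = a^2 \left[ (d\theta)^2 + \sin^2\theta\, (d\phi)^2 \right] \tag{295}$$

The metric affinities are

$$\begin{aligned}
\Gamma^1_{11} &= 0 & \Gamma^1_{12} = \Gamma^1_{21} &= 0 & \Gamma^1_{22} &= -\sin\theta \cos\theta \\
\Gamma^2_{11} &= 0 & \Gamma^2_{12} = \Gamma^2_{21} &= \frac{\cos\theta}{\sin\theta} & \Gamma^2_{22} &= 0
\end{aligned} \tag{296}$$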

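The curvature tensor is

$$R^i{}_{jkl} = \partial_k \Gamma^i_{jl} - \partial_l \Gamma^i_{jk} + \Gamma^a_{jl}\, \Gamma^i_{ak} - \Gamma^a_{jk}\, \Gamma^i_{al} \tag{297}$$

In a two-dimensional space there is only one independent component, $R_{1212}$,

$$R_{1212} = g_{1a}\, R^a{}_{212} \tag{298}$$

For the case of a spherical surface the metric is diagonal so that

$$\begin{aligned}
R_{1212} &= g_{11}\, R^1{}_{212} \\
&= g_{11} \left[ \partial_1 \Gamma^1_{22} - \partial_2 \Gamma^1_{21} + \Gamma^a_{22}\, \Gamma^1_{a1} - \Gamma^a_{21}\, \Gamma^1_{a2} \right]
\end{aligned} \tag{299}$$

Now substituting for the affinities gives

$$\begin{aligned}
R_{1212} &= g_{11} \left[ \partial_1 \Gamma^1_{22} - \Gamma^2_{21}\, \Gamma^1_{22} \right] \\
&= a^2 \left[ \frac{\partial}{\partial \theta} (-\sin\theta \cos\theta) - \frac{\cos\theta}{\sin\theta}\, (-\sin\theta \cos\theta) \right] \\
&= a^2 \left[ -\cos^2\theta + \sin^2\theta + \cos^2\theta \right] \\
&= a^2 \sin^2\theta
\end{aligned} \tag{300}$$

Since $R_{1212} = K g$ and $g = a^4 \sin^2\theta$ it follows that $K = \frac{1}{a^2}$.

Since $R_{1212} \neq 0$ the space is curved. It has constant curvature. When $a$ is small, the curvature is large. However for large $a$, $K$ tends to zero, and the curvature is small.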

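For a diagonal metric in two-dimensions the following formula gives $R_{1212}$ directly in terms of the metric.

$$R_{1212} = -\tfrac{1}{2} \sqrt{g} \left[ \partial_1 \left( \frac{1}{\sqrt{g}}\, \partial_1 g_{22} \right) + \partial_2 \left( \frac{1}{\sqrt{g}}\, \partial_2 g_{11} \right) \right] \tag{301}$$

For the spherical surface $\sqrt{g} = a^2 \sin\theta$ therefore

$$R_{1212} = -\tfrac{1}{2}\, a^2 \sin\theta \left[ \frac{\partial}{\partial \theta} \left( \frac{1}{a^2 \sin\theta}\, \frac{\partial}{\partial \theta}\, a^2 \sin^2\theta \right) + \frac{\partial}{\partial \phi} \left( \frac{1}{a^2 \sin\theta}\, \frac{\partial}{\partial \phi}\, a^2 \right) \right] \tag{302}$$

The second term on rhs gives zero and

$$\begin{aligned}
R_{1212} &= -\tfrac{1}{2}\, a^2 \sin\theta\, \frac{\partial}{\partial \theta} (2 \cos\theta) \\
&= a^2 \sin^2\theta
\end{aligned} \tag{303}$$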
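Formula (301) is easy to evaluate numerically for any diagonal two-dimensional metric. The sketch below is a finite-difference illustration, with hypothetical function names and an arbitrary radius $a = 2$; it applies the formula to the spherical surface and compares the result with $a^2 \sin^2\theta$ and $K = 1/a^2$.

```python
import math

def r1212_diagonal(metric, x, h=1e-3):
    """Finite-difference version of eq.(301); metric(x) returns (g_11, g_22)."""
    def sqrt_g(y):
        g11, g22 = metric(y)
        return math.sqrt(g11 * g22)

    def inner(i, shift):
        # (1/sqrt g) * partial_i g_jj, j being the other index,
        # evaluated with x^i displaced by `shift`
        y = list(x)
        y[i] += shift
        yp, ym = list(y), list(y)
        yp[i] += 0.5 * h
        ym[i] -= 0.5 * h
        other = 1 - i
        d_g = (metric(yp)[other] - metric(ym)[other]) / h
        return d_g / sqrt_g(y)

    bracket = sum((inner(i, 0.5 * h) - inner(i, -0.5 * h)) / h for i in range(2))
    return -0.5 * sqrt_g(x) * bracket

def sphere(x, a=2.0):
    """(g_11, g_22) for a sphere of radius a with x = (theta, phi)."""
    theta = x[0]
    return (a * a, (a * math.sin(theta)) ** 2)

x = [1.0, 0.3]
R = r1212_diagonal(sphere, x)
g11, g22 = sphere(x)
K = R / (g11 * g22)
print(R, 4.0 * math.sin(1.0) ** 2)   # R_1212 against a^2 sin^2(theta), a = 2
print(K)                              # Riemann scalar, to compare with 1/a^2
```

The same function can be pointed at other diagonal metrics by swapping `sphere` for another callable that returns the pair $(g_{11}, g_{22})$.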


13.3

Spaces with constant curvature

In a space with constant curvature the curvature tensor can be written as

$$R_{ijkl} = K \left( g_{ik}\, g_{jl} - g_{il}\, g_{jk} \right) \tag{304}$$

where $K$, the Riemann scalar, is a constant. It follows that the Ricci tensor is given by

$$R_{jk} = K (1 - N)\, g_{jk} \tag{305}$$

and the Ricci scalar by

$$R^k_k = K (1 - N)\, N \tag{306}$$
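The contraction leading from the constant-curvature form of $R_{ijkl}$ to the Ricci tensor and scalar can be checked by brute-force summation. The sketch below assumes a diagonal metric at a single point (so that the inverse metric is immediate) and uses the convention $R_{jk} = R^i{}_{jki} = g^{ia} R_{ajki}$ of these notes; the numerical values of $K$ and of the metric entries are arbitrary illustrative choices.

```python
def ricci_from_constant_curvature(g_diag, K):
    """Contract R_ijkl = K (g_ik g_jl - g_il g_jk) to R_jk = g^{ia} R_ajki.

    Assumption: the metric is diagonal, given by its diagonal entries,
    so that g^{ii} = 1 / g_ii.
    """
    n = len(g_diag)
    g = [[g_diag[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    g_inv = [[1.0 / g_diag[i] if i == j else 0.0 for j in range(n)]
             for i in range(n)]

    def riemann(i, j, k, l):
        return K * (g[i][k] * g[j][l] - g[i][l] * g[j][k])

    ricci = [[sum(g_inv[i][a] * riemann(a, j, k, i)
                  for i in range(n) for a in range(n))
              for k in range(n)]
             for j in range(n)]
    scalar = sum(g_inv[j][k] * ricci[j][k] for j in range(n) for k in range(n))
    return g, ricci, scalar

K, g_diag = 0.7, [1.0, 2.0, 0.5, 3.0]     # arbitrary values, N = 4
g, ricci, scalar = ricci_from_constant_curvature(g_diag, K)
N = len(g_diag)
print(ricci[1][1], K * (1 - N) * g[1][1])
print(scalar, K * (1 - N) * N)
```

Each printed pair should agree, reproducing $R_{jk} = K(1 - N)\, g_{jk}$ and $R^k_k = K(1 - N) N$ for this sample metric.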

An example of a 3-dimensional space with positive curvature is

$$(ds)^2 = (dr)^2 + a^2 \sin^2\!\left( \frac{r}{a} \right) \left[ (d\theta)^2 + \sin^2\theta\, (d\phi)^2 \right] \tag{307}$$

In this case you find $K = \frac{1}{a^2}$.

An example of a 3-dimensional space with negative curvature is

$$(ds)^2 = (dr)^2 + a^2 \sinh^2\!\left( \frac{r}{a} \right) \left[ (d\theta)^2 + \sin^2\theta\, (d\phi)^2 \right] \tag{308}$$

In this case you find $K = -\frac{1}{a^2}$. A flat space of course has constant curvature $K = 0$.


13.4

Summary

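Two-dimensional spaces

$$R_{ijkl} = K \left( g_{ik}\, g_{jl} - g_{il}\, g_{jk} \right) \tag{309}$$

$$R_{1212} = K g \tag{310}$$

$$R_{ij} = -K\, g_{ij} \tag{311}$$

$$R^i_i = -2K \tag{312}$$

$$G_{ij} = 0 \tag{313}$$

Spaces with constant curvature

$$R_{ijkl} = K \left( g_{ik}\, g_{jl} - g_{il}\, g_{jk} \right) \tag{314}$$

$$R_{ij} = -K (N - 1)\, g_{ij} \tag{315}$$

$$R^i_i = -K (N - 1)\, N \tag{316}$$

$$G_{ij} = \tfrac{1}{2}\, K (N - 2)(N - 1)\, g_{ij} \tag{317}$$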

14

Examples of curved two-dimensional surfaces

14.1

Paraboloid

In this example we will consider the curvature of a paraboloid. This is obtained by rotating a parabola in the x–z plane about the z–axis.

14.1.1

Paraboloidal coordinates

Paraboloidal coordinates $(\alpha, \beta, \phi)$ are related to Cartesian coordinates $(x, y, z)$ by

$$x = \alpha\beta \cos\phi \qquad y = \alpha\beta \sin\phi \qquad z = \tfrac{1}{2} (\alpha^2 - \beta^2) \tag{318}$$

where $\phi \in [0, 2\pi)$, $\alpha \ge 0$ and $\beta \ge 0$. The dependence on $\phi$ is characteristic of coordinates that have rotational symmetry about an axis.

Let $\rho = \sqrt{x^2 + y^2} = \alpha\beta$, which is independent of $\phi$. Consider the ρ–z plane with $\rho \ge 0$. We take $\beta$ to be constant and eliminate $\alpha$, giving an equation relating $\rho$ and $z$

$$\rho^2 = 2\beta^2 (z + \tfrac{1}{2} \beta^2) \tag{319}$$

This is a parabola.

14.1.2

Properties of a parabola

A parabola is the locus of points equidistant from a fixed point, the focus, and a given straight line, the directrix. It has eccentricity $e = 1$. The straight line through the focus and perpendicular to the directrix is the axis. The axis intersects the parabola at the vertex. The straight line joining the focus and the parabola that is perpendicular to the axis is the semi-latus rectum. The distance from the focus to the vertex is half the semi-latus rectum. Usually a parabola is discussed using the equation $y^2 = 4ax$. In this case the focus is at $(a, 0)$, the vertex is at $(0, 0)$, the directrix is the line $x = -a$ and the axis is the line $y = 0$. See figure 1. If the vertex is moved to the point $(x_0, y_0)$ then the equation becomes

$$(y - y_0)^2 = 4a (x - x_0) \tag{320}$$

14.1.3

Coordinate surface — paraboloid

The parabola

$$\rho^2 = 2\beta^2 (z + \tfrac{1}{2} \beta^2) \tag{321}$$

is symmetric about the z–axis and crosses the z–axis at $z = -\tfrac{1}{2} \beta^2$. This point is the vertex of the parabola. When $z = 0$ we have $\rho = \pm\beta^2$. Therefore the origin is the focus of the parabola. The parameter $\alpha$ ranges from 0 to $\infty$. As $\alpha$ increases a point on the curve moves in the positive $\rho$ direction with $\alpha = 0$ corresponding to the vertex.

[Figure 1: Parabola $y^2 = 4ax$ with $a = 2$, showing the axis, vertex, focus and directrix]

As $\beta$ varies we generate a family of parabolas with a common focus (confocal), namely the origin. The curves for different $\beta$ do not overlap. As $\beta$ increases the vertex moves to a more negative value of $z$ and the parabola opens out. See figure 2.

The case $\beta = 0$ is special. Returning to the original coordinate definition we see that this corresponds to $x = 0$, $y = 0$, $z = \tfrac{1}{2} \alpha^2$. As the parameter $\alpha$ varies this will give the positive z–axis including the origin when $\alpha = 0$.

Thus any point in the ρ–z plane can be allocated coordinates $\alpha$ and $\beta$. The value of $\beta > 0$ identifies a parabola and the $\alpha \ge 0$ value gives the position on it. For the origin and positive z–axis we have $\beta = 0$ and $\alpha \ge 0$.

We have not considered $\phi$. Our treatment is independent of $\phi$. This corresponds to symmetry about the z–axis. Therefore when $\beta > 0$ is constant, it corresponds to the surface of revolution obtained by rotating the coordinate curve in the ρ–z plane about the z–axis.

This gives a paraboloid of revolution.

[Figure 2: Family of confocal parabolas labelled by $\beta$, plotted in the ρ–z plane]

14.1.4

Line-element

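To determine the line-element we use

$$\begin{aligned}
x = \alpha\beta \cos\phi \quad &\longrightarrow \quad dx = \beta \cos\phi\, d\alpha + \alpha \cos\phi\, d\beta - \alpha\beta \sin\phi\, d\phi \\
y = \alpha\beta \sin\phi \quad &\longrightarrow \quad dy = \beta \sin\phi\, d\alpha + \alpha \sin\phi\, d\beta + \alpha\beta \cos\phi\, d\phi \\
z = \tfrac{1}{2} (\alpha^2 - \beta^2) \quad &\longrightarrow \quad dz = \alpha\, d\alpha - \beta\, d\beta
\end{aligned} \tag{322}$$

After some algebra the 3-dimensional line-element in terms of $(\alpha, \beta, \phi)$ is

$$\begin{aligned}
(ds)^2 &= (dx)^2 + (dy)^2 + (dz)^2 \\
&= (\alpha^2 + \beta^2)\, (d\alpha)^2 + (\alpha^2 + \beta^2)\, (d\beta)^2 + (\alpha^2 \beta^2)\, (d\phi)^2
\end{aligned} \tag{323}$$

This metric is diagonal, corresponding to orthogonal curvilinear coordinates. When $\beta$ is constant, $d\beta = 0$, and we obtain the line-element for the 2-dimensional surface of a paraboloid of revolution.

$$(ds)^2 = (\alpha^2 + \beta^2)\, (d\alpha)^2 + (\alpha^2 \beta^2)\, (d\phi)^2 \tag{324}$$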

14.1.5

Determination of R1212 and K

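For a diagonal metric in 2-dimensions the following formula gives $R_{1212}$ directly in terms of the metric.

$$R_{1212} = -\tfrac{1}{2} \sqrt{g} \left[ \partial_1 \left( \frac{1}{\sqrt{g}}\, \partial_1 g_{22} \right) + \partial_2 \left( \frac{1}{\sqrt{g}}\, \partial_2 g_{11} \right) \right] \tag{325}$$

For the paraboloid of revolution $\sqrt{g} = \alpha\beta \sqrt{\alpha^2 + \beta^2}$ and the metric is independent of $\phi$, therefore

$$\begin{aligned}
R_{1212} &= -\tfrac{1}{2} \sqrt{g}\, \partial_1 \left( \frac{1}{\sqrt{g}}\, \partial_1 g_{22} \right) \\
&= -\tfrac{1}{2}\, \alpha\beta \sqrt{\alpha^2 + \beta^2}\; \frac{\partial}{\partial \alpha} \left( \frac{1}{\alpha\beta \sqrt{\alpha^2 + \beta^2}}\, \frac{\partial}{\partial \alpha}\, \alpha^2 \beta^2 \right) \\
&= -\tfrac{1}{2}\, \alpha\beta \sqrt{\alpha^2 + \beta^2}\; \frac{\partial}{\partial \alpha} \left( \frac{2\beta}{\sqrt{\alpha^2 + \beta^2}} \right) \\
&= \frac{\alpha^2 \beta^2}{\alpha^2 + \beta^2}
\end{aligned} \tag{326}$$

Since $R_{1212} = K g$ it follows that

$$K = \frac{1}{(\alpha^2 + \beta^2)^2} \tag{327}$$

14.1.6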

Curvature of paraboloid

In discussing curvature we must exclude $\beta = 0$, which is not a surface. For fixed $\beta > 0$ the maximum curvature occurs when $\alpha = 0$. This is at the vertex of the parabola. Considering now variation of $\beta$, we want small values of $\beta$ for large curvature. These are parabolas whose vertices approach the focus at the origin. These are very narrow parabolas with a sharp turning at the vertex. As $\beta$ increases from small values the parabola opens out, giving a wider base with decreased curvature. Large values of $\alpha$ correspond to points well away from the vertex, the wings of the parabola. In this region the parabola becomes flatter.

If we had held α constant rather than β then the family of parabolas would simply be flipped upside down. The smiley faces change into sad ones.
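The result $K = 1/(\alpha^2 + \beta^2)^2$ can be compared with an independent expression. The sketch below uses the standard Gaussian curvature of a surface of revolution $z = f(\rho)$, namely $f' f'' / (\rho\, (1 + f'^2)^2)$; this formula is brought in from outside the notes for the comparison. From eq.(319) the paraboloid is $z = \rho^2 / (2\beta^2) - \tfrac{1}{2}\beta^2$ with $\rho = \alpha\beta$. The function names and sample values are illustrative, and $\alpha > 0$ is assumed so that $\rho \neq 0$.

```python
def K_notes(alpha, beta):
    """Eq.(327): Riemann scalar of the paraboloid beta = const."""
    return 1.0 / (alpha ** 2 + beta ** 2) ** 2

def K_surface_of_revolution(alpha, beta):
    """Gaussian curvature f' f'' / (rho (1 + f'^2)^2) of z = f(rho).

    Here f(rho) = rho^2 / (2 beta^2) - beta^2 / 2 and rho = alpha * beta,
    with alpha > 0 assumed so that rho is non-zero.
    """
    rho = alpha * beta
    f1 = rho / beta ** 2        # f'(rho)
    f2 = 1.0 / beta ** 2        # f''(rho)
    return f1 * f2 / (rho * (1.0 + f1 ** 2) ** 2)

samples = [(0.5, 1.0), (1.0, 0.5), (2.0, 3.0)]
for alpha, beta in samples:
    print(alpha, beta, K_notes(alpha, beta), K_surface_of_revolution(alpha, beta))
```

The two columns of curvature values should coincide at every sample point, and both decrease as either $\alpha$ or $\beta$ grows, in line with the discussion above.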


14.2

Ellipsoid

In this example we will consider the curvature of an ellipsoid. This is obtained by rotating an ellipse in the x–z plane about the z–axis.

14.2.1

Prolate spheroidal coordinates

Prolate spheroidal coordinates $(\alpha, \beta, \phi)$ are related to Cartesian coordinates $(x, y, z)$ by

$$x = \sinh\alpha \sin\beta \cos\phi \qquad y = \sinh\alpha \sin\beta \sin\phi \qquad z = \cosh\alpha \cos\beta \tag{328}$$

where $\phi \in [0, 2\pi)$, $\alpha \ge 0$ and $\beta \in [0, \pi]$. The dependence on $\phi$ is characteristic of coordinates that have rotational symmetry about an axis.

Let $\rho = \sqrt{x^2 + y^2} = \sinh\alpha \sin\beta$, which is independent of $\phi$. Consider the ρ–z plane with $\rho \ge 0$. We take $\alpha > 0$ to be constant. We can eliminate $\beta$ and obtain an equation relating $\rho$ and $z$. This is

$$\frac{\rho^2}{\sinh^2\alpha} + \frac{z^2}{\cosh^2\alpha} = 1 \tag{329}$$

This is an ellipse with centre at the origin.

14.2.2

Properties of an ellipse

An ellipse is the locus of all points whose distance from a fixed point, the focus, is $e$ ($< 1$) times the distance to a fixed line, the directrix. $e$ is known as the eccentricity. There is another fixed point and line which gives the same ellipse. An ellipse has two foci. The line joining the foci intersects the ellipse at the vertices. The line joining the vertices is the major axis. The midpoint of the major axis is the centre of the ellipse. A line perpendicular to the major axis, through the centre, intersects the ellipse at two points. The line joining the two intersections is the minor axis. A line perpendicular to the major axis, through a focus, intersects the ellipse at two points. The line joining the two intersections is the latus rectum. Usually an ellipse is discussed using the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \tag{330}$$

with $a > b$. The centre of the ellipse is $(0, 0)$, the major axis lies on the x–axis and has length $2a$, the minor axis lies on the y–axis and has length $2b$, the foci are at $(\pm ae, 0)$, the vertices at $(\pm a, 0)$ and the directrices are the lines $x = \pm a/e$. The eccentricity is given by

$$b^2 = a^2 (1 - e^2) \qquad\text{or}\qquad e^2 = 1 - \frac{b^2}{a^2} \tag{331}$$

See figure 3.

[Figure 3: Ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$, drawn with $a = 6$, $b = 4.8$, $e = 0.6$, showing the major and minor axes, centre, foci, vertex and directrices]

14.2.3

Coordinate surface — ellipsoid

Now consider

$$\frac{\rho^2}{\sinh^2\alpha} + \frac{z^2}{\cosh^2\alpha} = 1 \tag{332}$$

Since $\cosh\alpha > \sinh\alpha$ the major axis is along the z–axis. This is the meaning of prolate, the coordinate surface is stretched along the z–axis. Oblate coordinates have surfaces stretched in the x–y plane. Since $a = \cosh\alpha$ and $b = \sinh\alpha$ we have

$$e = \frac{1}{\cosh\alpha} \qquad\text{and}\qquad ae = 1 \tag{333}$$

Therefore the curve is an ellipse with centre at the origin and foci at $z = \pm 1$. The parameter $\beta$ ranges from 0 to $\pi$

$$\begin{aligned}
\beta = 0 \quad &\longrightarrow \quad (\rho, z) = (0, \cosh\alpha) && \text{positive end of major axis} \\
\beta = \tfrac{\pi}{2} \quad &\longrightarrow \quad (\rho, z) = (\sinh\alpha, 0) && \text{positive end of minor axis} \\
\beta = \pi \quad &\longrightarrow \quad (\rho, z) = (0, -\cosh\alpha) && \text{negative end of major axis}
\end{aligned} \tag{334}$$

Thus as $\beta$ increases from 0, the point starts on the positive z–axis and moves clockwise around the ellipse. As $\alpha > 0$ varies we generate a family of ellipses with common foci (confocal) and centre. The curves for different $\alpha$ do not overlap. As $\alpha$ increases the ellipse enlarges and the eccentricity tends to zero, giving a circle. See figure 4.

[Figure 4: Family of confocal ellipses labelled by $\alpha$, plotted in the ρ–z plane with the foci marked at $z = -1$ and $z = +1$. Legend ($\alpha$, $e$): (0.2, 0.98), (0.5, 0.89), (1, 0.65), (1.5, 0.43), (2, 0.27)]

The case $\alpha = 0$ is special. Returning to the original coordinate definition we see that this corresponds to $x = 0$, $y = 0$, $z = \cos\beta$. As the parameter $\beta$ varies this will give a straight line along the z–axis joining the points $z = 1$ and $z = -1$.

Thus any point in the ρ–z plane can be allocated coordinates $\alpha$ and $\beta$. The value of $\alpha > 0$ identifies an ellipse and the $0 \le \beta \le \pi$ value gives the position on it. For the line joining $z = 1$ to $z = -1$ we have $\alpha = 0$ and $0 \le \beta \le \pi$.

We have not considered $\phi$. Our treatment is independent of $\phi$. This corresponds to symmetry about the z–axis. Therefore when $\alpha > 0$ is constant, it corresponds to the surface of revolution obtained by rotating the coordinate curve in the ρ–z plane about the z–axis. This gives an ellipsoid of revolution.


14.2.4

Line-element

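To determine the line-element we use the differentials of

$$x = \sinh\alpha \sin\beta \cos\phi \qquad y = \sinh\alpha \sin\beta \sin\phi \qquad z = \cosh\alpha \cos\beta \tag{335}$$

$$\begin{aligned}
dx &= \cosh\alpha \sin\beta \cos\phi\, d\alpha + \sinh\alpha \cos\beta \cos\phi\, d\beta - \sinh\alpha \sin\beta \sin\phi\, d\phi \\
dy &= \cosh\alpha \sin\beta \sin\phi\, d\alpha + \sinh\alpha \cos\beta \sin\phi\, d\beta + \sinh\alpha \sin\beta \cos\phi\, d\phi \\
dz &= \sinh\alpha \cos\beta\, d\alpha - \cosh\alpha \sin\beta\, d\beta
\end{aligned} \tag{336}$$

After some algebra the 3-dimensional line-element in terms of $(\alpha, \beta, \phi)$ is

$$\begin{aligned}
(ds)^2 &= (dx)^2 + (dy)^2 + (dz)^2 \\
&= (\sinh^2\alpha + \sin^2\beta)\, (d\alpha)^2 + (\sinh^2\alpha + \sin^2\beta)\, (d\beta)^2 + (\sinh^2\alpha \sin^2\beta)\, (d\phi)^2
\end{aligned} \tag{337}$$

This metric is diagonal, corresponding to orthogonal curvilinear coordinates. When $\alpha$ is constant, $d\alpha = 0$, and we obtain the line-element for the 2-dimensional surface of an ellipsoid of revolution.

$$(ds)^2 = (\sinh^2\alpha + \sin^2\beta)\, (d\beta)^2 + (\sinh^2\alpha \sin^2\beta)\, (d\phi)^2 \tag{338}$$

14.2.5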

Determination of R1212 and K

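For the ellipsoid of revolution $\sqrt{g} = \sinh\alpha \sin\beta \sqrt{\sinh^2\alpha + \sin^2\beta}$ and the metric is independent of φ, therefore

\[
\begin{aligned}
R_{1212} &= -\tfrac{1}{2} \sqrt{g}\; \partial_1 \left( \frac{1}{\sqrt{g}}\, \partial_1 g_{22} \right) \\
&= -\tfrac{1}{2} \sinh\alpha \sin\beta \sqrt{\sinh^2\alpha + \sin^2\beta} \times \frac{\partial}{\partial\beta} \left( \frac{1}{\sinh\alpha \sin\beta \sqrt{\sinh^2\alpha + \sin^2\beta}} \, \frac{\partial}{\partial\beta} \left( \sinh^2\alpha \sin^2\beta \right) \right) \\
&= -\tfrac{1}{2} \sinh\alpha \sin\beta \sqrt{\sinh^2\alpha + \sin^2\beta} \; \frac{\partial}{\partial\beta} \left( \frac{2 \sinh\alpha \cos\beta}{\sqrt{\sinh^2\alpha + \sin^2\beta}} \right) \\
&= -\sinh^2\alpha \sin\beta \sqrt{\sinh^2\alpha + \sin^2\beta} \times \left( -\frac{\sin\beta}{\sqrt{\sinh^2\alpha + \sin^2\beta}} - \frac{\sin\beta \cos^2\beta}{(\sinh^2\alpha + \sin^2\beta)^{3/2}} \right) \\
&= \frac{\sinh^2\alpha \sin^2\beta}{\sinh^2\alpha + \sin^2\beta} \left[ \sinh^2\alpha + \sin^2\beta + \cos^2\beta \right] \\
&= \frac{\sinh^2\alpha \cosh^2\alpha \sin^2\beta}{\sinh^2\alpha + \sin^2\beta}
\end{aligned}
\tag{339}
\]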


Since $R_{1212} = K g$ it follows that

\[
K = \frac{\cosh^2\alpha}{(\sinh^2\alpha + \sin^2\beta)^2}
\tag{340}
\]

14.2.6 Curvature of ellipsoid

In discussing curvature we must exclude α = 0. This is not a surface.

Therefore for fixed α > 0 the stationary values of curvature will occur when $dK/d\beta = 0$. This gives

\[
\cos\beta \sin\beta = 0
\tag{341}
\]

Therefore curvature is stationary when β = 0 (and likewise β = π), the ends of the major axis, and at β = π/2, the end of the minor axis. A sketch of K as a function of β shows that the maximum curvature is at the end of the major axis and the minimum curvature is at the end of the minor axis. See figure 5.

[Figure 5: K as a function of β for various α (β horizontal, K vertical). The curves shown have α = 0.8, 0.9, 1, 1.5, 2 with eccentricities e = 0.75, 0.70, 0.65, 0.43, 0.27.]

The curvature ranges from a maximum of

\[
K_{max} = \frac{\cosh^2\alpha}{\sinh^4\alpha}
\tag{342}
\]

to a minimum of

\[
K_{min} = \frac{1}{\cosh^2\alpha}
\tag{343}
\]

When α is small, sinh α approaches zero and the curvature becomes large. The ellipse looks like a pencil or a very thin Cookstown sausage. As α increases from small values the ellipse becomes like a circle and

\[
\sinh\alpha \longrightarrow \cosh\alpha
\quad \text{giving} \quad
K \longrightarrow \frac{1}{\cosh^2\alpha}
\tag{344}
\]

corresponding to a circle of radius cosh α.
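
The result (340) can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `k_formula` and `k_from_metric` and the step size `h` are illustrative assumptions. It evaluates $R_{1212}$ from the formula in (339) by central differences and compares $K = R_{1212}/g$ with the closed form, then compares the values at β = 0 and β = π/2 with (342) and (343).

```python
import math

def k_formula(alpha, beta):
    """Gaussian curvature of the ellipsoid alpha = const, eq. (340)."""
    s = math.sinh(alpha) ** 2 + math.sin(beta) ** 2
    return math.cosh(alpha) ** 2 / s ** 2

def k_from_metric(alpha, beta, h=1e-4):
    """K = R1212 / g, with R1212 = -1/2 sqrt(g) d1((1/sqrt(g)) d1 g22)
    evaluated by central differences in beta (h is an assumed step size)."""
    def g11(b):
        return math.sinh(alpha) ** 2 + math.sin(b) ** 2

    def g22(b):
        return math.sinh(alpha) ** 2 * math.sin(b) ** 2

    def inner(b):
        # (1 / sqrt(g)) * d(g22)/d(beta)
        dg22 = (g22(b + h) - g22(b - h)) / (2 * h)
        return dg22 / math.sqrt(g11(b) * g22(b))

    g = g11(beta) * g22(beta)
    r1212 = -0.5 * math.sqrt(g) * (inner(beta + h) - inner(beta - h)) / (2 * h)
    return r1212 / g

alpha = 1.0
k_max = math.cosh(alpha) ** 2 / math.sinh(alpha) ** 4   # eq. (342), at beta = 0
k_min = 1.0 / math.cosh(alpha) ** 2                     # eq. (343), at beta = pi/2

print("closed form vs finite differences:", k_formula(alpha, 0.7), k_from_metric(alpha, 0.7))
print("K at beta = 0    vs K_max:", k_formula(alpha, 0.0), k_max)
print("K at beta = pi/2 vs K_min:", k_formula(alpha, math.pi / 2), k_min)
```

The finite-difference check is only valid away from β = 0 and β = π, where $g_{22}$ vanishes and the coordinate system is singular.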

14.3 Hyperboloid

This case can be treated by considering prolate spheroidal coordinates (α, β, φ) and letting β be constant. As before we have

\[
x = \sinh\alpha \sin\beta \cos\phi, \qquad
y = \sinh\alpha \sin\beta \sin\phi, \qquad
z = \cosh\alpha \cos\beta
\tag{345}
\]

where φ ∈ [0, 2π), α ≥ 0 and β ∈ [0, π]. Let $\rho = \sqrt{x^2 + y^2} = \sinh\alpha \sin\beta$. Consider the ρ–z plane with ρ ≥ 0.

We can eliminate α and obtain an equation relating ρ and z

\[
\frac{z^2}{\cos^2\beta} - \frac{\rho^2}{\sin^2\beta} = 1
\tag{346}
\]

This is a hyperbola with centre at the origin.

14.3.1 Properties of a hyperbola

A hyperbola is the locus of all points whose distance from a fixed point, the focus, is e (> 1) times the distance to a fixed line, the directrix. e is known as the eccentricity. There is another fixed point and line which gives the same hyperbola. A hyperbola has two foci. Also the curve is in two distinct parts, called branches. The line joining the foci intersects the hyperbola at the vertices. The line joining the vertices is the transverse axis. The midpoint of the transverse axis is the centre of the hyperbola. The line perpendicular to the transverse axis, through the centre, is the conjugate axis. This line does not intersect the hyperbola. A line perpendicular to the transverse axis, through a focus, intersects the hyperbola at two points. The line joining the two intersections is the latus rectum. Usually a hyperbola is discussed using the equation

\[
\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1
\tag{347}
\]

The centre of the hyperbola is (0, 0), the transverse axis lies on the x–axis and has length 2a, the conjugate axis lies on the y–axis and has length 2b, the foci are at (±ae, 0), the vertices at (±a, 0) and the directrices are the lines x = ±a/e. The eccentricity is given by

\[
b^2 = a^2 (e^2 - 1)
\quad \text{or} \quad
e^2 = 1 + \frac{b^2}{a^2}
\tag{348}
\]

The hyperbola has two asymptotes given by y = ±(b/a)x. Of special interest is the rectangular hyperbola which has a = b, $e = \sqrt{2}$ and perpendicular asymptotes. When $e < \sqrt{2}$ then b < a, and when $e > \sqrt{2}$ then b > a. See figure 6.

[Figure 6: Hyperbola $x^2/a^2 - y^2/b^2 = 1$, drawn for a = 6, b = 6.7, e = 1.5, with the centre, vertex, focus, transverse axis, conjugate axis, directrix and asymptote labelled.]

14.3.2

Coordinate surface — hyperboloid

Now consider

\[
\frac{z^2}{\cos^2\beta} - \frac{\rho^2}{\sin^2\beta} = 1
\tag{349}
\]

The transverse axis is along the z–axis.


Since a = |cos β| and b = sin β we have

\[
e = \frac{1}{|\cos\beta|}
\quad \text{and} \quad
ae = 1
\tag{350}
\]

Therefore the curve is a hyperbola with centre at the origin and foci at z = ±1.

As α increases from 0, the point starts at a vertex, z = cos β, the branch depends on the value of β, and moves along the curve. Remember that the curve is restricted to the half-plane ρ ≥ 0.

As β varies we generate a family of hyperbolas with common foci (confocal) and centre. The curves for different β do not overlap. The following table gives the vertex and eccentricity of the hyperbolas shown in figure 7.

β      vertex cos β    eccentricity e
0.2       0.9801           1.0203
0.4       0.9211           1.0857
0.6       0.8253           1.2116
0.8       0.6967           1.4353
1.0       0.5403           1.8508
1.2       0.3624           2.7597
1.4       0.1700           5.8835
1.6      −0.0292          34.2471
1.8      −0.2272           4.4014
2.0      −0.4161           2.4030
2.2      −0.5885           1.6992
2.4      −0.7374           1.3561
2.6      −0.8569           1.1670
2.8      −0.9422           1.0613
3.0      −0.9900           1.0101
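
The tabulated values follow directly from a = |cos β| and e = 1/|cos β|. As a short sketch (the helper name `hyperbola_row` is an illustrative assumption, not from the notes), they can be regenerated as follows.

```python
import math

def hyperbola_row(beta):
    """Return (beta, vertex, eccentricity) for the hyperbola beta = const."""
    vertex = math.cos(beta)            # vertex lies at z = cos(beta)
    eccentricity = 1.0 / abs(vertex)   # e = 1 / |cos(beta)|, so that a e = 1
    return beta, vertex, eccentricity

# beta from 0.2 to 3.0 in steps of 0.2, as in the table
rows = [hyperbola_row(0.2 * k) for k in range(1, 16)]

print("beta   vertex   eccentricity")
for beta, vertex, e in rows:
    print(f"{beta:4.1f} {vertex:8.4f} {e:10.4f}")
```

The large eccentricity near β = 1.6 reflects how close that value is to π/2, where the hyperbola degenerates to the ρ–axis.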

We must exclude:

• β = 0, which gives ρ = 0 and z = cosh α, a straight line lying on the z–axis with z ≥ 1
• β = π, which gives ρ = 0 and z = − cosh α, a straight line lying on the z–axis with z ≤ −1
• β = π/2, which gives ρ = sinh α and z = 0, a straight line lying on the ρ–axis with ρ ≥ 0

Thus any point in the ρ–z plane can be allocated coordinates α and β. The value of β identifies a hyperbola and the α ≥ 0 value gives the position on it.

We have not considered φ. Our treatment is independent of φ. This corresponds to symmetry about the z–axis. Therefore when β is constant, it corresponds to the surface of revolution obtained by rotating the coordinate curve in the ρ–z plane about the z–axis. This gives a hyperboloid of revolution.

[Figure 7: Family of confocal hyperbolas labelled by β, plotted in the ρ–z plane (ρ horizontal, z vertical); β varies from 0.2 to 3 in steps of 0.2.]

14.3.3 Line-element

The 3-dimensional line-element in terms of (α, β, φ) is

\[
\begin{aligned}
(ds)^2 &= (dx)^2 + (dy)^2 + (dz)^2 \\
&= (\sinh^2\alpha + \sin^2\beta)\,(d\alpha)^2 + (\sinh^2\alpha + \sin^2\beta)\,(d\beta)^2 + (\sinh^2\alpha \sin^2\beta)\,(d\phi)^2
\end{aligned}
\tag{351}
\]

When β is constant, dβ = 0, and we obtain the line-element for the 2-dimensional surface of a hyperboloid of revolution.

\[
(ds)^2 = (\sinh^2\alpha + \sin^2\beta)\,(d\alpha)^2 + (\sinh^2\alpha \sin^2\beta)\,(d\phi)^2
\tag{352}
\]

14.3.4 Determination of $R_{1212}$ and K

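For the hyperboloid of revolution $\sqrt{g} = \sinh\alpha \sin\beta \sqrt{\sinh^2\alpha + \sin^2\beta}$ and the metric is independent of φ, therefore

\[
\begin{aligned}
R_{1212} &= -\tfrac{1}{2} \sqrt{g}\; \partial_1 \left( \frac{1}{\sqrt{g}}\, \partial_1 g_{22} \right) \\
&= -\tfrac{1}{2} \sinh\alpha \sin\beta \sqrt{\sinh^2\alpha + \sin^2\beta} \times \frac{\partial}{\partial\alpha} \left( \frac{1}{\sinh\alpha \sin\beta \sqrt{\sinh^2\alpha + \sin^2\beta}} \, \frac{\partial}{\partial\alpha} \left( \sinh^2\alpha \sin^2\beta \right) \right) \\
&= -\tfrac{1}{2} \sinh\alpha \sin\beta \sqrt{\sinh^2\alpha + \sin^2\beta} \; \frac{\partial}{\partial\alpha} \left( \frac{2 \cosh\alpha \sin\beta}{\sqrt{\sinh^2\alpha + \sin^2\beta}} \right) \\
&= -\sinh\alpha \sin^2\beta \sqrt{\sinh^2\alpha + \sin^2\beta} \times \left( \frac{\sinh\alpha}{\sqrt{\sinh^2\alpha + \sin^2\beta}} - \frac{\sinh\alpha \cosh^2\alpha}{(\sinh^2\alpha + \sin^2\beta)^{3/2}} \right) \\
&= -\frac{\sinh^2\alpha \sin^2\beta}{\sinh^2\alpha + \sin^2\beta} \left[ \sinh^2\alpha + \sin^2\beta - \cosh^2\alpha \right] \\
&= \frac{\sinh^2\alpha \cos^2\beta \sin^2\beta}{\sinh^2\alpha + \sin^2\beta}
\end{aligned}
\tag{353}
\]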

Since $R_{1212} = K g$ it follows that

\[
K = \frac{\cos^2\beta}{(\sinh^2\alpha + \sin^2\beta)^2}
\tag{354}
\]

14.3.5 Curvature of hyperboloid

In discussing curvature we must exclude β = 0, π. These are not surfaces. Therefore for fixed β the stationary values of curvature will occur when $dK/d\alpha = 0$. This gives

\[
\cosh\alpha \sinh\alpha = 0
\tag{355}
\]

Therefore curvature is stationary when α = 0, which is the vertex. This is a position of maximum curvature with

\[
K_{max} = \frac{\cos^2\beta}{\sin^4\beta}
\tag{356}
\]

When sin β is small, i.e. β is near 0 or π, $K_{max}$ takes large values. When cos β is small, i.e. β is near π/2, $K_{max}$ is small. When α is large, K approaches zero.
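
The behaviour described above can be illustrated numerically. This is a minimal sketch under assumed sample values (β = 0.6 and the grid of α values are arbitrary choices, and the function names are hypothetical). It tabulates K from (354) along one hyperboloid and compares the value at the vertex with (356).

```python
import math

def k_hyperboloid(alpha, beta):
    """Gaussian curvature of the hyperboloid beta = const, eq. (354)."""
    s = math.sinh(alpha) ** 2 + math.sin(beta) ** 2
    return math.cos(beta) ** 2 / s ** 2

def k_vertex(beta):
    """Maximum curvature at the vertex alpha = 0, eq. (356)."""
    return math.cos(beta) ** 2 / math.sin(beta) ** 4

beta = 0.6                                   # assumed sample value
alphas = [0.1 * k for k in range(0, 51)]     # alpha from 0 to 5
curvatures = [k_hyperboloid(a, beta) for a in alphas]

for a, k in zip(alphas[::10], curvatures[::10]):
    print(f"alpha = {a:3.1f}   K = {k:.6e}")
print("vertex value from eq. (356):", k_vertex(beta))
```

Along the sampled curve K decreases monotonically from its vertex value and is already very small at α = 5, consistent with the limit stated in the text.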


15 Cartesian tensors

15.1 Introduction

Cartesian tensors are a special case in which we restrict the type of coordinates to be Cartesian in a Euclidean space $E_N$. Despite the restriction Cartesian tensors are still useful and benefit from a number of simplifications.

1. There is no distinction between covariant and contravariant Cartesian tensors.

2. The notation only requires subscripts for the components.

3. The summation convention is modified so that repeated subscripts imply a summation.

4. The metric is the Kronecker delta $\delta_{ij}$ in all Cartesian coordinate systems.

5. We only allow coordinate transformations that preserve the Cartesian form of the line element.

\[
x_i \longrightarrow \bar{x}_i
\quad \text{is such that} \quad
\bar{x}_i \bar{x}_i = x_i x_i
\tag{357}
\]

6. The required transformation is

\[
\bar{x}_i = a_{ij} x_j
\quad \text{with} \quad
a_{ij} a_{ik} = \delta_{jk}
\tag{358}
\]

where $a_{ij}$ is independent of the coordinates and the matrix $(a_{ij})$ is an orthogonal matrix.

7. The inverse transformation is

\[
x_i = a_{ji} \bar{x}_j
\quad \text{since} \quad
(a_{ij})^{-1} = (a_{ij})^T = (a_{ji})
\tag{359}
\]

8. The coefficients that arise in the transformation of Cartesian tensors are

\[
\frac{\partial \bar{x}_i}{\partial x_j} = \frac{\partial x_j}{\partial \bar{x}_i} = a_{ij}
\tag{360}
\]

9. The position $x_i$ is a Cartesian vector.

10. The Jacobian is $\det(a_{ij}) = \pm 1$. A proper transformation has $\det(a_{ij}) = 1$. An improper transformation has $\det(a_{ij}) = -1$.

11. Relative Cartesian tensors only have weight 0 or 1. A weight 1 tensor is called a Cartesian tensor density.

12. The Kronecker delta $\delta_{ij}$ is an isotropic Cartesian tensor of rank 2.

13. The permutation symbol $e_{i_1 i_2 \ldots i_N}$ is an isotropic Cartesian tensor density of rank N in $E_N$.

14. Since the metric is constant it follows that the metric affinity is zero. Covariant differentiation reduces to partial differentiation.


15.2 Orthogonal transformations

15.2.1 Orthogonality conditions

A linear, homogeneous transformation of the coordinates is given by:

\[
\bar{x}_i = a_{ij} x_j
\tag{361}
\]

or $\bar{x} = A x$ in matrix form where A has elements $a_{ij}$ and x is a column vector. This transformation leaves the origin unchanged. We are interested in a transformation which leaves

\[
s^2 = x_i x_i
\tag{362}
\]

unchanged or invariant. $s^2$ is the distance squared of the point from the origin. This transformation would then correspond to a rotation or reflection in $E_3$.

\[
\begin{aligned}
0 &= \bar{x}_i \bar{x}_i - x_i x_i \\
&= a_{ij} x_j \, a_{ik} x_k - x_i x_i \\
&= (a_{ij} a_{ik} - \delta_{jk}) \, x_j x_k
\end{aligned}
\tag{363}
\]

Since this holds for all $x_j$ then

\[
a_{ij} a_{ik} = \delta_{jk}
\tag{364}
\]

which follows from the quantity in brackets being symmetric. These are known as the orthogonality conditions.

The orthogonality conditions give N(N + 1)/2 constraints on the $N^2$ parameters $a_{ij}$ that define the transformation. There are therefore N(N − 1)/2 independent parameters representing the transformation.

15.2.2 Orthogonal matrices

In matrix form the orthogonality conditions can be written as $A^T A = I$. This shows that the columns of A are orthonormal [14]. Consider

\[
\begin{aligned}
1 &= \det(I) \\
&= \det(A^T A) \\
&= \det(A^T) \det(A) \\
&= \det(A)^2
\end{aligned}
\tag{365}
\]

Therefore det(A) = ±1.

[14] The inner product of a column with itself gives unity, while the inner product with another column gives zero.
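
The conditions $A^T A = I$ and det(A) = ±1 are easy to verify for concrete matrices. The sketch below is illustrative only: the helper names, the angle 0.3 and the choice of a reflection in the x–y plane are assumptions, not taken from the notes. It checks a rotation about the z–axis and the same rotation followed by a reflection.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def is_identity(m, tol=1e-12):
    n = len(m)
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

theta = 0.3                                   # assumed sample angle
c, s = math.cos(theta), math.sin(theta)
rotation = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
improper = matmul(reflection, rotation)

for name, a in (("rotation", rotation), ("improper", improper)):
    print(name, "A^T A = I:", is_identity(matmul(transpose(a), a)), " det:", round(det3(a), 12))
```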


Since det(A) is non-zero the inverse exists and

\[
A^T = A^T A A^{-1} = (A^T A) A^{-1} = A^{-1}
\tag{366}
\]

It follows that $A A^T = I$ giving $a_{ji} a_{ki} = \delta_{jk}$. These conditions show that the rows of A are orthonormal. A is an orthogonal matrix and the corresponding transformation is called orthogonal.

When det(A) = 1 the transformation is called proper (e.g. a rotation in $E_3$).

When det(A) = −1 the transformation is improper (e.g. reflection in $E_3$).

15.2.3 Orthogonal group

The orthogonal transformations form a group. This can be demonstrated by showing that the product of two orthogonal transformations is itself an orthogonal transformation (closure property of groups) i.e. satisfies the orthogonality conditions. Let $c_{ik} = a_{ij} b_{jk}$ be the product of two orthogonal transformations. Then

\[
\begin{aligned}
c_{ik} c_{il} &= a_{ij} b_{jk} \, a_{im} b_{ml} \\
&= \delta_{jm} b_{jk} b_{ml} \\
&= b_{mk} b_{ml} \\
&= \delta_{kl}
\end{aligned}
\tag{367}
\]

and the c coefficients satisfy the orthogonality conditions. Also the product is associative, each transformation has an inverse and the identity transformation is orthogonal. Proper orthogonal transformations, which contain the identity transformation, form a subgroup.

15.2.4 Rotations in 2 dimensions

When N = 2, a 2-dimensional plane, there is 1 parameter. For proper transformations, det(A) = 1, this is an angle of rotation θ. The general form of A is

\[
A = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
\tag{368}
\]

This corresponds to a positive [15] rotation in the x–y plane about the z-axis in a right-handed Cartesian coordinate system.

\[
\begin{aligned}
\bar{x} &= \cos\theta \, x + \sin\theta \, y \\
\bar{y} &= -\sin\theta \, x + \cos\theta \, y
\end{aligned}
\tag{369}
\]

[15] A positive rotation is anti-clockwise about the axis, i.e. it follows the right-hand corkscrew rule.


Note that the position vector r remains unchanged and we have rotated the coordinate system. This is the convention we will adopt. An equivalent viewpoint would be a clockwise rotation of r with the coordinate system held fixed. It is easy to show that two successive rotations by $\theta_1$ and $\theta_2$ produce a rotation by $\theta_1 + \theta_2$. This shows that the transformations are commutative in this case.

15.2.5 Rotations in 3 dimensions

When N = 3 there are 3 parameters. For proper transformations, det(A) = 1, these can be taken to be the Euler angles [16]. The Euler angles are (α, β, γ). There are three successive rotations. Starting with (x, y, z) apply rotation α about the z-axis. This gives $(x_1, y_1, z)$. Now apply rotation β about the $x_1$-axis. This gives $(x_1, y_2, z_2)$. Finally apply rotation γ about the $z_2$-axis. This gives $(x_3, y_3, z_2)$.

Alternatively the 3 parameters can be taken to be an angle θ about a direction n. Since $n = (n_1, n_2, n_3)$ is a unit vector then it corresponds to 2 parameters since

\[
n_1^2 + n_2^2 + n_3^2 = 1
\tag{370}
\]

The general form of A for a rotation in $E_3$ is

\[
a_{ij} = n_i n_j (1 - \cos\theta) + \cos\theta \, \delta_{ij} + e_{ijk} n_k \sin\theta
\tag{371}
\]

If n = (0, 0, 1) i.e. the z-axis then A becomes

\[
A = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\tag{372}
\]
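
Formula (371) can be explored numerically. The following sketch is not from the notes: the function names, the angle 0.4 and the unit vector (1/3, 2/3, 2/3) are illustrative assumptions. It builds $a_{ij}$ from (371), confirms that n = (0, 0, 1) reproduces (372), and checks for a general axis that the matrix is orthogonal and leaves n unchanged.

```python
import math

def levi_civita(i, j, k):
    """Permutation symbol e_ijk for indices taking the values 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def rotation_matrix(n, theta):
    """a_ij from eq. (371) for a unit vector n and rotation angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[n[i] * n[j] * (1 - c)
             + (c if i == j else 0.0)
             + s * sum(levi_civita(i, j, k) * n[k] for k in range(3))
             for j in range(3)] for i in range(3)]

theta = 0.4                                        # assumed sample angle
about_z = rotation_matrix((0.0, 0.0, 1.0), theta)  # should reproduce eq. (372)

n = (1 / 3, 2 / 3, 2 / 3)                          # unit vector: 1/9 + 4/9 + 4/9 = 1
a = rotation_matrix(n, theta)

# a_ij n_j, expected to equal n_i
a_n = [sum(a[i][j] * n[j] for j in range(3)) for i in range(3)]
# a_ij a_ik, expected to equal delta_jk (orthogonality conditions)
ata = [[sum(a[i][j] * a[i][k] for i in range(3)) for k in range(3)] for j in range(3)]

print("about z:", [[round(x, 6) for x in row] for row in about_z])
print("A n    :", [round(x, 12) for x in a_n])
print("A^T A  :", [[round(x, 12) for x in row] for row in ata])
```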

It is straightforward to show that $a_{ij} n_j = n_i$, i.e. n is a vector unchanged by the transformation. This should be recognised as an eigenvalue equation i.e.

\[
(a_{ij} - \delta_{ij}) \, n_j = 0
\tag{373}
\]

Then n is an eigenvector of A with eigenvalue 1. 3-dimensional rotations in general do not commute.

15.2.6 Eigenvalues of orthogonal matrices

In general the eigenvalues of a real orthogonal matrix are such that |λ| = 1, i.e. they have unit modulus and we can write them as λ = exp(ix). To see this, note that the solutions of the secular equation

\[
\det(A - \lambda I) = 0
\tag{374}
\]

can be complex and it follows that the eigenvectors can be complex.

\[
A X = \lambda X
\tag{375}
\]

Taking the complex transpose (or hermitian conjugate), denoted by $H$, gives

\[
\begin{aligned}
(A X)^H &= X^H A^H \\
&= X^H A^T \\
&= \lambda^* X^H
\end{aligned}
\tag{376}
\]

for a real, orthogonal matrix where the complex conjugate is denoted by $*$. Then

\[
\begin{aligned}
X^H A^T A X &= X^H X \\
&= \lambda^* \lambda \, X^H X
\end{aligned}
\tag{377}
\]

giving $\lambda^* \lambda = 1$ or |λ| = 1 when $X^H X \neq 0$. Also if λ is an eigenvalue then $\lambda^*$ is also an eigenvalue, so that the complex eigenvalues occur in conjugate pairs. The secular equation has the form

\[
\lambda^N + a_{N-1} \lambda^{N-1} + \ldots + a_1 \lambda + a_0 = 0
\tag{378}
\]

where for a real matrix the coefficients $a_i$ are real. Taking the complex conjugate of this equation and using $(\lambda^N)^* = (\lambda^*)^N$ it is clear that $\lambda^*$ is also a solution. It follows that when N = 3 (i.e. in $E_3$) at least one of the eigenvalues must be real.

Consider $(A - I) A^T = I - A^T$ and take the determinant of both sides to obtain

\[
\begin{aligned}
\det(A - I) \det(A) &= \det(I - A) \\
&= (-1)^N \det(A - I)
\end{aligned}
\tag{379}
\]

where we have used $\det(A^T) = \det(A)$ and $(I - A)^T = I - A^T$. Therefore

\[
\det(A - I) \left[ \det(A) - (-1)^N \right] = 0
\tag{380}
\]

so that det(A − I) = 0 (1 is an eigenvalue) when det(A) = 1 and N is odd (e.g. proper transformations in $E_3$) or when det(A) = −1 and N is even.

[16] On the Euler angles, see ‘Classical Mechanics’ by Goldstein for a discussion of the various conventions. Also ‘Classical Mechanics’ by Corben and Stehle.

15.2.7 Summary of properties of orthogonal matrices

1. columns are orthonormal
2. rows are orthonormal
3. $A^{-1} = A^T$
4. det(A) = ±1, +1 for proper, −1 for improper
5. form a group, proper is a subgroup

15.3 Tensors

We know how the position vector transforms under orthogonal transformations:

\[
\bar{x}_i = a_{ij} x_j
\tag{381}
\]

This has the property of leaving $s^2 = x_i x_i$ unchanged or invariant.

Tensors can be regarded as a generalisation of vectors. They are defined by their transformation properties. In an N-dimensional space any set of $N^t$ quantities $T_{\ldots i \ldots j \ldots}$ that transform according to:

\[
\bar{T}_{\ldots b \ldots c \ldots} = \ldots a_{bi} \ldots a_{cj} \ldots T_{\ldots i \ldots j \ldots}
\tag{382}
\]

are said to be the components of a Cartesian tensor of rank t. There are t indices and $a_{ij}$ occurs t times. The transformation law is linear, homogeneous and transitive.

The inverse transformation for eq.(381) is

\[
x_i = a_{ji} \bar{x}_j
\tag{383}
\]

which follows from $A^{-1} = A^T$. Similarly the inverse transformation for eq.(382) is

\[
T_{\ldots b \ldots c \ldots} = \ldots a_{ib} \ldots a_{jc} \ldots \bar{T}_{\ldots i \ldots j \ldots}
\tag{384}
\]

1. A zero rank tensor, known as a scalar, is invariant: $\bar{T} = T$.

2. A vector is a first rank tensor: $\bar{T}_i = a_{ij} T_j$.

3. A second rank tensor transforms as

\[
\bar{T}_{ij} = a_{ik} a_{jm} T_{km}
\tag{385}
\]

We can write this in matrix form as

\[
\bar{T} = A T A^T = A T A^{-1}
\tag{386}
\]

which is a similarity transformation of the matrix T. We can manipulate tensors of rank up to and including 2 as matrices. However a tensor is more than a matrix. It is defined in terms of its transformation properties. We can define the trace, determinant, eigenvalues and eigenvectors of a tensor of rank 2 in the usual way. The trace, determinant and eigenvalues are scalars. For example

\[
\begin{aligned}
\det(\bar{T}) &= \det(A T A^T) \\
&= \det(A) \det(T) \det(A^T) \\
&= \det(T)
\end{aligned}
\tag{387}
\]
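
The invariance of the trace and determinant under (385) can be seen in a small numerical experiment. This is a sketch with assumed inputs: the orthogonal matrix is built from two arbitrary rotations, the tensor components are arbitrary sample numbers, and all function names are hypothetical.

```python
import math

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def trace(m):
    return sum(m[i][i] for i in range(3))

def transform_rank2(a, t):
    """Transformed components a_ik a_jm T_km, as in eq. (385)."""
    return [[sum(a[i][k] * a[j][m] * t[k][m] for k in range(3) for m in range(3))
             for j in range(3)] for i in range(3)]

a = matmul(rot_x(0.5), rot_z(1.1))                         # a proper orthogonal matrix
t = [[2.0, 1.0, 0.0], [-1.0, 3.0, 4.0], [0.5, 0.0, 1.0]]   # arbitrary sample components
t_bar = transform_rank2(a, t)

print("trace:", trace(t), trace(t_bar))
print("det  :", det3(t), det3(t_bar))
```

The individual components of `t_bar` differ from those of `t`; only the scalar combinations agree.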


15.4 Tensor algebra

The algebra of tensors agrees with the general case and includes the following:

• Addition and subtraction
• Outer product
• Inner product
• Contraction
• Symmetry and skew-symmetry
• Quotient rule
• Tensor equations

15.5 Tensor densities

A Cartesian tensor density transforms in the same way as a Cartesian tensor except for a factor det(A) on the right-hand side in the transformation law.

\[
\bar{T}_{\ldots b \ldots c \ldots} = \det(A) \ldots a_{bi} \ldots a_{cj} \ldots T_{\ldots i \ldots j \ldots}
\tag{388}
\]

Since det(A) = ±1 it follows that a tensor density behaves like a tensor for proper transformations when det(A) = 1.

1. The sum or difference of two densities of the same rank is a density of that rank.

2. The outer product of a tensor and a tensor density is a tensor density (since there is det(A) in the transformation law).

3. The outer product of two tensor densities is a tensor (since there is $\det(A)^2$ in the transformation law).

4. A contracted tensor density is a tensor density (since there is det(A) in the transformation law).

15.6 Isotropic tensors

15.6.1 Definition

An isotropic tensor or tensor density has its components unchanged by an orthogonal coordinate transformation. We exclude the tensor in which all the components are zero.

15.6.2 Isotropic vectors

There is no isotropic tensor of rank 1. To show this suppose that $X_i \neq 0$ is an isotropic vector. Then it transforms as

\[
\bar{X}_i = X_i = \delta_{ij} X_j = a_{ij} X_j
\tag{389}
\]

or

\[
(a_{ij} - \delta_{ij}) \, X_j = 0
\tag{390}
\]

In matrix form this can be written as

\[
(A - I) \, X = 0
\tag{391}
\]

where A is an orthogonal matrix and I is the identity matrix. If $\det(A - I) \neq 0$ then the only solution is X = 0.

We must have det(A − I) = 0 which implies that λ = 1 is an eigenvalue of A and X is the corresponding eigenvector. In general the eigenvalues of an orthogonal matrix are such that |λ| = 1 i.e. we can write them as $\lambda = e^{ix}$. It does not follow however that every orthogonal matrix has an eigenvalue λ = 1. If $X_i$ is an isotropic vector then λ = 1 is an eigenvalue for all orthogonal matrices. It is quite easy to get a counter example. This is given by the transformation $a_{ij} = -\delta_{ij}$ which are the elements of an orthogonal matrix with eigenvalues λ = −1.

15.6.3 Kronecker delta

Let $\delta_{ij}$ (the Kronecker delta), defined in a particular coordinate system, be a tensor of rank 2. Then it is isotropic since using the orthogonality conditions

\[
\begin{aligned}
\bar{\delta}_{ij} &= a_{ib} a_{jc} \delta_{bc} \\
&= a_{ib} a_{jb} \\
&= \delta_{ij}
\end{aligned}
\tag{392}
\]

It is symmetric.

15.6.4 Levi-Civita tensor density

Let $e_{ijk}$ (the permutation symbol), defined in a particular coordinate system in $E_3$, be a tensor density of rank 3. Then it is isotropic since

\[
\begin{aligned}
\bar{e}_{ijk} &= \det(A) \, a_{ib} a_{jc} a_{kd} \, e_{bcd} \\
&= \det(A)^2 \, e_{ijk} \\
&= e_{ijk}
\end{aligned}
\tag{393}
\]

It is skew-symmetric. Similarly it can be shown that in $E_N$ the corresponding permutation symbol with N indices is a skew-symmetric, isotropic tensor density of rank N. This is known as the Levi-Civita [17] tensor density.

15.6.5 Isotropic tensors in $E_3$

In $E_3$ there is:

1. no isotropic vector

2. 1 isotropic tensor of rank 2, $\delta_{ij}$

3. 1 isotropic tensor density of rank 3, $e_{ijk}$

4. isotropic tensors of rank 4 have the form

\[
\alpha \, \delta_{ij} \delta_{kl} + \beta \, \delta_{ik} \delta_{jl} + \gamma \, \delta_{il} \delta_{jk}
\tag{394}
\]

where α, β and γ are invariants.

5. isotropic tensor densities of rank 5 are linear combinations of 10 terms of the form $e_{ijk} \delta_{lm}$.

6. isotropic tensors of rank 6 are linear combinations of 15 terms of the form $\delta_{ij} \delta_{kl} \delta_{mn}$.
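
The isotropy of $\delta_{ij}$ as a tensor and of $e_{ijk}$ as a tensor density can be checked by applying the transformation laws directly. The sketch below uses assumed sample matrices (a product of two rotations, and the same followed by a reflection) and hypothetical helper names. It reports the largest deviation of the transformed components from the original ones.

```python
import math

R3 = range(3)

def delta(i, j):
    return 1.0 if i == j else 0.0

def levi_civita(i, j, k):
    """Permutation symbol e_ijk for indices taking the values 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in R3) for j in R3] for i in R3]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def delta_deviation(a):
    """Largest |a_ib a_jc delta_bc - delta_ij| (tensor law, rank 2)."""
    return max(abs(sum(a[i][b] * a[j][c] * delta(b, c) for b in R3 for c in R3) - delta(i, j))
               for i in R3 for j in R3)

def epsilon_deviation(a):
    """Largest |det(A) a_ib a_jc a_kd e_bcd - e_ijk| (density law, rank 3)."""
    d = det3(a)
    return max(abs(d * sum(a[i][b] * a[j][c] * a[k][e] * levi_civita(b, c, e)
                           for b in R3 for c in R3 for e in R3) - levi_civita(i, j, k))
               for i in R3 for j in R3 for k in R3)

proper = matmul(rot_x(0.5), rot_z(1.1))
improper = matmul([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]], proper)

for name, a in (("proper", proper), ("improper", improper)):
    print(name, delta_deviation(a), epsilon_deviation(a))
```

Dropping the factor det(A) in `epsilon_deviation` would make the improper case fail, which is why the permutation symbol is a density rather than a tensor.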

15.7 Tensor fields and calculus

If a tensor is defined at every point of a region in $E_N$ then it is known as a tensor field. When tensor fields are combined together (i.e. through sum, difference or product) it is assumed that the fields refer to the same point in the space.

If V is a scalar field (i.e. invariant) then

\[
\bar{V}(\bar{x}_1, \ldots, \bar{x}_n) = V(x_1, \ldots, x_n)
\tag{395}
\]

where $\bar{V}$ is not necessarily the same function of the $\bar{x}_i$ that V is of the $x_i$. Consider the derivative. Using the chain rule

\[
\frac{\partial \bar{V}}{\partial \bar{x}_i} = \frac{\partial x_j}{\partial \bar{x}_i} \frac{\partial V}{\partial x_j} = a_{ij} \frac{\partial V}{\partial x_j}
\tag{396}
\]

where $x_j = a_{ij} \bar{x}_i$ (this is the inverse of $\bar{x}_i = a_{ij} x_j$). Letting $\partial_i V = \partial V / \partial x_i$ this can be written as

\[
\bar{\partial}_i \bar{V} = a_{ij} \, \partial_j V
\tag{397}
\]

Thus $\partial_i V$ is a vector. It is known as the gradient of V. $\partial_i$ are the components of the vector differential operator ∇ [18]. An alternative notation is

\[
V_{,i} = \partial_i V = \frac{\partial V}{\partial x_i}
\tag{398}
\]

The comma notation is used mainly in general tensors.

In general if $T_{i \ldots j \ldots k}$ is a tensor of rank r then $\partial_a T_{i \ldots j \ldots k}$ is a tensor of rank r + 1. The proof is similar to that given above with application of the chain rule introducing an extra coefficient $a_{ij}$ into the transformation law. Similarly the partial derivative of a tensor density of rank r with respect to $x_i$ is a tensor density of rank r + 1.

If $T_i$ is a vector field then $\partial_j T_i$ is a tensor of rank 2. This can be contracted to give a scalar $\partial_i T_i$ known as the divergence of $T_i$.

[17] On the Levi-Civita tensor density, see Appendix B ‘Odds and ends’ note 9.

15.8 Vectors in $E_3$

15.8.1 Dot product

The dot (or scalar) product of two vectors A and B is

\[
A \cdot B = A_i B_i
\tag{399}
\]

This inner product is a scalar. We can define the angle θ between two vectors as follows

\[
A \cdot B = A B \cos\theta
\tag{400}
\]

where $A = \sqrt{A_i A_i}$ and $B = \sqrt{B_i B_i}$ are the lengths of the vectors $A_i$ and $B_i$ respectively.

If A · B = 0 then the (non-zero) vectors are said to be orthogonal.

15.8.2 Cross product

Let $A_i$ and $B_i$ be two vectors. Then

\[
T_{jk} = A_j B_k - A_k B_j
\tag{401}
\]

is a skew-symmetric tensor of rank 2. In $E_3$ it has 3 non-zero independent components since $T_{11} = T_{22} = T_{33} = 0$ and $T_{12} = -T_{21}$, $T_{13} = -T_{31}$ and $T_{23} = -T_{32}$. It is natural to associate a vector with these 3 components and this can be done in the following way.

The cross (or vector) product of two vectors A and B is

\[
(A \times B)_i = e_{ijk} A_j B_k
\tag{402}
\]

This is a vector density. We can write out the components as

\[
\begin{aligned}
(A \times B)_1 &= A_2 B_3 - A_3 B_2 \\
(A \times B)_2 &= A_3 B_1 - A_1 B_3 \\
(A \times B)_3 &= A_1 B_2 - A_2 B_1
\end{aligned}
\tag{403}
\]

These agree with the usual definition of vector product. The presence of $e_{ijk}$ is suggestive of the determinantal definition of vector product. The vector product behaves like a vector for proper transformations but a minus sign is introduced for improper transformations (reflections). A vector like this is sometimes known as an axial vector or a pseudo-vector.

[18] On the operator ∇, see Appendix B ‘Odds and ends’ note 12.

15.8.3 Curl

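By analogy we can define the curl of a vector field $B_i$ as the vector density

\[
(\nabla \times B)_i = e_{ijk} \, \partial_j B_k
\tag{404}
\]

We can write out the components as

\[
\begin{aligned}
(\nabla \times B)_1 &= \partial_2 B_3 - \partial_3 B_2 \\
(\nabla \times B)_2 &= \partial_3 B_1 - \partial_1 B_3 \\
(\nabla \times B)_3 &= \partial_1 B_2 - \partial_2 B_1
\end{aligned}
\tag{405}
\]

These agree with the usual definition of curl.

15.8.4 Summary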

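In summary we have

\[
\begin{aligned}
\partial_i V &= (\nabla V)_i = (\operatorname{grad} V)_i \\
\partial_i A_i &= \nabla \cdot A = \operatorname{div} A \\
e_{ijk} \, \partial_j A_k &= (\nabla \times A)_i = (\operatorname{curl} A)_i \\
A_i B_i &= A \cdot B \\
e_{ijk} A_j B_k &= (A \times B)_i
\end{aligned}
\tag{406}
\]

15.8.5 Vector identities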

The following results allow us to prove vector identities in a straightforward way.

\[
\begin{aligned}
e_{ijk} e_{lmk} &= \delta_{il} \delta_{jm} - \delta_{im} \delta_{jl} \\
e_{ijk} e_{ljk} &= 2 \, \delta_{il} \\
e_{ijk} e_{ijk} &= 6
\end{aligned}
\tag{407}
\]

Care needs to be taken with $\partial_i$ which is a differential operator and cannot be moved about freely in expressions.

1. The scalar triple product A · B × C can be written as

\[
e_{ijk} A_i B_j C_k
\tag{408}
\]

It is clear from the property of the permutation symbol that the product is cyclic in the vectors, i.e. A · B × C can be written as B · C × A or C · A × B.

2. Consider the vector triple product A × (B × C). As this is a vector quantity

\[
\begin{aligned}
[A \times (B \times C)]_i &= e_{ijk} A_j (B \times C)_k \\
&= e_{ijk} A_j \, e_{klm} B_l C_m \\
&= e_{ijk} e_{lmk} \, A_j B_l C_m \\
&= (\delta_{il} \delta_{jm} - \delta_{im} \delta_{jl}) \, A_j B_l C_m \\
&= A_m B_i C_m - A_l B_l C_i \\
&= [(A \cdot C) B - (A \cdot B) C]_i
\end{aligned}
\tag{409}
\]

3. Consider ∇ × (f A) where f and A are scalar and vector fields respectively.

\[
\begin{aligned}
[\nabla \times (f A)]_i &= e_{ijk} \, \partial_j (f A_k) \\
&= e_{ijk} (\partial_j f) A_k + f \, e_{ijk} \, \partial_j A_k \\
&= [(\nabla f) \times A + f (\nabla \times A)]_i
\end{aligned}
\tag{410}
\]
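
The contraction identities (407) and the algebraic results in items 1 and 2 can be confirmed by brute-force index sums. The following sketch uses integer arithmetic so the comparisons are exact; the sample vectors and helper names are arbitrary illustrative choices.

```python
R3 = range(3)

def delta(i, j):
    return 1 if i == j else 0

def levi_civita(i, j, k):
    """Permutation symbol e_ijk for indices taking the values 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def dot(u, v):
    return sum(u[i] * v[i] for i in R3)

def cross(u, v):
    """(u x v)_i = e_ijk u_j v_k, eq. (402)."""
    return [sum(levi_civita(i, j, k) * u[j] * v[k] for j in R3 for k in R3) for i in R3]

# The three contractions of eq. (407)
first = all(sum(levi_civita(i, j, k) * levi_civita(l, m, k) for k in R3)
            == delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
            for i in R3 for j in R3 for l in R3 for m in R3)
second = all(sum(levi_civita(i, j, k) * levi_civita(l, j, k) for j in R3 for k in R3)
             == 2 * delta(i, l) for i in R3 for l in R3)
third = sum(levi_civita(i, j, k) ** 2 for i in R3 for j in R3 for k in R3)

A, B, C = (1, 2, 3), (-2, 0, 5), (4, -1, 1)      # arbitrary integer sample vectors

# Vector triple product, eq. (409)
lhs = cross(A, cross(B, C))
rhs = [dot(A, C) * B[i] - dot(A, B) * C[i] for i in R3]

# Cyclic property of the scalar triple product, eq. (408)
triple = (dot(A, cross(B, C)), dot(B, cross(C, A)), dot(C, cross(A, B)))

print(first, second, third)
print(lhs, rhs)
print(triple)
```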

15.9 Rigid body motion

An important application of Cartesian tensors is to the motion of a rigid body. This is closely related to motion in rotating frames of reference.

15.9.1 Space and body axes

Consider the motion of a rigid body with one point fixed. Then the position of any point in the body is given by the coordinates $x_i$ referred to a set of Cartesian axes fixed in space. We choose the origin of these axes to be the fixed point of the body. The squared distance $x_i x_i$ of any point in the body from this origin is invariant no matter how the body moves. This is the definition of the rigid body.

At time zero we define a set of Cartesian axes fixed in the body which coincides with the space axes. The position of any point in the rigid body is always the same in the body axes (again the definition of a rigid body). The coordinates of a point P in the rigid body referred to the body axes are $y_i$ with $y_i y_i$ invariant. The relationship between the coordinates must preserve the distance squared of each point from the origin and is

\[
y_i = a_{ij} x_j
\tag{411}
\]

where $a_{ij}$ are the elements of an orthogonal matrix. The motion of the body can therefore be described in terms of an orthogonal matrix which depends on time i.e. $a_{ij} = a_{ij}(t)$. We know that the orthogonal transformation has 3 independent parameters. The motion can then be described completely in terms of 3 time-dependent parameters. The orthogonal transformation is proper, det(A) = 1, since $a_{ij}$ must be continuous in time and takes the value $\delta_{ij}$ at time zero.

15.9.2 Euler’s theorem

Euler’s Theorem states that the general displacement of a rigid body with one point fixed is a rotation about some axis. We have shown that the general displacement of a rigid body with one point fixed corresponds to a proper orthogonal transformation. For any proper transformation A in $E_3$ we know that A has an eigenvalue of 1. Therefore there is a vector n such that An = n i.e. particles lying along this vector are unaffected by the displacement. In other words this is an axis of rotation.

Since $y_i$ is independent of time it follows that

\[
\dot{y}_i = 0 = a_{ij} \dot{x}_j + \dot{a}_{ij} x_j
\tag{412}
\]

where $\dot{x}$ denotes a derivative of x with respect to time. Take the inner product with $a_{ik}$ to obtain

\[
\begin{aligned}
\dot{x}_k &= -a_{ik} \dot{a}_{ij} x_j \\
&= -\Omega_{kj} x_j
\end{aligned}
\tag{413}
\]

where $\Omega_{kj}$ is a Cartesian tensor of rank 2. We can show that it is skew-symmetric

\[
\begin{aligned}
\Omega_{kj} + \Omega_{jk} &= a_{ik} \dot{a}_{ij} + a_{ij} \dot{a}_{ik} \\
&= \frac{d(a_{ik} a_{ij})}{dt} \\
&= \frac{d\delta_{kj}}{dt} \\
&= 0
\end{aligned}
\tag{414}
\]

Therefore it has 3 independent components and we can define the axial vector $\omega_i$ in the following way

\[
\omega_i = \tfrac{1}{2} \, e_{ijk} \Omega_{jk}
\tag{415}
\]

with

\[
\Omega_{ij} = e_{ijk} \omega_k
\tag{416}
\]

The elements of Ω are

\[
\Omega = \begin{pmatrix} 0 & \omega_3 & -\omega_2 \\ -\omega_3 & 0 & \omega_1 \\ \omega_2 & -\omega_1 & 0 \end{pmatrix}
\]

Therefore

\[
\begin{aligned}
\dot{x}_k &= -\Omega_{kj} x_j \\
&= -e_{kja} \omega_a x_j \\
&= (\omega \times r)_k
\end{aligned}
\tag{417}
\]

where ω is the angular velocity of the body. This can be written as

\[
v_S = \omega \times r
\tag{418}
\]

where $v_S$ is the velocity of the point in the space axes.

15.9.3 Rotating frames of reference

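Consider any vector T. It can be written as

\[
\begin{aligned}
T &= T_i e_i \\
&= t_i d_i
\end{aligned}
\tag{419}
\]

where $e_i$ are the fixed unit vectors of the space axes and $d_i$ are the unit vectors in the rotating body axes. Consider the rate of change of T

\[
\begin{aligned}
\left( \frac{dT}{dt} \right)_S &= \dot{T}_i e_i \\
&= \dot{t}_i d_i + t_i \dot{d}_i \\
&= \left( \frac{dT}{dt} \right)_B + t_i \dot{d}_i
\end{aligned}
\tag{420}
\]

since both $t_i$ and $d_i$ depend on time. We know that $t_i = a_{ij} T_j$ and $d_i = a_{ij} e_j$. Then

\[
\begin{aligned}
t_i \dot{d}_i &= a_{ij} T_j \, \dot{a}_{ik} e_k \\
&= \Omega_{jk} T_j e_k \\
&= e_{jki} \omega_i T_j e_k \\
&= e_k \, e_{kij} \omega_i T_j \\
&= \omega \times T
\end{aligned}
\tag{421}
\]

The final result is

\[
\left( \frac{dT}{dt} \right)_S = \left( \frac{dT}{dt} \right)_B + \omega \times T
\tag{422}
\]

For the case of a point in the rigid body we have $v_B = 0$ so that

\[
v_S = \omega \times r
\tag{423}
\]

as before.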
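
The relation between the time-dependent matrix $a_{ij}(t)$, the skew-symmetric tensor $\Omega_{kj} = a_{ik} \dot{a}_{ij}$ and the angular velocity can be illustrated with a simple assumed motion. In this sketch the body axes rotate about the z-axis with constant angular speed `W`, so that $a_{ij}$ has the form (372) with θ = Wt; the time derivatives are approximated by central differences. The value of `W`, the body point `y` and all function names are illustrative assumptions.

```python
import math

W = 0.5                      # assumed constant angular speed about the z-axis
R3 = range(3)

def levi_civita(i, j, k):
    """Permutation symbol e_ijk for indices taking the values 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def a_matrix(t):
    """a_ij(t) of the form (372) with theta = W t."""
    c, s = math.cos(W * t), math.sin(W * t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def a_dot(t, h=1e-6):
    """Central-difference approximation to the time derivative of a_ij."""
    plus, minus = a_matrix(t + h), a_matrix(t - h)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in R3] for i in R3]

def omega_tensor(t):
    """Omega_kj = a_ik (d a_ij / dt), as in eq. (413)."""
    a, ad = a_matrix(t), a_dot(t)
    return [[sum(a[i][k] * ad[i][j] for i in R3) for j in R3] for k in R3]

def omega_vector(t):
    """omega_i = (1/2) e_ijk Omega_jk, eq. (415)."""
    om = omega_tensor(t)
    return [0.5 * sum(levi_civita(i, j, k) * om[j][k] for j in R3 for k in R3) for i in R3]

def cross(u, v):
    return [sum(levi_civita(i, j, k) * u[j] * v[k] for j in R3 for k in R3) for i in R3]

y = (1.0, 2.0, 3.0)          # body coordinates of a point, fixed in time

def space_position(t):
    """x_i = a_ji y_j, the inverse of y_i = a_ij x_j."""
    a = a_matrix(t)
    return [sum(a[j][i] * y[j] for j in R3) for i in R3]

def space_velocity(t, h=1e-6):
    plus, minus = space_position(t + h), space_position(t - h)
    return [(plus[i] - minus[i]) / (2 * h) for i in R3]

t = 1.3
omega = omega_vector(t)
velocity = space_velocity(t)
expected = cross(omega, space_position(t))
print("omega     :", [round(w, 6) for w in omega])
print("velocity  :", [round(v, 6) for v in velocity])
print("omega x r :", [round(v, 6) for v in expected])
```

For this motion the angular velocity comes out along the z-axis with magnitude `W`, and the velocity of the body point in the space axes agrees with ω × r to within the finite-difference error.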

15.9.4 Coriolis and centrifugal forces

We can apply this to a body which is rotating uniformly such as the earth. Consider the motion of a particle with position r. Then

\[
v_S = v_B + \omega \times r
\tag{424}
\]

The acceleration is given by

\[
\begin{aligned}
a_S &= \left( \frac{dv_S}{dt} \right)_S \\
&= \left( \frac{dv_S}{dt} \right)_B + \omega \times v_S \\
&= a_B + 2 (\omega \times v_B) + \omega \times (\omega \times r)
\end{aligned}
\tag{425}
\]

Newton’s laws apply in the inertial frame which corresponds to the space axes. Therefore

\[
F = m a_S
\tag{426}
\]

Referred to the body axes this becomes

\[
F_{eff} = F - 2m (\omega \times v_B) - m \omega \times (\omega \times r) = m a_B
\tag{427}
\]

so that the motion of the frame relative to the inertial frame can be compensated for by introducing the fictitious forces known as the Coriolis force (which depends on the velocity of the particle) and the centrifugal force. Note that the centrifugal force acts along the component of r perpendicular to ω, directed away from the axis of rotation, so that it is an outward force.

15.9.5 Angular momentum of rigid body

The angular momentum l of a particle with position r and linear momentum p is given by r × p or in component form

\[
\begin{aligned}
l_i &= e_{ijk} x_j p_k \\
&= m \, e_{ijk} x_j \dot{x}_k
\end{aligned}
\tag{428}
\]

where m is the particle mass. If the particle belongs to a rigid body with one point fixed then

\[
\begin{aligned}
l_i &= -m \, e_{ijk} x_j \, \Omega_{ka} x_a \\
&= -m \, e_{ijk} x_j \, e_{kab} \omega_b x_a
\end{aligned}
\tag{429}
\]

Noting the inner product of the permutation symbols this reduces to

\[
\begin{aligned}
l_i &= m \, (x_a x_a \omega_i - x_a x_i \omega_a) \\
&= m \, (x_a x_a \delta_{ib} - x_i x_b) \, \omega_b \\
&= \iota_{ib} \, \omega_b
\end{aligned}
\tag{430}
\]

where $\iota_{ib}$ is a symmetric tensor of rank 2. The total angular momentum L of the rigid body is obtained by summing over all the particles contained in the body

\[
\begin{aligned}
L_i &= \sum l_i \\
&= I_{ij} \, \omega_j
\end{aligned}
\tag{431}
\]

where

\[
\begin{aligned}
I_{ij} &= \sum \iota_{ij} \\
&= \sum m \, (x_a x_a \delta_{ij} - x_i x_j)
\end{aligned}
\tag{432}
\]

is the inertia tensor, a symmetric tensor of rank 2. This is a tensor equation and holds for any set of axes. We can select the most convenient set e.g. axes in which the inertia tensor is diagonal and time-independent (a body set of principal axes). Since the inertia tensor is symmetric we can always find an orthogonal transformation (i.e. a set of coordinates) which will make it diagonal.

15.9.6 Kinetic energy of rigid body

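The kinetic energy t of a particle is given by

\[ t = \tfrac{1}{2}\, m\, \dot{x}_j \dot{x}_j \tag{433} \]

where m is the particle mass and $\dot{x}_j$ is the particle’s velocity. If the particle belongs to a rigid body with one point fixed then

\[ t = \tfrac{1}{2}\, m\, \Omega_{ja} x_a\, \Omega_{jb} x_b = \tfrac{1}{2}\, m\, e_{jac}\, \omega_c\, x_a\, e_{jbd}\, \omega_d\, x_b \tag{434} \]

Noting the inner product of the permutation symbols this reduces to

\[ t = \tfrac{1}{2}\, m\,(x_a x_a\, \omega_c \omega_c - x_a x_b\, \omega_a \omega_b) = \tfrac{1}{2}\, m\,(x_i x_i\, \delta_{ab} - x_a x_b)\, \omega_a \omega_b = \tfrac{1}{2}\, \iota_{ab}\, \omega_a \omega_b \tag{435} \]

The total kinetic energy T of the rigid body is obtained by summing over all the particles contained in the body

\[ T = \sum t = \tfrac{1}{2}\, I_{ij}\, \omega_i \omega_j \tag{436} \]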
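The results of eqs.(431), (432) and (436) can be checked numerically. The following is a minimal sketch in Python, assuming a rigid body made of a few point masses; the masses, positions and angular velocity are hypothetical illustration values, not data from these notes. It builds the inertia tensor from eq.(432) and compares $I_{ij}\omega_j$ and $\tfrac{1}{2} I_{ij}\omega_i\omega_j$ with the particle-by-particle sums.

```python
# Hypothetical point masses (mass, position) and angular velocity; illustration only.
particles = [
    (1.0, (1.0, 0.0, 0.0)),
    (2.0, (0.0, 1.0, 1.0)),
    (1.5, (1.0, 1.0, 0.0)),
]
omega = (0.3, -1.2, 2.0)


def cross(a, b):
    """Vector product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def inertia_tensor(particles):
    """I_ij = sum of m (x_a x_a delta_ij - x_i x_j), eq.(432)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, x in particles:
        r2 = sum(c * c for c in x)  # x_a x_a
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                I[i][j] += m * (r2 * delta - x[i] * x[j])
    return I


def angular_momentum(I, omega):
    """L_i = I_ij omega_j, eq.(431)."""
    return [sum(I[i][j] * omega[j] for j in range(3)) for i in range(3)]


def kinetic_energy(I, omega):
    """T = (1/2) I_ij omega_i omega_j, eq.(436)."""
    return 0.5 * sum(I[i][j] * omega[i] * omega[j]
                     for i in range(3) for j in range(3))


def direct_sums(particles, omega):
    """Particle-by-particle L = sum m x x (omega x x) and T = sum (1/2) m |omega x x|^2."""
    L = [0.0, 0.0, 0.0]
    T = 0.0
    for m, x in particles:
        v = cross(omega, x)  # velocity of a point of the rotating body
        l = cross(x, v)
        for i in range(3):
            L[i] += m * l[i]
        T += 0.5 * m * sum(c * c for c in v)
    return L, T


I = inertia_tensor(particles)
L = angular_momentum(I, omega)
T = kinetic_energy(I, omega)
L_direct, T_direct = direct_sums(particles, omega)
```

The tensor route and the direct sums should agree to rounding error, which is the content of eqs.(430) and (435).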

16 Special relativity

16.1 Introduction

The laws of mechanics due to Newton [19] were accepted without reservation for more than two centuries. All observations agreed with the laws. Deviations from Newtonian mechanics only occur for particles which approach the speed of light. Before 1900 there was no reason to doubt Newtonian mechanics. It was only at this time that β-rays moving at speeds close to that of light were first detected.

Newton defined an infinite set of inertial frames in which the laws hold. He assumed absolute space and time and that the geometry of space is Euclidean. Newtonian relativity relates these inertial frames. The laws of mechanics are invariant under Galilean transformations.

Electromagnetic theory is based on the equations due to Maxwell [20]. Using these equations, Maxwell predicted that electromagnetic waves travel through space at the same speed as light. He asserted that light was an electromagnetic phenomenon. The ether was the medium for light waves and this was identified with Newton’s absolute space. However Maxwell’s equations are not invariant under Galilean transformations. They are invariant under Lorentz transformations, as discovered by Lorentz [21] before Einstein published his special theory of relativity.

The Michelson-Morley experiment (1887) failed to detect movement of the earth through the ether. Since it cannot be detected, the ether is taken not to exist.

Einstein [22] published his special theory of relativity in 1905. This theory is founded on the Special Principle of Relativity (the laws of physics are the same in all inertial frames) and the light postulate (the speed of light in a vacuum is the same in all inertial frames). The light postulate is an experimental observation. In special relativity coordinates (both space and time) should be transformed using the Lorentz transformation and the laws of physics are invariant under Lorentz transformations. A modification of Newton’s laws of mechanics was required. This in effect unifies mechanics and electromagnetism.

Einstein’s motivation was not to fix up classical mechanics (physicists did not believe it was wrong) but rather a desire to treat all physical phenomena in the same way. Modern accelerators produce particles whose behaviour agrees with special relativity but differs radically from Newtonian predictions. An important consequence of the change in the laws of mechanics was the equivalence of energy and mass. A further consequence of special relativity is that time is no longer absolute.

Minkowski [23] formulated special relativity in terms of a 4-dimensional continuum. This 4-dimensional space represents space-time and clearly shows the relationship between space and time. These ideas were crucial to Einstein’s general theory of relativity in 1916.

[19] See Appendix B ‘Odds and ends’ note 13.
[20] See Appendix C for biography.
[21] See Appendix B ‘Odds and ends’ note 14.
[22] See Appendix D for biography.
[23] See Appendix B ‘Odds and ends’ note 15.

With a suitable choice of coordinates the Lorentz transformation in Minkowski’s space-time becomes an orthogonal transformation in a Euclidean 4-dimensional space. The equations of physics must be invariant with respect to Lorentz transformations and it is natural to use a Cartesian tensor formulation.

The development of quantum mechanics in 1926 was prompted by de Broglie’s [24] ideas of wave/particle duality in 1923. Special relativity provided the essential framework for these ideas.

It should not be forgotten that special relativity is a theory. It has stood up to experimental testing so far but it is likely that it will be modified in the future. Every theory is only a model for some part of nature. A good theory has the following characteristics: simplicity or beauty, internal consistency, compatibility with other scientific ideas and good agreement with experiment. Another is the possibility of experimental disproof.

16.2 Newtonian mechanics

Newton’s three laws of mechanics are:

1. Free particles move with constant velocity.

2. The force F on a particle is proportional to its acceleration f

\[ \mathbf{F} = m\,\mathbf{f} \tag{437} \]

where m is the inertial mass, a measure of the particle’s resistance to acceleration. This is more than a definition of force since we have other laws such as Coulomb’s law, Hooke’s Law etc. that define force.

3. The forces of action and reaction are equal in magnitude and opposite in direction.

These laws apply in inertial frames. A rigid frame is inertial if free particles move without acceleration relative to it, i.e. the first law is a test for an inertial frame. The geometry of space is assumed to be Euclidean. Time is assumed to be absolute.

16.3 Galilean transformation

The Galilean transformation relates inertial frames. If $S$ (coordinates $\mathbf{r}$, time $t$) is an inertial frame and another frame $\bar{S}$ (coordinates $\bar{\mathbf{r}}$, time $\bar{t}$) moves with uniform velocity $\mathbf{u}$ relative to $S$ then the Galilean transformation is:

\[ \bar{\mathbf{r}} = \mathbf{r} - \mathbf{u}\,t, \qquad \bar{t} = t \tag{438} \]

[24] See Appendix B ‘Odds and ends’ note 16.

It follows then that velocity v and acceleration f transform as

\[ \bar{\mathbf{v}} = \mathbf{v} - \mathbf{u}, \qquad \bar{\mathbf{f}} = \mathbf{f} \tag{439} \]

and Newton’s 2nd law is unchanged in frame $\bar{S}$. The laws are invariant under Galilean transformations. Any frame in uniform relative motion to this frame is also an inertial frame. Newton’s laws cannot be used to distinguish between these frames.

16.4 Principle of special relativity

Special Principle of Relativity: The laws of physics are the same in all inertial frames.

Light Postulate: The speed of light in a vacuum is the same in all inertial frames.

16.5 Lorentz transformation

The light postulate enables us to identify the Lorentz transformation as the correct way to transform between inertial frames. $S$ and $\bar{S}$ are inertial frames. $\bar{S}$ moves uniformly with speed u in the x-direction relative to $S$. Note that the axes in the two frames remain parallel. $(x, y, z)$ are Cartesian coordinates measured in $S$ and $t$ is the time measured by a clock at rest in $S$. Similarly $(\bar{x}, \bar{y}, \bar{z})$ and $\bar{t}$ are the space and time coordinates measured in $\bar{S}$. We do not assume that time is absolute, i.e. that $\bar{t} = t$. At time $t = \bar{t} = 0$ the origins coincide (i.e. the axes coincide). At this time a light pulse is emitted at the common origin. Observers at the origin of the two frames both see a spherical wave spreading out with speed c. The equations for the wave-fronts are:

\[ \begin{aligned} s^2 &= -c^2 t^2 + x^2 + y^2 + z^2 = 0 && \text{in } S \\ \bar{s}^2 &= -c^2 \bar{t}^2 + \bar{x}^2 + \bar{y}^2 + \bar{z}^2 = 0 && \text{in } \bar{S} \end{aligned} \tag{440} \]

We want to find a transformation between $(t, x, y, z)$ and $(\bar{t}, \bar{x}, \bar{y}, \bar{z})$ such that $s^2 = 0$ implies $\bar{s}^2 = 0$. The transformation is linear. This follows (non-trivially) from the definition of inertial frames. Only under a linear transformation can the linear equations of motion of free particles in $S$ go over into linear equations of motion in $\bar{S}$. Linearity implies that $\bar{s}^2 = k s^2$ where k is independent of the coordinates. By symmetry $s^2 = k \bar{s}^2 = k^2 s^2$ and so $k^2 = 1$ giving $k = 1$. You cannot have $k = -1$ because as $u \to 0$ we must recover $\bar{s}^2 = s^2$. We are therefore looking for a transformation which preserves $s^2$.

\[ \bar{s}^2 = s^2 \tag{441} \]


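By symmetry $\bar{y} = y$ and $\bar{z} = z$ so that

\[ -c^2 t^2 + x^2 = -c^2 \bar{t}^2 + \bar{x}^2 \tag{442} \]

Assume a linear relationship between $(x, t)$ and $(\bar{x}, \bar{t})$.

\[ \begin{aligned} \bar{x} &= \gamma\,(x - ut) && (\bar{x} = 0 \text{ when } x = ut) \\ x &= \bar{\gamma}\,(\bar{x} + u\bar{t}) && (x = 0 \text{ when } \bar{x} = -u\bar{t}) \end{aligned} \tag{443} \]

We can eliminate $\bar{x}$ to obtain

\[ \bar{t} = \gamma \left[ t - \frac{x}{u} \left( 1 - \frac{1}{\gamma\bar{\gamma}} \right) \right] \tag{444} \]

which is a function of x and t. Therefore

\[ -c^2 t^2 + x^2 = -c^2 \gamma^2 \left[ t - \frac{x}{u} \left( 1 - \frac{1}{\gamma\bar{\gamma}} \right) \right]^2 + \gamma^2 (x - ut)^2 \tag{445} \]

This holds for all values of x and t. Therefore we can equate the coefficients of $x^2$, $t^2$ and $2xt$ to give:

\[ \begin{aligned} 1 &= \gamma^2 - \frac{c^2 \gamma^2}{u^2} \left( 1 - \frac{1}{\gamma\bar{\gamma}} \right)^2 \\ -c^2 &= \gamma^2 (u^2 - c^2) \\ 0 &= -\gamma^2 u + \frac{c^2 \gamma^2}{u} \left( 1 - \frac{1}{\gamma\bar{\gamma}} \right) \end{aligned} \tag{446} \]

From these equations it follows that

\[ \bar{\gamma} = \gamma = \frac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}} \tag{447} \]

The Lorentz transformation is then:

\[ \begin{aligned} \bar{t} &= \gamma \left( t - \frac{ux}{c^2} \right) \\ \bar{x} &= \gamma\,(x - ut) \\ \bar{y} &= y \\ \bar{z} &= z \end{aligned} \tag{448} \]

The inverse transformation is

\[ \begin{aligned} t &= \gamma \left( \bar{t} + \frac{u\bar{x}}{c^2} \right) \\ x &= \gamma\,(\bar{x} + u\bar{t}) \\ y &= \bar{y} \\ z &= \bar{z} \end{aligned} \tag{449} \]

The inverse transformation can be obtained by

• reversing the direction of relative motion, $u \to -u$
• interchanging barred and unbarred coordinates.

The observer at rest in $\bar{S}$ appears to the observer at rest in $S$ to be moving with speed u in the positive x-direction. From the viewpoint of the observer at rest in $\bar{S}$ the observer at rest in $S$ is moving with speed u in the negative x-direction. Thus the inverse transformation is consistent.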
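The transformation eq.(448) and its inverse eq.(449) can be checked numerically. The following is a minimal Python sketch; the function names and the sample event are hypothetical choices made for illustration. It applies the boost, applies the inverse (the same formula with u replaced by −u) and confirms that the quantity −c²t² + x² of eq.(442) is preserved.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(u):
    """Lorentz factor of eq.(447) for a relative speed u (|u| < C)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def lorentz(t, x, u):
    """Eq.(448): coordinates (t, x) in S mapped to (t_bar, x_bar) in S-bar."""
    g = gamma(u)
    return g * (t - u * x / C ** 2), g * (x - u * t)


def lorentz_inverse(t_bar, x_bar, u):
    """Eq.(449): obtained from eq.(448) by reversing the relative motion, u -> -u."""
    return lorentz(t_bar, x_bar, -u)


def interval(t, x):
    """The invariant combination -c^2 t^2 + x^2 of eq.(442)."""
    return -(C * t) ** 2 + x ** 2


# Hypothetical event and frame speed, chosen only for illustration.
u = 0.6 * C
t, x = 1.0e-6, 120.0
t_bar, x_bar = lorentz(t, x, u)
t_back, x_back = lorentz_inverse(t_bar, x_bar, u)
```

Applying `lorentz` followed by `lorentz_inverse` should return the original event to rounding error, and `interval` should take the same value in both frames.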

16.6 Lorentz factor

Consider

\[ \gamma = \frac{1}{\sqrt{1 - \dfrac{u^2}{c^2}}} \tag{450} \]

When u = 0 then γ = 1. When u > 0 then γ > 1. Clearly u cannot be greater than c or we would obtain imaginary values for γ, i.e. this Lorentz transformation is undefined. The speed of light is given by $c = 2.99792458 \times 10^{8}\ \mathrm{m\,s^{-1}}$.

u/c       γ
0.0       1.000
0.1       1.005
0.2       1.021
0.5       1.155
0.7       1.400
0.866     2.000
0.9       2.294
0.95      3.203
0.99      7.089
0.9999    70.712
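The table entries follow directly from eq.(450). A short Python sketch, with hypothetical function names, that evaluates the Lorentz factor and prints the same list of speeds:

```python
import math


def lorentz_factor(beta):
    """gamma = 1 / sqrt(1 - beta^2), eq.(450), with beta = u/c."""
    if abs(beta) >= 1.0:
        raise ValueError("the Lorentz factor is only defined for |u| < c")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def beta_from_gamma(g):
    """Speed ratio u/c recovered from gamma by rearranging eq.(450)."""
    return math.sqrt((g * g - 1.0) / (g * g))


for beta in (0.0, 0.1, 0.2, 0.5, 0.7, 0.866, 0.9, 0.95, 0.99, 0.9999):
    print(f"{beta:<8} {lorentz_factor(beta):.3f}")
```

The guard on `abs(beta) >= 1` reflects the remark above that the transformation is undefined once u reaches c.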

When $u \ll c$ we can take γ = 1 and then the Lorentz transformation becomes

\[ \begin{aligned} \bar{t} &= t \\ \bar{x} &= x - ut \\ \bar{y} &= y \\ \bar{z} &= z \end{aligned} \tag{451} \]

which is the Galilean transformation eq.(438). These are the transformation equations with which we are familiar. A speed of 1000 miles/hour ($447\ \mathrm{m\,s^{-1}}$) corresponds to $u/c \approx 10^{-6}$. Rearranging the Lorentz factor gives

\[ \left( \frac{u}{c} \right)^2 = \frac{\gamma^2 - 1}{\gamma^2} \tag{452} \]

16.7 Vector form of Lorentz transformation

We have considered only a Lorentz transformation corresponding to speed u in the x-direction. Using this we can obtain an expression for a Lorentz transformation corresponding to the general case of a relative velocity $\mathbf{u}$. Let

\[ \begin{aligned} \mathbf{r} &= x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k} \\ \bar{\mathbf{r}} &= \bar{x}\,\mathbf{i} + \bar{y}\,\mathbf{j} + \bar{z}\,\mathbf{k} \end{aligned} \tag{453} \]

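We need to sort out the components of $\mathbf{r}$ and $\bar{\mathbf{r}}$ parallel and perpendicular to $\mathbf{u}$. Then since $x = \mathbf{r} \cdot \mathbf{i}$ the x-equation becomes

\[ \bar{\mathbf{r}} \cdot \mathbf{i} = \gamma\,(\mathbf{r} \cdot \mathbf{i} - ut) \tag{454} \]

We know that the y and z components are unchanged by the Lorentz transformation and this is expressed by the following vector equation.

\[ \bar{\mathbf{r}} - (\bar{\mathbf{r}} \cdot \mathbf{i})\,\mathbf{i} = \mathbf{r} - (\mathbf{r} \cdot \mathbf{i})\,\mathbf{i} \tag{455} \]

Also the equation for time can be written as

\[ \bar{t} = \gamma \left( t - \frac{u}{c^2}\,(\mathbf{r} \cdot \mathbf{i}) \right) \tag{456} \]

Now let $\mathbf{u} = u\,\mathbf{i}$ to obtain the general expression (by substituting the x-equation into the vector equation for y and z)

\[ \begin{aligned} \bar{t} &= \gamma \left( t - \frac{\mathbf{u} \cdot \mathbf{r}}{c^2} \right) \\ \bar{\mathbf{r}} &= \mathbf{r} + \left[ \frac{\gamma - 1}{u^2}\,(\mathbf{u} \cdot \mathbf{r}) - \gamma t \right] \mathbf{u} \end{aligned} \tag{457} \]

In γ, see eq.(450), we use $u = \sqrt{u_x^2 + u_y^2 + u_z^2}$.

The non-relativistic limit is obtained when speeds are small compared to c, i.e. $u \ll c$. In this limit it is clear that the Galilean transformation eq.(438) is recovered.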
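The vector form of the transformation can be exercised in the same way as the x-direction case. The sketch below is an illustration in Python with hypothetical names, working in units where c = 1 so that the velocity is given as u/c. It implements the two lines of eq.(457) and checks that they reduce to eq.(448) when the velocity lies along x, and that −c²t² + r·r is unchanged for an arbitrary direction.

```python
import math


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def boost_event(t, r, u):
    """Eq.(457) in units with c = 1: returns (t_bar, r_bar) for relative velocity u."""
    u2 = dot(u, u)
    if u2 == 0.0:
        return t, tuple(r)  # no relative motion: identity transformation
    g = 1.0 / math.sqrt(1.0 - u2)
    ur = dot(u, r)
    t_bar = g * (t - ur)
    k = (g - 1.0) * ur / u2 - g * t  # coefficient of u in eq.(457)
    r_bar = tuple(ri + k * ui for ri, ui in zip(r, u))
    return t_bar, r_bar


def interval(t, r):
    """-t^2 + r.r, the invariant s^2 in units with c = 1."""
    return -t * t + dot(r, r)


# Hypothetical event and velocities, for illustration only.
t, r = 2.0, (1.0, -0.5, 3.0)
t_x, r_x = boost_event(t, r, (0.6, 0.0, 0.0))   # velocity along x
t_g, r_g = boost_event(t, r, (0.3, -0.4, 0.5))  # general direction
```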

16.8 Transformation of time- and space-intervals

Since the Lorentz transformation is linear, it follows that it also holds for time- and space-intervals, and also for differentials. Therefore if $S$ and $\bar{S}$ are inertial frames such that $\bar{S}$ moves relative to $S$ with speed u in the x-direction then time- and space-intervals transform according to the Lorentz transformation

\[ \begin{aligned} \Delta\bar{t} &= \gamma \left( \Delta t - \frac{u}{c^2}\,\Delta x \right) \\ \Delta\bar{x} &= \gamma\,(\Delta x - u\,\Delta t) \\ \Delta\bar{y} &= \Delta y \\ \Delta\bar{z} &= \Delta z \end{aligned} \tag{458} \]

where $\gamma = 1/\sqrt{1 - u^2/c^2}$.

We define the proper time-interval, Δτ, to be the time-interval measured in the rest frame of the clock.

16.9 Velocity transformation

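Consider the motion of a particle with position $(x, y, z)$, functions of $t$, as measured by observer $S$. Another observer $\bar{S}$ measures position $(\bar{x}, \bar{y}, \bar{z})$, functions of $\bar{t}$. $\bar{S}$ moves with speed u relative to $S$ in the x-direction. The x-component of velocity as measured by $S$ is $v_x$ given by

\[ v_x = \frac{dx}{dt} \tag{459} \]

while for $\bar{S}$ it is $\bar{v}_x$ given by

\[ \bar{v}_x = \frac{d\bar{x}}{d\bar{t}} \tag{460} \]

Therefore

\[ \bar{v}_x = \frac{d\bar{x}}{d\bar{t}} = \frac{d\bar{x}/dt}{d\bar{t}/dt} = \frac{\gamma\,(v_x - u)}{\gamma \left( 1 - \dfrac{u v_x}{c^2} \right)} = \frac{v_x - u}{1 - \dfrac{u v_x}{c^2}} \tag{461} \]

Similarly in the transverse directions

\[ \bar{v}_y = \frac{v_y}{\gamma \left( 1 - \dfrac{u v_x}{c^2} \right)} \qquad \text{and} \qquad \bar{v}_z = \frac{v_z}{\gamma \left( 1 - \dfrac{u v_x}{c^2} \right)} \tag{462} \]

The inverse transformation

\[ v_x = \frac{\bar{v}_x + u}{1 + \dfrac{u \bar{v}_x}{c^2}} \tag{463} \]

is the famous ‘addition of velocities’ formula. The speed of the particle relative to $S$ is the ‘sum’ of the speed of $\bar{S}$ relative to $S$ and the speed of the particle relative to $\bar{S}$. The non-relativistic limit is the familiar

\[ v_x = \bar{v}_x + u \tag{464} \]

The relativistic formula ensures that $v_x$ can never exceed c.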
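The velocity formulae eqs.(461) to (463) are easy to experiment with. The following Python sketch uses hypothetical function names and units where c = 1; it is an illustration rather than part of the derivation. It shows that the inverse formula undoes the transformation and that a light signal keeps speed c in the moving frame, even when it travels in a transverse direction.

```python
import math


def transform_velocity(v, u):
    """Eqs.(461), (462) in units with c = 1: velocity v seen in S, returned as seen in S-bar."""
    vx, vy, vz = v
    g = 1.0 / math.sqrt(1.0 - u * u)
    d = 1.0 - u * vx
    return ((vx - u) / d, vy / (g * d), vz / (g * d))


def add_velocities(vbar_x, u):
    """Eq.(463): x-velocity in S from the x-velocity in S-bar and the frame speed u."""
    return (vbar_x + u) / (1.0 + u * vbar_x)


def speed(v):
    return math.sqrt(sum(c * c for c in v))


# Hypothetical sample values.
v_bar = transform_velocity((0.6, 0.0, 0.0), 0.5)
photon_bar = transform_velocity((0.0, 1.0, 0.0), 0.5)  # light moving along y in S
```

Adding 0.9c to 0.9c with `add_velocities` gives a result below c, in line with the closing remark of this section.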

16.10 Time dilation

If $S$ and $\bar{S}$ are inertial frames such that $\bar{S}$ moves relative to $S$ with speed u in the x-direction then time- and space-intervals transform according to the Lorentz transformation

\[ \begin{aligned} \Delta\bar{t} &= \gamma \left( \Delta t - \frac{u}{c^2}\,\Delta x \right) \\ \Delta\bar{x} &= \gamma\,(\Delta x - u\,\Delta t) \end{aligned} \tag{465} \]

where $\gamma = 1/\sqrt{1 - u^2/c^2}$. The inverse is

\[ \begin{aligned} \Delta t &= \gamma \left( \Delta\bar{t} + \frac{u}{c^2}\,\Delta\bar{x} \right) \\ \Delta x &= \gamma\,(\Delta\bar{x} + u\,\Delta\bar{t}) \end{aligned} \tag{466} \]

We now suppose that we are measuring time-intervals on a clock that is at rest in $\bar{S}$. Then $\Delta\bar{x} = 0$ and the time interval $\Delta\bar{t}$ is the proper time, denoted Δτ. Therefore we obtain from the above equations

\[ \Delta t = \gamma\,\Delta\tau \qquad \text{and} \qquad \Delta x = u\,\gamma\,\Delta\tau \tag{467} \]

From the first equation we have

\[ \Delta\tau = \sqrt{1 - \frac{u^2}{c^2}}\;\Delta t \le \Delta t \tag{468} \]

A clock moving uniformly with speed u through an inertial frame goes slow by a factor $\sqrt{1 - u^2/c^2}$ relative to a stationary clock in the frame. This effect is known as time dilation.

Time dilation is a real effect. Two examples follow.

1. An example of time dilation is the decay of muons. A muon is a lepton, the same family of particles as the electron. Like an electron it has spin 1/2 and charge −1 but it is about 207 times heavier than an electron and is unstable with a proper life-time of $T = 2 \times 10^{-6}$ sec. This is the life-time measured for muons at rest in the laboratory. Muons are observed in cosmic radiation at the surface of the Earth. They are formed in the upper atmosphere when cosmic radiation (such as the solar wind) scatters from the atoms in the atmosphere. Their speeds can approach the speed of light. The decay depends on probability (through quantum theory) and the life-time is an average time such that for times much greater than this most of the particles will have decayed. According to non-relativistic theory they should travel about 600 m ($cT = 3 \times 10^{8} \times 2 \times 10^{-6}$) before decaying. However they are observed to travel 10,000 m, which indicates a life-time of more than 10 times the proper life-time. This effect is due to time dilation: moving processes run slower.

2. The Global Positioning System (GPS) depends on synchronisation between the clocks on a satellite and at the surface of the Earth. One of the major corrections is due to time dilation resulting from the speed of the satellite. GPS is discussed in Appendix G.
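The numbers in the muon example can be turned into an estimate of the Lorentz factor involved. The sketch below is a rough Python illustration using the rounded values quoted above (c = 3 × 10^8 m/s, T = 2 × 10^-6 s, distance 10,000 m); the function names are hypothetical. From eq.(467) the distance travelled in the Earth frame during one proper life-time is Δx = uγT, so the product (u/c)γ equals Δx/(cT), from which u/c and γ follow.

```python
import math

C = 3.0e8           # speed of light in m/s, rounded as in the text
T_PROPER = 2.0e-6   # proper life-time of the muon in s, as quoted in the text


def beta_gamma_for_distance(distance):
    """(u/c) * gamma needed to cover `distance` in one proper life-time, from eq.(467)."""
    return distance / (C * T_PROPER)


def gamma_from_beta_gamma(bg):
    """gamma = sqrt(1 + (beta*gamma)^2), which follows from gamma^2 (1 - beta^2) = 1."""
    return math.sqrt(1.0 + bg * bg)


def beta_from_beta_gamma(bg):
    """u/c recovered from the product beta*gamma."""
    return bg / math.sqrt(1.0 + bg * bg)


bg = beta_gamma_for_distance(10_000.0)
g = gamma_from_beta_gamma(bg)
beta = beta_from_beta_gamma(bg)
dilated_lifetime = g * T_PROPER  # life-time seen in the Earth frame, eq.(467)
```

With these inputs the required γ comes out well above 10, consistent with the statement that the observed life-time is more than 10 times the proper life-time.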

16.11 Causality

We have seen that c provides an upper limit to the speed of a particle or in general to the transfer of information between two events. An event is a happening at a particular position and time, so it is characterised by the four coordinates (x, y, z, t). We will now consider cause and effect, or causality. This could be problematic because time is no longer absolute. In $S$ a signal sent by P arrives at Q after time Δt > 0. If we can find a Lorentz transformation to a frame $\bar{S}$ such that $\Delta\bar{t} < 0$ then in this frame the message is received at Q before P sends it. This allows the possibility of Q then sending a message to P. It would be possible for P to receive a message saying that he has sent a message to Q before he has sent it. P would know his future. This is contradictory.

\[ \Delta\bar{t} = \gamma \left( \Delta t - \frac{u\,\Delta x}{c^2} \right) = \gamma\,\Delta t \left( 1 - \frac{uU}{c^2} \right) \tag{469} \]

where $U = \Delta x / \Delta t$ is the speed of the message. Since the speed U cannot exceed c and the frame speed u is less than c, we have $uU/c^2 < 1$, so that when Δt > 0 we must always have $\Delta\bar{t} > 0$. In order to preserve causality we must have c as the upper speed limit.

16.12 Lorentz-Fitzgerald contraction

The Lorentz-Fitzgerald [25] contraction was originally introduced to explain the null result of the Michelson-Morley experiment. Consider a rod at rest in a frame $\bar{S}$. The length of the rod is $L_0$. This is known as the proper length. ‘Proper’ is often used to describe a property that is measured in the rest frame. We want to determine the length L as observed in a frame $S$ that is moving relative to $\bar{S}$. If $S$ and $\bar{S}$ are inertial frames such that $\bar{S}$ moves relative to $S$ with speed u in the x-direction, then time- and space-intervals transform according to the Lorentz transformation

\[ \begin{aligned} \Delta\bar{t} &= \gamma \left( \Delta t - \frac{u}{c^2}\,\Delta x \right) \\ \Delta\bar{x} &= \gamma\,(\Delta x - u\,\Delta t) \end{aligned} \tag{470} \]

where $\gamma = 1/\sqrt{1 - u^2/c^2}$.

Suppose the rod is lying on the x-axis. Since $\Delta\bar{x} = L_0$ and $\Delta x = L$ when $\Delta t = 0$ we have

\[ L_0 = \gamma L \tag{471} \]

or

\[ L = \sqrt{1 - \frac{u^2}{c^2}}\; L_0 \le L_0 \tag{472} \]

Therefore the moving rod appears contracted. Note the rod is moving in frame $S$. If you are measuring its length in this frame then it is only possible if the positions of the end-points $x_1$ and $x_2$ are measured at the same time $t = t_1 = t_2$. It follows that $\Delta t = t_2 - t_1 = 0$. What about $\Delta\bar{t}$? This does not matter. The rod is at rest in $\bar{S}$ so that the positions of the end-points can be measured at any times.

[25] See Appendix B ‘Odds and ends’ note 17.

16.13 Length paradox

This is an example of one of the many paradoxes that have been discussed in relativity. How do you get a 20 m pole into a 10 m shed? If the pole moves at u = 0.866c then the Fitzgerald contraction is by a factor 2. It will then fit into the shed. It hits the concrete wall and is brought to rest. This is a transformation in space-time and the pole attempts to resume its original length.

Consider the symmetric situation in which the pole is at rest and the shed moves with speed u = 0.866c. Relativity states that the pole should again fit into the shed. It should not matter which frame of reference you use. The paradox is that now the length of the shed is 5 m while the pole is 20 m.

The paradox is resolved by considering the transfer of information. When the end of the pole hits the concrete wall, the shed keeps moving and this end of the pole must go with it. A shock wave is set up in the pole. However the other end of the pole does not know that this has happened until the information reaches it. Suppose the information (i.e. shock wave) travels at the speed of light. It cannot travel faster. Then the time taken to reach the end of the pole is 20/c. The front of the shed moves with speed u. Since the shed is 5 m long, its front is 15 m from the far end of the pole and will reach it in a time 15/u. There is equality if u/c = 15/20 = 0.75. But the shed is moving faster than this. Therefore in this case the pole will more than fit into the shed and the paradox is resolved. But what happens if you take the wall away?

17 Minkowski space-time

17.1 Line-element

The tensor formulation of special relativity follows from the invariance of

\[ s^2 = -(ct)^2 + x^2 + y^2 + z^2 \tag{473} \]

In terms of differentials

\[ (ds)^2 = -(c\,dt)^2 + (dx)^2 + (dy)^2 + (dz)^2 \tag{474} \]

and we recognise this as a line-element similar to Cartesian coordinates in a Euclidean space. However there is a difference in the sign of the first term on the right-hand side. Such a space is called pseudo-Euclidean. Another difference is that the line-element can be negative, zero or positive. This is not a positive-definite metric.

17.2 4-position

Minkowski in 1908 introduced the concept of a 4-dimensional space-time. The coordinates in this space-time are $x^\mu$ with (µ = 0, 1, 2, 3), known as the 4-position.

\[ x^0 = ct, \qquad x^1 = x, \qquad x^2 = y, \qquad x^3 = z \tag{475} \]

\[ x^\mu = (ct, \mathbf{r}) \tag{476} \]

Greek letters are often used as indices when referring to tensors in 4 dimensions.

17.3 Metric tensor

The metric tensor [26] is

\[ (\eta_{\mu\nu}) = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{477} \]

Note the distribution of signs, known as the signature, in the metric tensor. This is not the same as the Kronecker delta of Cartesian coordinates in a Euclidean space. We use $\eta_{\mu\nu}$ rather than $g_{\mu\nu}$ because of the special form that it takes in special relativity. $\eta_{\mu\nu}$ is called the Lorentz or flat space-time metric. $g_{\mu\nu}$ is used in general relativity, the g referring to gravity of course. $\eta_{\mu\nu}$ is the covariant metric. Its inverse, which is identical to it, is written as $\eta^{\mu\nu}$. This is the contravariant metric. The line-element can be written as

\[ (ds)^2 = \eta_{\mu\nu}\, dx^\mu\, dx^\nu \tag{478} \]

[26] In some textbooks the distribution of signs is (1, −1, −1, −1).

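17.4 Coordinate transformations

We allow only coordinate transformations that preserve the metric.

\[ x^\mu \longrightarrow \bar{x}^\mu \quad \Rightarrow \quad \bar{\eta}_{\mu\nu} = \eta_{\mu\nu} \tag{479} \]

In other words the form of the line-element is preserved. This resembles the case of Cartesian tensors in which the required transformation is orthogonal. It is different due to the signature of the metric. Consider the following linear, homogeneous transformation of coordinates where the coefficients are independent of the coordinates

\[ \bar{x}^\mu = \Lambda^{\mu}{}_{\nu}\, x^\nu \qquad \text{and} \qquad x^\mu = \Lambda_{\nu}{}^{\mu}\, \bar{x}^\nu \tag{480} \]

where $(\Lambda_{\nu}{}^{\mu})$ is the inverse of $(\Lambda^{\mu}{}_{\nu})$

\[ \Lambda^{\mu}{}_{\alpha}\, \Lambda_{\nu}{}^{\alpha} = \Lambda_{\alpha}{}^{\mu}\, \Lambda^{\alpha}{}_{\nu} = \delta^{\mu}_{\nu} \tag{481} \]

The quantities $\Lambda^{\mu}{}_{\nu}$ and $\Lambda_{\nu}{}^{\mu}$ are different and you must be careful to distinguish between them by off-setting the indices.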

17.5 Transformation of tensors

A contravariant 4-vector transforms as follows

\[ \bar{A}^\mu = \Lambda^{\mu}{}_{\nu}\, A^\nu \tag{482} \]

A covariant 4-vector transforms as follows

\[ \bar{A}_\mu = \Lambda_{\mu}{}^{\nu}\, A_\nu \tag{483} \]

The µ = 0 component of a 4-vector is known as the time component while (µ = 1, 2, 3) are the space components. 4-position $x^\mu$, 4-displacement (the difference of two 4-positions) $\Delta x^\mu$ and 4-differential $dx^\mu$ are all 4-vectors. This follows from the linear, homogeneous nature of the coordinate transformation. The extension to higher rank tensor quantities is obvious:

\[ \bar{T}^{\mu\nu}{}_{\omega} = \Lambda^{\mu}{}_{\alpha}\, \Lambda^{\nu}{}_{\beta}\, \Lambda_{\omega}{}^{\sigma}\, T^{\alpha\beta}{}_{\sigma} \tag{484} \]

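17.6 Raising and lowering indices

The metric can be used to raise and lower indices. The contravariant and covariant components of a 4-vector are not the same.

\[ A^\mu = (A^0, \mathbf{A}) \quad \longrightarrow \quad A_\mu = \eta_{\mu\nu} A^\nu = (-A^0, \mathbf{A}) \tag{485} \]

Consider raising and lowering indices for a rank 2 tensor.

\[ \begin{aligned} T_{\nu}{}^{\mu} &= \eta_{\nu\alpha}\, T^{\alpha\mu} \\ T^{\mu}{}_{\nu} &= T^{\mu\alpha}\, \eta_{\alpha\nu} \\ T_{\mu\nu} &= \eta_{\mu\alpha}\, T^{\alpha\beta}\, \eta_{\beta\nu} \end{aligned} \tag{486} \]

Using

\[ (T^{\mu\nu}) = \begin{pmatrix} T^{00} & T^{01} & T^{02} & T^{03} \\ T^{10} & T^{11} & T^{12} & T^{13} \\ T^{20} & T^{21} & T^{22} & T^{23} \\ T^{30} & T^{31} & T^{32} & T^{33} \end{pmatrix} \tag{487} \]

we find, using the superscript to label the row and the subscript to label the column for the mixed tensors,

\[ (T_{\nu}{}^{\mu}) = \begin{pmatrix} -T^{00} & T^{10} & T^{20} & T^{30} \\ -T^{01} & T^{11} & T^{21} & T^{31} \\ -T^{02} & T^{12} & T^{22} & T^{32} \\ -T^{03} & T^{13} & T^{23} & T^{33} \end{pmatrix} \tag{488} \]

\[ (T^{\mu}{}_{\nu}) = \begin{pmatrix} -T^{00} & T^{01} & T^{02} & T^{03} \\ -T^{10} & T^{11} & T^{12} & T^{13} \\ -T^{20} & T^{21} & T^{22} & T^{23} \\ -T^{30} & T^{31} & T^{32} & T^{33} \end{pmatrix} \tag{489} \]

\[ (T_{\mu\nu}) = \begin{pmatrix} T^{00} & -T^{01} & -T^{02} & -T^{03} \\ -T^{10} & T^{11} & T^{12} & T^{13} \\ -T^{20} & T^{21} & T^{22} & T^{23} \\ -T^{30} & T^{31} & T^{32} & T^{33} \end{pmatrix} \tag{490} \]

If $T^{\mu\nu}$ is symmetric or skew-symmetric then so is $T_{\mu\nu}$. If $T^{\mu\nu}$ is symmetric then $T^{\mu}{}_{\nu} = T_{\nu}{}^{\mu}$ and we do not need to offset the indices in the mixed tensor. Also

\[ T_{\mu}{}^{\mu} = T^{\mu}{}_{\mu} = -T^{00} + T^{11} + T^{22} + T^{33} \tag{491} \]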

17.7 Inner product

The inner product of two 4-vectors is a scalar quantity and can be written as

\[ \eta_{\mu\nu} A^\mu B^\nu = A^\mu B_\mu = A_\mu B^\mu = \eta^{\mu\nu} A_\mu B_\nu = -A^0 B^0 + \mathbf{A} \cdot \mathbf{B} \tag{492} \]

The ‘length squared’ of a 4-vector is

\[ \eta_{\mu\nu} A^\mu A^\nu = A^\mu A_\mu = \eta^{\mu\nu} A_\mu A_\nu = -A^0 A^0 + \mathbf{A} \cdot \mathbf{A} \tag{493} \]

and need not be positive. We need to be careful in using the concept of length.
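Lowering an index and forming the inner product take only a few lines of code. The following is a minimal Python sketch with hypothetical names; 4-vectors are stored as tuples of contravariant components $(A^0, A^1, A^2, A^3)$ and the metric is the diagonal form of eq.(477).

```python
ETA_DIAG = (-1, 1, 1, 1)  # diagonal of the Lorentz metric, eq.(477)


def lower(A):
    """Covariant components A_mu = eta_{mu nu} A^nu, eq.(485)."""
    return tuple(e * a for e, a in zip(ETA_DIAG, A))


def inner(A, B):
    """Inner product eta_{mu nu} A^mu B^nu = -A^0 B^0 + A.B, eq.(492)."""
    return sum(e * a * b for e, a, b in zip(ETA_DIAG, A, B))


# Hypothetical 4-vectors used only as examples.
A = (2, 1, 2, 3)
B = (5, 3, 4, 0)
A_low = lower(A)
length_squared = inner(A, A)  # can be negative, zero or positive, eq.(493)
```

Contracting the lowered components with another vector, `sum(a * b for a, b in zip(lower(A), B))`, gives the same number as `inner(A, B)`, which is the first equality in eq.(492).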

17.8 Summary

• Greek letters are used for the indices.
• The indices range from 0 to 3, with 0 corresponding to time and 1, 2, 3 to space.
• $\eta_{\mu\nu}$ is the Lorentz metric. It is diagonal with elements (−1, 1, 1, 1).
• 4-position is $x^\mu = (ct, \mathbf{r})$.
• 4-vector $A^\mu = (A^0, \mathbf{A})$ has covariant components $A_\mu = (-A^0, \mathbf{A})$ and ‘length squared’ $A^\mu A_\mu = -A^0 A^0 + \mathbf{A} \cdot \mathbf{A}$.

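17.9 Lorentz transformations

Lorentz transformations preserve the line-element, so that

\[ (d\bar{s})^2 = (ds)^2 \tag{494} \]

\[ (d\bar{s})^2 = \eta_{\alpha\beta}\, d\bar{x}^\alpha\, d\bar{x}^\beta = \eta_{\alpha\beta}\, \Lambda^{\alpha}{}_{\mu}\, dx^\mu\, \Lambda^{\beta}{}_{\nu}\, dx^\nu \tag{495} \]

We can equate this to

\[ (ds)^2 = \eta_{\mu\nu}\, dx^\mu\, dx^\nu \tag{496} \]

Therefore

\[ \left[ \eta_{\alpha\beta}\, \Lambda^{\alpha}{}_{\mu}\, \Lambda^{\beta}{}_{\nu} - \eta_{\mu\nu} \right] dx^\mu\, dx^\nu = 0 \tag{497} \]

Since $dx^\mu$ is arbitrary and the quantity in the square brackets is symmetric we have

\[ \eta_{\alpha\beta}\, \Lambda^{\alpha}{}_{\mu}\, \Lambda^{\beta}{}_{\nu} = \eta_{\mu\nu} \tag{498} \]

These are the Lorentz conditions. They are analogous to the orthogonality conditions for Cartesian tensors which can be written as

\[ \delta_{ij}\, a_{ik}\, a_{jl} = \delta_{kl} \tag{499} \]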

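where $(a_{ij})$ is an orthogonal matrix and a summation is implied between repeated subscripts. Multiplying the Lorentz conditions by the inverse transformation you can show that

\[ \Lambda^{\mu}{}_{\nu} = \eta^{\mu\alpha}\, \eta_{\nu\beta}\, \Lambda_{\alpha}{}^{\beta} \qquad \text{and} \qquad \Lambda_{\nu}{}^{\mu} = \eta^{\mu\alpha}\, \eta_{\nu\beta}\, \Lambda^{\beta}{}_{\alpha} \tag{500} \]

The Lorentz transformation $\Lambda^{\mu}{}_{\nu}$ is not a tensor but the notation for the inverse $\Lambda_{\nu}{}^{\mu}$ is suggested by the raising and lowering of indices procedure. Written as a matrix the inverse is

\[ (\Lambda_{\nu}{}^{\mu}) = \begin{pmatrix} \Lambda^{0}{}_{0} & -\Lambda^{1}{}_{0} & -\Lambda^{2}{}_{0} & -\Lambda^{3}{}_{0} \\ -\Lambda^{0}{}_{1} & \Lambda^{1}{}_{1} & \Lambda^{2}{}_{1} & \Lambda^{3}{}_{1} \\ -\Lambda^{0}{}_{2} & \Lambda^{1}{}_{2} & \Lambda^{2}{}_{2} & \Lambda^{3}{}_{2} \\ -\Lambda^{0}{}_{3} & \Lambda^{1}{}_{3} & \Lambda^{2}{}_{3} & \Lambda^{3}{}_{3} \end{pmatrix} \tag{501} \]

In the 3 × 3 space-space part of the matrix we can recognise the matrix transpose which is characteristic of an orthogonal matrix. However in the first row and column the sign change is associated with the time part of the metric. The Lorentz conditions can be written in terms of the inverse transformation as

\[ \eta_{\alpha\beta}\, \Lambda_{\mu}{}^{\alpha}\, \Lambda_{\nu}{}^{\beta} = \eta_{\mu\nu} \tag{502} \]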

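The Lorentz conditions eq.(498) can be written in matrix form as

\[ (\Lambda^{\alpha}{}_{\mu})^{T}\, (\eta_{\alpha\beta})\, (\Lambda^{\beta}{}_{\nu}) = (\eta_{\mu\nu}) \tag{503} \]

Taking the determinant of both sides and remembering that $\det(AB) = \det(A)\det(B)$ and $\det(A^T) = \det(A)$ gives

\[ \left[ \det(\Lambda^{\mu}{}_{\nu}) \right]^2 = 1 \tag{504} \]

so that $\det(\Lambda^{\mu}{}_{\nu}) = \pm 1$. Also let µ = ν = 0 in eq.(498)

\[ \begin{aligned} \eta_{\alpha\beta}\, \Lambda^{\alpha}{}_{0}\, \Lambda^{\beta}{}_{0} &= \eta_{00} \\ -\left( \Lambda^{0}{}_{0} \right)^2 + \left( \Lambda^{1}{}_{0} \right)^2 + \left( \Lambda^{2}{}_{0} \right)^2 + \left( \Lambda^{3}{}_{0} \right)^2 &= -1 \end{aligned} \tag{505} \]

Therefore

\[ \left( \Lambda^{0}{}_{0} \right)^2 = 1 + \left( \Lambda^{1}{}_{0} \right)^2 + \left( \Lambda^{2}{}_{0} \right)^2 + \left( \Lambda^{3}{}_{0} \right)^2 \ge 1 \tag{506} \]

so that $\Lambda^{0}{}_{0} \ge 1$ (called orthochronous) or $\Lambda^{0}{}_{0} \le -1$. An orthochronous transformation preserves the direction of time since

\[ c\bar{t} = \Lambda^{0}{}_{0}\, ct + \cdots \tag{507} \]

Only orthochronous transformations with $\det(\Lambda^{\mu}{}_{\nu}) = 1$ are continuous with the identity transformation. This means that the transformation can be generated from the identity transformation by a sequence of infinitesimal transformations, i.e. the transformation depends on parameters that can be continuously varied. Proper or restricted Lorentz transformations have $\det(\Lambda^{\mu}{}_{\nu}) = 1$ and $\Lambda^{0}{}_{0} \ge 1$. All other Lorentz transformations are improper.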

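17.10 Boost

A Lorentz transformation corresponding to uniform motion of the inertial frames is known as a boost. It is sometimes referred to as a rotation in space-time. For uniform motion with speed u along the x-axis,

\[ \begin{aligned} \bar{t} &= \gamma \left( t - \frac{ux}{c^2} \right) \\ \bar{x} &= \gamma\,(x - ut) \\ \bar{y} &= y \\ \bar{z} &= z \end{aligned} \tag{508} \]

where

\[ \gamma = \frac{1}{\sqrt{1 - \left( \dfrac{u}{c} \right)^2}} \tag{509} \]

Write it as

\[ \begin{aligned} c\bar{t} &= \gamma \left( ct - \frac{u}{c}\,x \right) \\ \bar{x} &= \gamma \left( -\frac{u}{c}\,ct + x \right) \\ \bar{y} &= y \\ \bar{z} &= z \end{aligned} \tag{510} \]

In matrix form we have

\[ \begin{pmatrix} c\bar{t} \\ \bar{x} \\ \bar{y} \\ \bar{z} \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma \frac{u}{c} & 0 & 0 \\ -\gamma \frac{u}{c} & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} \tag{511} \]

This is in the form

\[ \bar{x}^\mu = \Lambda^{\mu}{}_{\nu}\, x^\nu \tag{512} \]

It is straightforward to show that $\det(\Lambda^{\mu}{}_{\nu}) = 1$ and $\Lambda^{0}{}_{0} \ge 1$. Therefore this is a proper Lorentz transformation. As $u \to 0$, $\gamma \to 1$ and the transformation matrix $\Lambda^{\mu}{}_{\nu}$ becomes the identity matrix. We say that it is continuous with the identity. The inverse transformation corresponds to reversing the direction of relative motion.

\[ (\Lambda_{\nu}{}^{\mu}) = \begin{pmatrix} \gamma & \gamma \frac{u}{c} & 0 & 0 \\ \gamma \frac{u}{c} & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{513} \]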

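In general for uniform motion with velocity $\mathbf{u} = (u_x, u_y, u_z)$ we have

\[ (\Lambda^{\mu}{}_{\nu}) = \begin{pmatrix} \gamma & -\gamma \frac{u_x}{c} & -\gamma \frac{u_y}{c} & -\gamma \frac{u_z}{c} \\ -\gamma \frac{u_x}{c} & 1 + (\gamma - 1) \frac{u_x^2}{u^2} & (\gamma - 1) \frac{u_x u_y}{u^2} & (\gamma - 1) \frac{u_x u_z}{u^2} \\ -\gamma \frac{u_y}{c} & (\gamma - 1) \frac{u_y u_x}{u^2} & 1 + (\gamma - 1) \frac{u_y^2}{u^2} & (\gamma - 1) \frac{u_y u_z}{u^2} \\ -\gamma \frac{u_z}{c} & (\gamma - 1) \frac{u_z u_x}{u^2} & (\gamma - 1) \frac{u_z u_y}{u^2} & 1 + (\gamma - 1) \frac{u_z^2}{u^2} \end{pmatrix} \tag{514} \]

where in γ we use $u = \sqrt{u_x^2 + u_y^2 + u_z^2}$.

17.11 Space-rotation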

$\Lambda^{\mu}{}_{\nu}$ for a boost should be distinguished from a rotation in space only. If R is a 3 × 3 orthogonal matrix representing a proper rotation in space (excluding reflections) then

\[ (\Lambda^{\mu}{}_{\nu}) = \begin{pmatrix} 1 & \mathbf{0}^{T} \\ \mathbf{0} & R \end{pmatrix} \tag{515} \]

is also a proper Lorentz transformation since det(R) = 1 implies det(Λ) = 1. Here $\mathbf{0}$ denotes a column of three zeros. The inverse transformation is

\[ (\Lambda_{\nu}{}^{\mu}) = \begin{pmatrix} 1 & \mathbf{0}^{T} \\ \mathbf{0} & R^{T} \end{pmatrix} \tag{516} \]

17.12 Lorentz group

The homogeneous Lorentz transformations form a group. The proper Lorentz transformations form a subgroup. In general a proper Lorentz transformation consists of a space rotation (no change in time) followed by a boost (time changes) from one inertial frame to another, i.e.

\[ \Lambda(\text{proper}) = \Lambda(\text{boost})\, \Lambda(\text{space-rotation}) \tag{517} \]

The product of two collinear boosts with speeds $u_1$ and $u_2$ is a boost with speed u

\[ \Lambda(u) = \Lambda(u_2)\, \Lambda(u_1) \tag{518} \]

where

\[ u = \frac{u_1 + u_2}{1 + \dfrac{u_1 u_2}{c^2}} \tag{519} \]

This is the famous ‘addition of velocities’ formula. If the boosts are not collinear then the product is equivalent to a boost followed by a space rotation, i.e. it is not a pure boost.

The Lorentz conditions eq.(498) are 10 constraints on the 16 elements of Λ. Thus Lorentz transformations have 6 independent parameters. They form a 6-parameter group. For the proper subgroup these parameters can be identified as the 3 parameters needed to specify the space-rotation, e.g. Euler’s angles, and the 3 components of $\mathbf{u} = (u_x, u_y, u_z)$ needed to specify the boost.
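The group properties described above can be explored numerically. The sketch below is a Python illustration with hypothetical names, working in units where c = 1 so that a boost is specified by the three components of u/c. It builds the general boost matrix of eq.(514), tests the Lorentz conditions in the matrix form of eq.(503), checks the collinear composition rule of eqs.(518) and (519), and shows that the product of two non-collinear boosts is still a Lorentz transformation but is no longer a symmetric matrix, so it is not a pure boost.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def boost_matrix(beta):
    """General boost of eq.(514) for beta = (u_x, u_y, u_z)/c, in units with c = 1."""
    b2 = sum(b * b for b in beta)
    L = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    if b2 == 0.0:
        return L  # no relative motion: identity
    g = 1.0 / math.sqrt(1.0 - b2)
    L[0][0] = g
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = -g * beta[i]
        for j in range(3):
            L[i + 1][j + 1] += (g - 1.0) * beta[i] * beta[j] / b2
    return L


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [[A[j][i] for j in range(4)] for i in range(4)]


def is_close(A, B, tol=1e-10):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))


def satisfies_lorentz_conditions(L):
    """Matrix form of the Lorentz conditions, eq.(503): L^T eta L = eta."""
    return is_close(matmul(matmul(transpose(L), ETA), L), ETA)


general = boost_matrix((0.3, -0.4, 0.5))
collinear = matmul(boost_matrix((0.5, 0.0, 0.0)), boost_matrix((0.5, 0.0, 0.0)))
mixed = matmul(boost_matrix((0.0, 0.5, 0.0)), boost_matrix((0.5, 0.0, 0.0)))
```

Here `collinear` should match a single boost with u/c = 0.8, the value given by eq.(519) for two boosts of 0.5c.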

17.13 Poincare group

The inhomogeneous Lorentz transformation is

\[ \bar{x}^\mu = \Lambda^{\mu}{}_{\nu}\, x^\nu + a^\mu \tag{520} \]

where $a^\mu$ is a constant 4-vector. These transformations form a 10-parameter group known as the Poincare group. The 6 parameters of the Lorentz group are supplemented by the 4 components of $a^\mu$.

17.14 Boost in hyperbolic form

The ‘addition of velocities’ formula can be conveniently derived by using the hyperbolic form of a boost. Let

\[ \frac{u}{c} = \tanh(\phi) \tag{521} \]

where φ is known as the boost parameter or rapidity. See Figure 8 for the hyperbolic functions.

[Figure 8: Hyperbolic functions. Plot of cosh x, sinh x and tanh x for x from −2 to 2.]

Then using $\operatorname{sech}(\phi) = 1/\cosh(\phi)$ and $\operatorname{sech}^2(\phi) = 1 - \tanh^2(\phi)$ we have

\[ \cosh(\phi) = \gamma \qquad \text{and} \qquad \sinh(\phi) = \cosh(\phi) \tanh(\phi) = \gamma\,\frac{u}{c} \tag{522} \]

and

\[ \exp(\pm\phi) = \cosh(\phi) \pm \sinh(\phi) = \gamma \left( 1 \pm \frac{u}{c} \right) \tag{523} \]

Using these relations we can rewrite the Lorentz transformation corresponding to a boost of speed u in the positive x-direction as

\[ (\Lambda^{\mu}{}_{\nu}) = \begin{pmatrix} \cosh(\phi) & -\sinh(\phi) & 0 & 0 \\ -\sinh(\phi) & \cosh(\phi) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{524} \]

The coordinate transformation can be written as

\[ \begin{aligned} c\bar{t} &= ct \cosh(\phi) - x \sinh(\phi) \\ \bar{x} &= -ct \sinh(\phi) + x \cosh(\phi) \end{aligned} \tag{525} \]

giving

\[ (c\bar{t} + \bar{x}) = \exp(-\phi)\,(ct + x) \qquad \text{and} \qquad (c\bar{t} - \bar{x}) = \exp(\phi)\,(ct - x) \tag{526} \]

Multiplying these two equations gives:

\[ c^2 \bar{t}^2 - \bar{x}^2 = c^2 t^2 - x^2 \tag{527} \]

which is consistent with $s^2$ being invariant. The inverse of a Lorentz transformation with rapidity φ is a Lorentz transformation with rapidity −φ since tanh(φ) is an odd function.

\[ \tanh(-\phi) = -\tanh(\phi) = -\frac{u}{c} \tag{528} \]

The product of two Lorentz transformations with rapidities $\phi_1$ and $\phi_2$ is a Lorentz transformation with rapidity $\phi = \phi_1 + \phi_2$. Consider

\[ \begin{aligned} \Lambda(\phi_2)\, \Lambda(\phi_1) &= \begin{pmatrix} \cosh(\phi_2) & -\sinh(\phi_2) & 0 & 0 \\ -\sinh(\phi_2) & \cosh(\phi_2) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cosh(\phi_1) & -\sinh(\phi_1) & 0 & 0 \\ -\sinh(\phi_1) & \cosh(\phi_1) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \\ &= \begin{pmatrix} \cosh(\phi_1 + \phi_2) & -\sinh(\phi_1 + \phi_2) & 0 & 0 \\ -\sinh(\phi_1 + \phi_2) & \cosh(\phi_1 + \phi_2) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \\ &= \Lambda(\phi_1 + \phi_2) \end{aligned} \tag{529} \]

We have used the hyperbolic identities

\[ \begin{aligned} \sinh(\phi_1 + \phi_2) &= \cosh(\phi_1) \sinh(\phi_2) + \sinh(\phi_1) \cosh(\phi_2) \\ \cosh(\phi_1 + \phi_2) &= \cosh(\phi_1) \cosh(\phi_2) + \sinh(\phi_1) \sinh(\phi_2) \end{aligned} \tag{530} \]

Using the hyperbolic identity

\[ \tanh(\phi_1 + \phi_2) = \frac{\tanh(\phi_1) + \tanh(\phi_2)}{1 + \tanh(\phi_1) \tanh(\phi_2)} \tag{531} \]

we can convert the rapidities into speeds of relative motion

\[ u = \frac{u_1 + u_2}{1 + \dfrac{u_1 u_2}{c^2}} \tag{532} \]

This is the relativistic sum of the collinear speeds $u_1$ and $u_2$. When $u_1 u_2 \ll c^2$ then $u = u_1 + u_2$ as in non-relativistic mechanics.

u1 = u2     u
0.5c        0.8c
0.9c        0.994c
0.99c       0.99995c

Thus it is not possible by repeated boosts to achieve a relative speed greater than c. The speed of light acts as an upper limit to speed. In the general case of a boost corresponding to relative motion with velocity $\mathbf{u}$ there are 3 boost parameters.
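The additivity of rapidities gives a compact way to combine collinear boosts. The following Python sketch uses hypothetical function names and speeds expressed as fractions of c; it is an illustration only. It converts speeds to rapidities with eq.(521), adds them, converts back, and compares the result with the direct formula eq.(532).

```python
import math


def rapidity(beta):
    """phi from u/c = tanh(phi), eq.(521)."""
    return math.atanh(beta)


def compose_by_rapidity(*betas):
    """Combine collinear boosts by adding their rapidities and converting back to u/c."""
    return math.tanh(sum(rapidity(b) for b in betas))


def add_velocities(b1, b2):
    """Direct relativistic sum of two collinear speeds, eq.(532), with c = 1."""
    return (b1 + b2) / (1.0 + b1 * b2)


for b in (0.5, 0.9, 0.99):
    print(f"{b}c + {b}c = {compose_by_rapidity(b, b):.5f}c")

five_boosts = compose_by_rapidity(0.9, 0.9, 0.9, 0.9, 0.9)
```

Because tanh never reaches 1 for finite argument, any finite number of boosts combined this way stays below c, which is the statement made at the end of this section.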

17.15 Objective of tensor formulations

The laws of physics are required to be the same in all inertial frames. By a frame we mean rigid rectangular axes (Cartesian coordinates) and a stationary clock (time). The inertial frames are related by Lorentz transformations. The property that characterises the Lorentz transformation is that the speed of light is constant. We restrict coordinate transformations to being Lorentz. Our objective is to express the laws of physics in the form of tensor equations. Then these equations will be the same in all inertial frames, thus fulfilling the Special Principle of Relativity.

17.16 Minkowski diagram

A happening at a point in space with Cartesian coordinates $\mathbf{r}$ at time t referred to a frame S is called an event in S. Minkowski space-time is sometimes called the world. Events are then world-points while a particle moving in space-time is said to have a world-line. $x^\mu x_\mu$ is the square of the distance of the world-point $x^\mu$ from the origin. The distance between two world-points $x^\mu$ and $y^\mu$ is given by

$$(\Delta s)^2 = \Delta x^\mu \Delta x_\mu = -(c\,\Delta t)^2 + (\Delta\mathbf{r})^2 \tag{533}$$

where $y^\mu = x^\mu + \Delta x^\mu$ and $\Delta x^\mu = (c\Delta t, \Delta\mathbf{r})$ is the displacement vector. If we take the limit of differentials then

$$(ds)^2 = dx^\mu dx_\mu = -(c\,dt)^2 + (d\mathbf{r})^2 \tag{534}$$

Both $\Delta x^\mu$ and $dx^\mu$ are 4-vectors, i.e. they transform between inertial frames in the same way as $x^\mu$.

Consider two events P and Q in an inertial frame S. Let P be at the origin and consider the events Q relative to it. The distance squared between the events is

$$(\Delta s)^2 = -(c\,\Delta t)^2 + (\Delta\mathbf{r})^2 \tag{535}$$

While $(\Delta\mathbf{r})^2$ is never negative, $(\Delta s)^2$ can be negative. We can classify the events Q in three ways:

1. $(\Delta s)^2 > 0$ space-like
2. $(\Delta s)^2 < 0$ time-like
3. $(\Delta s)^2 = 0$ light-like

For space-like points

$$(\Delta\mathbf{r})^2 > (c\,\Delta t)^2 \tag{536}$$

In this case we cannot have $(\Delta\mathbf{r})^2 = 0$. It follows that the events P and Q are always separated in space. For time-like points

$$(\Delta\mathbf{r})^2 < (c\,\Delta t)^2 \tag{537}$$

In this case we cannot have $(\Delta t)^2 = 0$. It follows that the events P and Q are always separated in time.

The two regions are separated by the light cone $(\Delta s)^2 = 0$. If the displacement vector lies on the light cone it is called a null vector. The world-line of a photon passing through P lies on the light cone. The world-line of a particle passing through P lies entirely within the light cone, i.e. it is time-like. This follows from the particle speed being less than the speed of light. The light cone can be regarded as the history of a spherical light front which converges on P and then diverges away from it.

In a Minkowski diagram, see Figure 9, one of the axes is for time. In three dimensions two of the dimensions will be for space and one for time. In this case the light front is a circle whose motion in time forms a cone. In full space-time (which we cannot draw) the light front is a sphere. Time-like points then lie within the spheres. Every event has its own light cone.

The event P divides the space-time into two distinct parts corresponding to the sign of $\Delta t$. If $\Delta t < 0$ then the event Q is before P, in the past. If $\Delta t > 0$ then the event Q is after P, in the future. This is related to causality.

We can extend these classifications to any 4-vector $A^\mu$. This is done by looking at the inner product $A^\mu A_\mu$.

1. $A^\mu A_\mu > 0$ space-like
2. $A^\mu A_\mu < 0$ time-like
3. $A^\mu A_\mu = 0$ light-like

If $A^0 > 0$ then the 4-vector is future-pointing. If $A^0 < 0$ then the 4-vector is past-pointing.
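The classification of a displacement can be sketched in a few lines. This is only an illustration, assuming the signature (−,+,+,+) used in these notes and c = 1 by default; the names `interval_squared` and `classify` are hypothetical. Note that the exact comparison with zero is fragile in floating point, so a tolerance would be needed for real data.

```python
def interval_squared(dt, dr, c=1.0):
    """(Delta s)^2 = -(c dt)^2 + |dr|^2, signature (-,+,+,+)."""
    return -(c * dt) ** 2 + sum(x * x for x in dr)

def classify(dt, dr, c=1.0):
    """Return the type of the displacement (c dt, dr)."""
    s2 = interval_squared(dt, dr, c)
    if s2 > 0:
        return "space-like"
    if s2 < 0:
        return "time-like"
    return "light-like"

print(classify(1.0, (2.0, 0.0, 0.0)))  # separated mainly in space
print(classify(2.0, (1.0, 0.0, 0.0)))  # separated mainly in time
print(classify(1.0, (1.0, 0.0, 0.0)))  # on the light cone
```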

[Figure 9 shows axes ct, x and y with the light-cone x = ct. Inside the light-cone is time-like relative to the origin; outside the light-cone is space-like relative to the origin. t > 0 is the future and t < 0 is the past. The path of a particle that passes through the origin lies within the light-cone.]

Figure 9: Minkowski diagram

We define the proper time interval, $\Delta\tau$, between the events to be

$$(\Delta\tau)^2 = -\frac{(\Delta s)^2}{c^2} = (\Delta t)^2 - \frac{(\Delta\mathbf{r})^2}{c^2} \tag{538}$$

Since $(\Delta s)^2$ and $c^2$ are invariants it follows that $(\Delta\tau)^2$ is an invariant.

18 Relativistic mechanics

18.1 4-velocity

Consider the motion of a particle with world-point $x^\mu(\tau)$ in an inertial frame S. The motion is parameterised by $\tau$, the proper time measured along the world-line. You can imagine this as the time registered by a clock carried by the particle. For a particle its 4-position $x^\mu$ is time-like so that $x^\mu x_\mu < 0$, i.e. the particle's world-line lies within the light-cone since it cannot move faster than the speed of light. The particle is moving with 3-velocity $\mathbf{v} = d\mathbf{r}/dt$. It is at rest in the inertial frame that moves with velocity $\mathbf{v}$ relative to S. The 4-velocity is defined as

$$v^\mu = \frac{dx^\mu}{d\tau} \tag{539}$$

Since $dx^\mu$ is a 4-vector and $d\tau$ is an invariant it follows that $v^\mu$ is a 4-vector. We know from the discussion of time dilation that

$$\gamma_v\, d\tau = dt \quad\text{where}\quad \gamma_v = \frac{1}{\sqrt{1 - \dfrac{v^2}{c^2}}} \tag{540}$$

Since $x^\mu = (ct, \mathbf{r})$

$$v^\mu = \gamma_v\,(c, \mathbf{v}) \tag{541}$$

It is easy to show that $v^\mu v_\mu = -c^2 < 0$, i.e. 4-velocity is time-like.

A simple way to show this is to choose the inertial frame in which the particle is at rest so that $v^\mu = (c, \mathbf{0})$. We can do this because $v^\mu v_\mu$ is an invariant, i.e. the same in all inertial frames.
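The invariant can also be checked numerically in a frame where the particle is moving. The sketch below assumes c = 1 by default and the signature (−,+,+,+); the helper names `gamma`, `four_velocity` and `minkowski_dot` are illustrative only.

```python
import math

def gamma(v, c=1.0):
    """Lorentz factor for a 3-velocity v given as a 3-tuple."""
    speed2 = sum(x * x for x in v)
    return 1.0 / math.sqrt(1.0 - speed2 / c**2)

def four_velocity(v, c=1.0):
    """v^mu = gamma_v (c, v), eq.(541)."""
    g = gamma(v, c)
    return (g * c, g * v[0], g * v[1], g * v[2])

def minkowski_dot(a, b):
    """Inner product A^mu B_mu with signature (-,+,+,+)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

V = four_velocity((0.3, 0.4, 0.5))
print(minkowski_dot(V, V))  # close to -c^2 = -1
```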

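18.2 Transformation of 4-velocity

Consider a Lorentz transformation corresponding to a boost u in the x-direction. Then we have

$$\begin{pmatrix} \gamma_{\bar v}\, c \\ \gamma_{\bar v}\, \bar{v}_x \\ \gamma_{\bar v}\, \bar{v}_y \\ \gamma_{\bar v}\, \bar{v}_z \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\frac{u}{c} & 0 & 0 \\ -\gamma\frac{u}{c} & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \gamma_v\, c \\ \gamma_v\, v_x \\ \gamma_v\, v_y \\ \gamma_v\, v_z \end{pmatrix} \tag{542}$$

$$\begin{aligned}
\gamma_{\bar v}\, c &= \gamma_v\,\gamma\, c\left(1 - \frac{u v_x}{c^2}\right) \\
\gamma_{\bar v}\, \bar{v}_x &= \gamma\,\gamma_v\,(v_x - u) \\
\gamma_{\bar v}\, \bar{v}_y &= \gamma_v\, v_y \\
\gamma_{\bar v}\, \bar{v}_z &= \gamma_v\, v_z
\end{aligned} \tag{543}$$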

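It follows that

$$\gamma_{\bar v} = \gamma\,\gamma_v\left(1 - \frac{u v_x}{c^2}\right) \tag{544}$$

This shows how the $\gamma_v$ factor for the particle transforms. Using this the other equations become:

$$\bar{v}_x = \frac{v_x - u}{1 - \dfrac{u v_x}{c^2}}, \qquad \bar{v}_y = \frac{v_y}{\gamma\left(1 - \dfrac{u v_x}{c^2}\right)}, \qquad \bar{v}_z = \frac{v_z}{\gamma\left(1 - \dfrac{u v_x}{c^2}\right)} \tag{545}$$

The general vector form of the velocity transformation is

$$\bar{\mathbf{v}} = \frac{\mathbf{v} + \left[\dfrac{\gamma - 1}{u^2}\,(\mathbf{u}\cdot\mathbf{v}) - \gamma\right]\mathbf{u}}{\gamma\left(1 - \dfrac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)} \tag{546}$$

with

$$\gamma_{\bar v} = \gamma\,\gamma_v\left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right) \tag{547}$$

As you would expect $\bar{\mathbf{v}}$ is a linear combination of $\mathbf{v}$ and $\mathbf{u}$. The inverse transformation is

$$\mathbf{v} = \frac{\bar{\mathbf{v}} + \left[\dfrac{\gamma - 1}{u^2}\,(\mathbf{u}\cdot\bar{\mathbf{v}}) + \gamma\right]\mathbf{u}}{\gamma\left(1 + \dfrac{\mathbf{u}\cdot\bar{\mathbf{v}}}{c^2}\right)} \tag{548}$$

When $\bar{\mathbf{v}} = \mathbf{0}$ then $\mathbf{v} = \mathbf{u}$.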

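The above transformation equations can be obtained by analogy with the transformation of 4-position. Since 4-velocity is a 4-vector it must transform in the same way as the 4-position. Therefore consider

$$x^\mu = (ct, \mathbf{r}) \quad\text{and}\quad v^\mu = (c\,\gamma_v,\ \gamma_v \mathbf{v}) \tag{549}$$

4-position transforms as

$$\begin{aligned}
\bar{t} &= \gamma\left(t - \frac{\mathbf{u}\cdot\mathbf{r}}{c^2}\right) \\
\bar{\mathbf{r}} &= \mathbf{r} + \left[\frac{\gamma - 1}{u^2}\,(\mathbf{u}\cdot\mathbf{r}) - \gamma t\right]\mathbf{u}
\end{aligned} \tag{550}$$

Letting $t \to \gamma_v$ and $\mathbf{r} \to \gamma_v \mathbf{v}$ gives

$$\begin{aligned}
\gamma_{\bar v} &= \gamma\,\gamma_v\left(1 - \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right) \\
\gamma_{\bar v}\,\bar{\mathbf{v}} &= \gamma_v\left\{\mathbf{v} + \left[\frac{\gamma - 1}{u^2}\,(\mathbf{u}\cdot\mathbf{v}) - \gamma\right]\mathbf{u}\right\}
\end{aligned} \tag{551}$$

Returning to eq.(545), the case of $\mathbf{u} = u\,\mathbf{i}$, consider the inverse transformation

$$v_x = \frac{\bar{v}_x + u}{1 + \dfrac{u \bar{v}_x}{c^2}} \tag{552}$$

which is just the result we obtained earlier for two successive collinear Lorentz transformations.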
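The general vector form of the velocity transformation is easy to test numerically. The sketch below assumes c = 1 by default; `boost_velocity` and the other names are illustrative. It implements eq.(546), and obtains the inverse of eq.(548) by boosting with −u.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)

def boost_velocity(v, u, c=1.0):
    """3-velocity v as seen from a frame moving with velocity u, eq.(546)."""
    u2 = dot(u, u)
    if u2 == 0.0:
        return tuple(v)  # no boost
    g = gamma(u, c)
    uv = dot(u, v)
    coeff = (g - 1.0) / u2 * uv - g
    denom = g * (1.0 - uv / c**2)
    return tuple((vi + coeff * ui) / denom for vi, ui in zip(v, u))

v = (0.3, 0.2, -0.1)
u = (0.5, 0.0, 0.0)
v_bar = boost_velocity(v, u)
v_back = boost_velocity(v_bar, tuple(-ui for ui in u))  # inverse, eq.(548)
print(v_bar, v_back)
```

For a boost along x the first component reduces to eq.(545), and the γ factors satisfy eq.(547).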

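18.3 4-acceleration

The 4-acceleration is defined to be

$$\begin{aligned}
f^\mu &= \frac{dv^\mu}{d\tau} \\
&= \gamma_v\,\frac{dv^\mu}{dt} \\
&= \gamma_v\,\frac{d}{dt}\,(c\,\gamma_v,\ \gamma_v \mathbf{v}) \\
&= \gamma_v\left(c\,\frac{d\gamma_v}{dt},\ \frac{d\gamma_v}{dt}\,\mathbf{v} + \gamma_v\,\mathbf{f}\right) \\
&= \gamma_v^2\left(\frac{\gamma_v^2}{c}\,(\mathbf{v}\cdot\mathbf{f}),\ \frac{\gamma_v^2}{c^2}\,(\mathbf{v}\cdot\mathbf{f})\,\mathbf{v} + \mathbf{f}\right)
\end{aligned} \tag{553}$$

where $\mathbf{f} = d\mathbf{v}/dt$ is the 3-acceleration. To show this you need

$$\begin{aligned}
\frac{d\gamma_v}{dt} &= \frac{d}{dt}\left(1 - \frac{v^2}{c^2}\right)^{-1/2} \\
&= \frac{v}{c^2}\left(1 - \frac{v^2}{c^2}\right)^{-3/2}\frac{dv}{dt} \\
&= \frac{\gamma_v^3}{c^2}\,v\,\frac{dv}{dt} \\
&= \frac{\gamma_v^3}{c^2}\,(\mathbf{v}\cdot\mathbf{f})
\end{aligned} \tag{554}$$

Let $\mathbf{f} = \boldsymbol{\alpha}$ be the rest or proper acceleration of the particle, i.e. the acceleration in the rest frame of the particle. In this frame $\mathbf{v} = \mathbf{0}$ so that $f^\mu = (0, \boldsymbol{\alpha})$ giving

$$f^\mu f_\mu = \alpha^2 > 0 \tag{555}$$

and $f^\mu$ is a space-like vector.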
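As a numerical cross-check of eq.(553) and eq.(554), the sketch below (assuming c = 1 by default, with illustrative helper names) compares the analytic derivative of γ with a central finite difference and confirms that the 4-acceleration is orthogonal to the 4-velocity, which follows from differentiating the invariant $v^\mu v_\mu = -c^2$.

```python
import math

def dot3(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - dot3(v, v) / c**2)

def four_velocity(v, c=1.0):
    g = gamma(v, c)
    return (g * c,) + tuple(g * vi for vi in v)

def four_acceleration(v, f, c=1.0):
    """f^mu from the last line of eq.(553); f is the 3-acceleration dv/dt."""
    g = gamma(v, c)
    vf = dot3(v, f)
    time_part = g**4 * vf / c
    space_part = tuple(g**2 * (g**2 * vf / c**2 * vi + fi) for vi, fi in zip(v, f))
    return (time_part,) + space_part

def minkowski_dot(a, b):
    return -a[0] * b[0] + dot3(a[1:], b[1:])

v = (0.3, 0.4, 0.0)
f = (0.1, -0.2, 0.3)
h = 1e-6  # step for the finite-difference estimate of d(gamma)/dt
v_plus = tuple(vi + h * fi for vi, fi in zip(v, f))
v_minus = tuple(vi - h * fi for vi, fi in zip(v, f))
numeric = (gamma(v_plus) - gamma(v_minus)) / (2.0 * h)
analytic = gamma(v) ** 3 * dot3(v, f)  # eq.(554) with c = 1
print(numeric, analytic)
print(minkowski_dot(four_acceleration(v, f), four_velocity(v)))  # close to 0
```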

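18.4 Transformation of 4-acceleration

Consider

$$\begin{aligned}
\bar{\mathbf{f}} &= \frac{d\bar{\mathbf{v}}}{d\bar{t}} \\
&= \frac{1}{\gamma\left(1 - \dfrac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)}\,\frac{d\bar{\mathbf{v}}}{dt} \\
&= \frac{1}{\gamma\left(1 - \dfrac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)}\,\frac{d}{dt}\left\{\frac{1}{\gamma\left(1 - \dfrac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)}\left(\mathbf{v} + \left[\frac{\gamma - 1}{u^2}\,(\mathbf{u}\cdot\mathbf{v}) - \gamma\right]\mathbf{u}\right)\right\}
\end{aligned} \tag{556}$$

After some algebra this can be written as a linear combination of the vectors $\mathbf{f}$, $\mathbf{v}$ and $\mathbf{u}$:

$$\bar{\mathbf{f}} = \frac{1}{\gamma^2\left(1 - \dfrac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right)^2}\left\{\mathbf{f} + \frac{1}{1 - \dfrac{\mathbf{u}\cdot\mathbf{v}}{c^2}}\,\frac{\mathbf{u}\cdot\mathbf{f}}{c^2}\,\mathbf{v} - \frac{1}{1 - \dfrac{\mathbf{u}\cdot\mathbf{v}}{c^2}}\,\frac{\gamma - 1}{\gamma}\,\frac{\mathbf{u}\cdot\mathbf{f}}{u^2}\,\mathbf{u}\right\} \tag{557}$$

When $\mathbf{v} = \mathbf{0}$

$$\bar{\mathbf{f}} = \frac{1}{\gamma^2}\left\{\mathbf{f} - \frac{\gamma - 1}{\gamma}\,\frac{\mathbf{u}\cdot\mathbf{f}}{u^2}\,\mathbf{u}\right\} \tag{558}$$

For a boost u in the x-direction

$$\begin{aligned}
\bar{f}_x &= \frac{1}{\gamma^3\left(1 - \dfrac{u v_x}{c^2}\right)^3}\,f_x \\
\bar{f}_y &= \frac{1}{\gamma^2\left(1 - \dfrac{u v_x}{c^2}\right)^3}\left[f_y - \frac{u}{c^2}\,(v_x f_y - v_y f_x)\right] \\
\bar{f}_z &= \frac{1}{\gamma^2\left(1 - \dfrac{u v_x}{c^2}\right)^3}\left[f_z - \frac{u}{c^2}\,(v_x f_z - v_z f_x)\right]
\end{aligned} \tag{559}$$

which simplify further when $\mathbf{v} = \mathbf{0}$

$$\bar{f}_x = \frac{1}{\gamma^3}\,f_x, \qquad \bar{f}_y = \frac{1}{\gamma^2}\,f_y, \qquad \bar{f}_z = \frac{1}{\gamma^2}\,f_z \tag{560}$$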

18.5 4-momentum and 4-force

In looking for a relativistic formulation we are led by analogy with Newton's second law to consider a 4-vector formulation

$$F^\mu = \frac{dp^\mu}{d\tau} \tag{561}$$

where $F^\mu$ is the 4-force, $p^\mu$ is the 4-momentum and $\tau$ is the proper time. The natural definition of $p^\mu$ is

$$p^\mu = m_o\, v^\mu \tag{562}$$

where $v^\mu$ is the 4-velocity and $m_o$ is the inertial mass of the particle in its rest frame. This is known as the rest or proper mass and is an invariant.

$$F^\mu = \frac{dp^\mu}{d\tau} = m_o\,\frac{dv^\mu}{d\tau} = m_o\, f^\mu \tag{563}$$

where we have assumed that the rest mass does not vary during the motion. Since $v^\mu = \gamma_v\,(c, \mathbf{v})$ it follows that

$$p^\mu = m\,(c, \mathbf{v}) = (mc, \mathbf{p}) \tag{564}$$

where $\mathbf{p} = m\mathbf{v}$ is the Newtonian 3-momentum of the particle and $m = \gamma_v m_o$ is the inertial mass of the particle. The inertial mass depends on the velocity and increases to infinity as v approaches c. This is not surprising as there must be some process which prevents particles from being accelerated beyond the speed of light.

The variation of the mass of a particle with velocity was first observed by Kaufmann (1901) in experiments designed to measure $e/(mc)$ for fast electrons emitted by Radium C. Assuming that e was constant, Kaufmann was led to suggest that mass increased with velocity. Using a model based on the ether Abraham (1903) suggested that

$$\frac{m - m_o}{m_o} = 0.4\,\frac{v^2}{c^2} + \dots \tag{565}$$

Relativity gives

$$\frac{m - m_o}{m_o} = 0.5\,\frac{v^2}{c^2} + \dots \tag{566}$$

and experimental results are much closer to the relativistic formula, which was deduced by Lorentz before Einstein's paper in 1905.

The clearest indication of the validity of relativistic mechanics is the success of the high energy accelerators developed in recent years. These are all designed using relativistic mechanics and produce particles whose speed approaches that of light.

18.6 Mass-energy relationship

The 4-force can be written as

$$\begin{aligned}
F^\mu &= \frac{d}{d\tau}\,(mc, \mathbf{p}) \\
&= \gamma_v\left(c\,\frac{dm}{dt},\ \frac{d\mathbf{p}}{dt}\right) \\
&= \gamma_v\left(c\,\frac{dm}{dt},\ \mathbf{F}\right)
\end{aligned} \tag{567}$$

where $\mathbf{F} = d\mathbf{p}/dt$ is the Newtonian 3-force. Consider the 4-velocity $v^\mu$. We know that $v^\mu v_\mu = -c^2$ so that

$$\frac{d(v^\mu v_\mu)}{d\tau} = \frac{d(-c^2)}{d\tau} = 0 \tag{568}$$

since c is constant. But

$$\frac{d(v^\mu v_\mu)}{d\tau} = 2\,v_\mu\,\frac{dv^\mu}{d\tau} \tag{569}$$

giving

$$v_\mu\,\frac{dv^\mu}{d\tau} = 0 \tag{570}$$

and

$$v_\mu F^\mu = v_\mu\, m_o\,\frac{dv^\mu}{d\tau} = 0 \tag{571}$$

Using $v^\mu = \gamma_v\,(c, \mathbf{v})$ and $F^\mu = \gamma_v\left(c\,\frac{dm}{dt}, \mathbf{F}\right)$ it follows that

$$v_\mu F^\mu = \gamma_v^2\left(-c^2\,\frac{dm}{dt} + \mathbf{v}\cdot\mathbf{F}\right) = 0 \tag{572}$$

Therefore

$$\mathbf{v}\cdot\mathbf{F} = c^2\,\frac{dm}{dt} \tag{573}$$

$\mathbf{v}\cdot\mathbf{F}$ is the rate of doing work and this is shown here to be proportional to the rate of change of inertial mass.

(work done) = (change in kinetic energy) = $\Delta T$

where T is the kinetic energy.

$$\begin{aligned}
\Delta T &= \int_1^2 (\mathbf{v}\cdot\mathbf{F})\,dt \\
&= c^2\int_1^2 dm \\
&= m_2 c^2 - m_1 c^2
\end{aligned} \tag{574}$$

This leads to the association of inertial mass and energy. It looks as if energy contributes to mass. Writing

$$T = mc^2 + \text{constant} \tag{575}$$

we can identify the constant by taking $T = 0$ and $m = m_o$ when $v = 0$. Then

$$T = mc^2 - m_o c^2 = (\gamma_v - 1)\,m_o c^2 \tag{576}$$

The total energy of the particle E is defined to be

$$E = mc^2 = m_o c^2 + T \tag{577}$$

i.e.

(total energy) = (rest energy) + (kinetic energy)

By expanding $\gamma_v$ we obtain

$$mc^2 = m_o c^2 + \tfrac{1}{2}\,m_o v^2 + O\!\left(\frac{1}{c^2}\right) \tag{578}$$

so that in the non-relativistic limit T agrees with the Newtonian definition. The relativistic kinetic energy $(\gamma_v - 1)\,m_o c^2$ is larger than the non-relativistic kinetic energy $m_o v^2/2$. See figure 10.
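The comparison of the two kinetic energies can be sketched numerically. This is only an illustration, assuming units with c = 1 and unit rest mass by default; the function names are hypothetical. It prints the ratio of relativistic to Newtonian kinetic energy for a few speeds.

```python
import math

def kinetic_relativistic(v, m0=1.0, c=1.0):
    """T = (gamma_v - 1) m_o c^2, eq.(576)."""
    g = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return (g - 1.0) * m0 * c**2

def kinetic_newtonian(v, m0=1.0):
    """T = m_o v^2 / 2."""
    return 0.5 * m0 * v**2

for beta in (0.01, 0.1, 0.5, 0.9, 0.99):
    ratio = kinetic_relativistic(beta) / kinetic_newtonian(beta)
    print(beta, ratio)
```

The ratio tends to 1 at low speed, in line with eq.(578), and grows without bound as v approaches c.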

Although we have only shown that energy contributes to inertial mass, Einstein went further and equated all mass with energy. This was verified later by e.g. 'pair annihilation' when an elementary particle and its anti-particle annihilate each other, converting their mass to radiative energy. Among the interesting consequences of the mass-energy relation are:

1. light has 'mass' and we can expect it to bend under gravity
2. the nuclear mass defect results in energy release by fusion or fission processes
3. we can expect a gravitational field itself to gravitate.

Potential energy does not contribute to mass. In Newtonian mechanics a particle moving in a field is often said to possess potential energy, so that the sum of its kinetic and potential energies remains constant. Energy conservation can also be satisfied by debiting the field with an energy loss equal to the energy gained by the particle. The real location of any part of the energy is no longer a convention since energy (like mass) gravitates.

[Figure 10 is a plot against v/c from 0 to 1, with the vertical axis running from 0 to 14.]

Figure 10: Ratio of relativistic and non-relativistic kinetic energy (energy in units of the rest energy).

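18.7 Energy-momentum relationship

We can establish a relationship between the total energy E and the 3-momentum $\mathbf{p}$. Using

$$\begin{aligned}
p^\mu &= m_o\, v^\mu \\
&= m_o\,\gamma_v\,(c, \mathbf{v}) \\
&= m\,(c, \mathbf{v}) \\
&= \left(\frac{E}{c},\ \mathbf{p}\right)
\end{aligned} \tag{579}$$

we have

$$p^\mu p_\mu = m_o^2\, v^\mu v_\mu = -m_o^2 c^2 \tag{580}$$

But

$$p^\mu p_\mu = -\frac{E^2}{c^2} + p^2 \tag{581}$$

Therefore

$$E^2 = p^2 c^2 + m_o^2 c^4 \tag{582}$$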
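A quick numerical check of eq.(582) is sketched below, assuming c = 1 by default; `energy_momentum` is an illustrative helper, not a function from the notes.

```python
import math

def energy_momentum(m0, v, c=1.0):
    """Return (E, p) for rest mass m0 and 3-velocity v, using p^mu = m_o gamma_v (c, v)."""
    speed2 = sum(x * x for x in v)
    g = 1.0 / math.sqrt(1.0 - speed2 / c**2)
    E = g * m0 * c**2
    p = tuple(g * m0 * vi for vi in v)
    return E, p

m0 = 2.0
E, p = energy_momentum(m0, (0.6, 0.0, 0.0))
p2 = sum(x * x for x in p)
print(E**2, p2 + m0**2)  # both sides of eq.(582) with c = 1
```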

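18.8 Transformation of 4-momentum

The transformation equations can be obtained by analogy with the transformation of 4-position. Since 4-momentum is a 4-vector it must transform in the same way as the 4-position. Therefore consider

$$x^\mu = (ct, \mathbf{r}) \quad\text{and}\quad p^\mu = \left(\frac{E}{c},\ \mathbf{p}\right) \tag{583}$$

4-position transforms as

$$\begin{aligned}
\bar{t} &= \gamma\left(t - \frac{\mathbf{u}\cdot\mathbf{r}}{c^2}\right) \\
\bar{\mathbf{r}} &= \mathbf{r} + \left[\frac{\gamma - 1}{u^2}\,(\mathbf{u}\cdot\mathbf{r}) - \gamma t\right]\mathbf{u}
\end{aligned} \tag{584}$$

Letting $t \to E/c^2$ and $\mathbf{r} \to \mathbf{p}$ gives

$$\begin{aligned}
\bar{E} &= \gamma\,(E - \mathbf{u}\cdot\mathbf{p}) \\
\bar{\mathbf{p}} &= \mathbf{p} + \left[\frac{\gamma - 1}{u^2}\,(\mathbf{u}\cdot\mathbf{p}) - \frac{\gamma}{c^2}\,E\right]\mathbf{u}
\end{aligned} \tag{585}$$

18.9 Conservation of 4-momentum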

In a collision between two particles with initial 4-momenta $p^\mu_{(1)}$, $p^\mu_{(2)}$ and final 4-momenta $p^\mu_{(3)}$, $p^\mu_{(4)}$ we have conservation of 4-momenta, i.e.

$$p^\mu_{(1)} + p^\mu_{(2)} = p^\mu_{(3)} + p^\mu_{(4)} \tag{586}$$

This is equivalent to

$$\begin{aligned}
\mathbf{p}_{(1)} + \mathbf{p}_{(2)} &= \mathbf{p}_{(3)} + \mathbf{p}_{(4)} \\
E_{(1)} + E_{(2)} &= E_{(3)} + E_{(4)}
\end{aligned} \tag{587}$$

i.e. conservation of 3-momenta and conservation of total energy.

If we assume that total energy is conserved then it follows that the 3-momenta must be conserved. This follows from the total energy being a component of a 4-vector. If a component of a 4-vector is zero in every frame then the 4-vector is identically zero also. This follows from the mixing of the components under a Lorentz transformation. Suppose the 0th component is always zero, then we can find a Lorentz transformation which will bring the 3rd component to the 0th position in some frame etc.

The conservation of 4-momentum can be extended to any number of interacting particles when there is no external force acting.

In solving collision problems you should

• carefully choose the frame of reference, i.e. the directions of motion before and after the collision
• apply the conservation of 4-momentum
• apply the energy-momentum relation
• apply Lorentz transformation to coordinate system if appropriate

In a collision between two particles the 'laboratory' frame is that in which one of the particles is initially at rest. The 'centre-of-mass' frame is that in which the total 3-momentum is zero.
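The change from the laboratory frame to the centre-of-mass frame can be sketched numerically. The code below assumes c = 1 by default and uses the transformation of eq.(585). The boost velocity is taken to be $\mathbf{u} = c^2\,\mathbf{P}/E$ for total 3-momentum $\mathbf{P}$ and total energy E; this is a standard result that is not derived in these notes, so treat it as an assumption of the sketch. All function names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def particle(m0, v, c=1.0):
    """Return (E, p) for rest mass m0 and 3-velocity v."""
    g = 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)
    return g * m0 * c**2, tuple(g * m0 * vi for vi in v)

def boost_four_momentum(E, p, u, c=1.0):
    """Transform (E, p) to a frame moving with velocity u, eq.(585)."""
    u2 = dot(u, u)
    if u2 == 0.0:
        return E, tuple(p)
    g = 1.0 / math.sqrt(1.0 - u2 / c**2)
    up = dot(u, p)
    coeff = (g - 1.0) / u2 * up - g * E / c**2
    return g * (E - up), tuple(pi + coeff * ui for pi, ui in zip(p, u))

def centre_of_mass_velocity(particles, c=1.0):
    """Assumed standard result: u = c^2 P / E for the whole system."""
    E_tot = sum(E for E, _ in particles)
    P_tot = [sum(p[i] for _, p in particles) for i in range(3)]
    return tuple(c**2 * Pi / E_tot for Pi in P_tot)

# Laboratory frame: particle 1 moving, particle 2 at rest.
lab = [particle(1.0, (0.8, 0.0, 0.0)), particle(2.0, (0.0, 0.0, 0.0))]
u = centre_of_mass_velocity(lab)
cm = [boost_four_momentum(E, p, u) for E, p in lab]
total_p = [sum(p[i] for _, p in cm) for i in range(3)]
print(u, total_p)  # total 3-momentum vanishes in the centre-of-mass frame
```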

18.10 Photons

According to quantum theory a photon has a dual personality, both wave and particle. The frequency $\nu$ and the wavelength $\lambda$ of the wave are related by

$$\lambda = \frac{c}{\nu} \tag{588}$$

If we treat the photon as a particle with a 4-momentum

$$p^\mu = \left(\frac{E}{c},\ \mathbf{p}\right) \tag{589}$$

then it has zero rest mass and the energy-momentum relation becomes

$$E = pc \tag{590}$$

The relation between the wave and particle properties is through Planck's constant h

$$E = h\nu \tag{591}$$

18.11 Potential energy

Some problems can be simplified by retaining the concept of potential energy. Suppose that in a particular frame the 3-force $\mathbf{F}$ can be written as

$$\mathbf{F} = -\nabla\Phi \tag{592}$$

where $\Phi$ is a scalar function. If this can be done then the force is called conservative. From

$$\frac{dE}{dt} = \mathbf{v}\cdot\mathbf{F} \tag{593}$$

integrate to get

$$\begin{aligned}
E &= \int (\mathbf{v}\cdot\mathbf{F})\,dt \\
&= -\int \nabla\Phi\cdot d\mathbf{r} \\
&= -\Phi + \text{constant}
\end{aligned} \tag{594}$$

Therefore $E + \Phi$ is constant. We can use this to solve the motion of a particle moving in a conservative field. Examples would be a particle moving in a central potential or the simple harmonic oscillator.

It is important to recognise that this is not an invariant concept. In electromagnetic theory the fields can be described by a 4-potential $A^\mu$ given by

$$A^\mu = (\Phi, \mathbf{A}) \tag{595}$$

If the inertial frame is such that $\mathbf{A} = \mathbf{0}$ then we can use potential energy. However once we transform to another frame we have $\mathbf{A}$ non-zero in general and the interaction cannot be represented so simply.

19 Collision examples

19.1 Elastic collision of two equal particles

Consider the elastic collision of two particles with the same rest mass. The problem is greatly simplified by having equal masses: the rest mass simply cancels out.

In the 'laboratory' frame S, particle 1 is moving with speed v and particle 2 is at rest. We choose the coordinate system such that particle 1 is initially moving along the x-axis. The two vectors, initial and final velocity of particle 1, will define the x-y plane. Because of conservation of 3-momentum, the final velocity of particle 2 is also confined to the x-y plane.

The 'centre-of-mass' frame $\bar{S}$ is such that the total 3-momentum is zero. Because of conservation of 3-momentum this is true both before and after the collision. The transformation from S to $\bar{S}$ is a boost u along the x-axis. In $\bar{S}$ the initial velocities of the two particles will be opposite in direction and equal in magnitude to u.

              S           $\bar{S}$
particle 1    v     →     u
particle 2    0     →     −u

[Figure 11 shows the collision before and after in both frames. Frame of reference S: before the collision particle 1 has speed v and particle 2 is at rest; after the collision particle 1 moves with velocity $v_1$ at angle $\theta_1$ and particle 2 with velocity $v_2$ at angle $\theta_2$. Frame of reference $\bar{S}$: both particles have speed u before and after the collision, and the final directions make an angle $\alpha$ with the x-axis.]

Figure 11: Collision between two equal particles

Non-relativistic treatment

Using

$$\bar{v}_x = v_x - u \tag{596}$$

for particle 1 gives

$$u = v - u \tag{597}$$

and the relationship between u and v is

$$u = \tfrac{1}{2}\,v \tag{598}$$

The collision is trivial to treat in frame $\bar{S}$. After the collision particle 1 makes an angle $\alpha$ with the x-axis and its velocity is $u\,(\cos\alpha, \sin\alpha)$. By conservation of momentum, particle 2 has velocity $-u\,(\cos\alpha, \sin\alpha)$. Now transform back to the frame S.

$$v_x = \bar{v}_x + u \quad\text{and}\quad v_y = \bar{v}_y \tag{599}$$

For particle 1

$$v_{1x} = u\,(1 + \cos\alpha) \quad\text{and}\quad v_{1y} = u\sin\alpha \tag{600}$$

For particle 2

$$v_{2x} = u\,(1 - \cos\alpha) \quad\text{and}\quad v_{2y} = -u\sin\alpha \tag{601}$$

The speeds of the particles are

$$v_1 = v\cos\frac{\alpha}{2} \quad\text{and}\quad v_2 = v\sin\frac{\alpha}{2} \tag{602}$$

If particle 1 makes an angle (anticlockwise) $\theta_1$ with the x-axis then

$$\tan\theta_1 = \frac{v_{1y}}{v_{1x}} = \frac{\sin\alpha}{1 + \cos\alpha} = \tan\frac{\alpha}{2} \tag{603}$$

Similarly if particle 2 makes an angle (clockwise) $\theta_2$ with the x-axis then

$$\tan\theta_2 = -\frac{v_{2y}}{v_{2x}} = \frac{\sin\alpha}{1 - \cos\alpha} = \cot\frac{\alpha}{2} \tag{604}$$

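It follows that $\theta_1 = \alpha/2$ and $\theta_1 + \theta_2 = \pi/2$.

Relativistic treatment

In relativistic mechanics units are frequently chosen such that c = 1. This makes the algebra a little easier. Using

$$\bar{v}_x = \frac{v_x - u}{1 - u v_x} \tag{605}$$

for particle 1 gives

$$u = \frac{v - u}{1 - uv} \tag{606}$$

We also know that

$$\gamma_{\bar v} = \gamma_v\,\gamma\,(1 - u v_x) \tag{607}$$

For particle 1 this gives

$$\gamma = \gamma_v\,\gamma\,(1 - uv) \quad\longrightarrow\quad \frac{1}{1 - uv} = \gamma_v \tag{608}$$

Combining this with eq.(606) gives

$$u = \gamma_v\,(v - u) \quad\longrightarrow\quad u = \frac{\gamma_v\, v}{\gamma_v + 1} \tag{609}$$

Consider

$$\frac{1}{\gamma^2} = 1 - u^2 \tag{610}$$

and substitute for u on rhs using eq.(609)

$$\gamma = \sqrt{\frac{1 + \gamma_v}{2}} \tag{611}$$

The inverse of eq.(605) is

$$v_x = \frac{\bar{v}_x + u}{1 + u\bar{v}_x} \tag{612}$$

For particle 1 this gives

$$v = \frac{2u}{1 + u^2} \tag{613}$$

which is the inverse of eq.(609).

Consider frame $\bar{S}$. After the collision particle 1 makes an angle $\alpha$ with the x-axis and its velocity is $u\,(\cos\alpha, \sin\alpha)$. By conservation of momentum, particle 2 has velocity $-u\,(\cos\alpha, \sin\alpha)$.

Now transform back to the frame S.

$$v_x = \frac{\bar{v}_x + u}{1 + u\bar{v}_x} \quad\text{and}\quad v_y = \frac{\bar{v}_y}{\gamma\,(1 + u\bar{v}_x)} \tag{614}$$

For particle 1

$$v_{1x} = \frac{u\,(1 + \cos\alpha)}{1 + u^2\cos\alpha} \quad\text{and}\quad v_{1y} = \frac{u\sin\alpha}{\gamma\,(1 + u^2\cos\alpha)} \tag{615}$$

For particle 2

$$v_{2x} = \frac{u\,(1 - \cos\alpha)}{1 - u^2\cos\alpha} \quad\text{and}\quad v_{2y} = -\frac{u\sin\alpha}{\gamma\,(1 - u^2\cos\alpha)} \tag{616}$$

The speeds of the particles are

$$v_1 = 2u\cos\frac{\alpha}{2}\ \frac{\sqrt{1 - u^2\sin^2\frac{\alpha}{2}}}{1 + u^2\cos\alpha} \tag{617}$$

and

$$v_2 = 2u\sin\frac{\alpha}{2}\ \frac{\sqrt{1 - u^2\cos^2\frac{\alpha}{2}}}{1 - u^2\cos\alpha} \tag{618}$$

If particle 1 makes an angle (anticlockwise) $\theta_1$ with the x-axis then

$$\tan\theta_1 = \frac{v_{1y}}{v_{1x}} = \frac{1}{\gamma}\,\frac{\sin\alpha}{1 + \cos\alpha} = \frac{1}{\gamma}\tan\frac{\alpha}{2} \tag{619}$$

Similarly if particle 2 makes an angle (clockwise) $\theta_2$ with the x-axis then

$$\tan\theta_2 = -\frac{v_{2y}}{v_{2x}} = \frac{1}{\gamma}\,\frac{\sin\alpha}{1 - \cos\alpha} = \frac{1}{\gamma}\cot\frac{\alpha}{2} \tag{620}$$

From these results we have

$$\tan\theta_1\tan\theta_2 = \frac{1}{\gamma^2} < 1 \tag{621}$$

It follows that $\theta_1 + \theta_2 < \pi/2$.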
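The relativistic equal-mass results can be checked numerically. The sketch below works in units with c = 1 and unit rest mass; `equal_mass_collision` and `lorentz_factor` are illustrative names. It builds the final laboratory velocities from eq.(609), eq.(615) and eq.(616), then confirms eq.(621) together with conservation of energy and 3-momentum in S.

```python
import math

def lorentz_factor(speed):
    return 1.0 / math.sqrt(1.0 - speed * speed)

def equal_mass_collision(v, alpha):
    """Return (u, v1, v2): boost speed and final lab velocities (c = 1)."""
    gv = lorentz_factor(v)
    u = gv * v / (gv + 1.0)          # eq.(609)
    g = lorentz_factor(u)
    ca, sa = math.cos(alpha), math.sin(alpha)
    v1 = (u * (1.0 + ca) / (1.0 + u * u * ca), u * sa / (g * (1.0 + u * u * ca)))
    v2 = (u * (1.0 - ca) / (1.0 - u * u * ca), -u * sa / (g * (1.0 - u * u * ca)))
    return u, v1, v2

v, alpha = 0.8, 1.0
u, v1, v2 = equal_mass_collision(v, alpha)
tan1 = v1[1] / v1[0]
tan2 = -v2[1] / v2[0]
g1 = lorentz_factor(math.hypot(*v1))
g2 = lorentz_factor(math.hypot(*v2))
print(tan1 * tan2, 1.0 - u * u)             # eq.(621): product equals 1/gamma^2
print(g1 + g2, lorentz_factor(v) + 1.0)     # total energy before and after
print(g1 * v1[0] + g2 * v2[0], lorentz_factor(v) * v)  # x-momentum
```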

19.2 Elastic collision of two unequal particles

Consider the elastic collision of two particles with rest masses $m_1$ and $m_2$. Let $M = m_2/m_1$.

In the 'laboratory' frame S, particle 1 is moving with speed v and particle 2 is at rest. We choose the coordinate system such that particle 1 is initially moving along the x-axis. The two vectors, initial and final velocity of particle 1, will define the x-y plane. Because of conservation of 3-momentum, the final velocity of particle 2 is also confined to the x-y plane.

The 'centre-of-mass' frame $\bar{S}$ is such that the total 3-momentum is zero. Because of conservation of 3-momentum this is true both before and after the collision. The transformation from S to $\bar{S}$ is a boost u along the x-axis. In $\bar{S}$ the initial velocities of the two particles will be opposite in direction and have magnitudes $u_1$ and u.

              S           $\bar{S}$
particle 1    v     →     $u_1$
particle 2    0     →     −u

Non-relativistic treatment

In the centre of mass frame we have

$$m_1 u_1 = m_2 u \quad\longrightarrow\quad u_1 = M u \tag{622}$$

Using

$$\bar{v}_x = v_x - u \tag{623}$$

for particle 1 gives

$$u_1 = v - u \tag{624}$$

and the relationship between u and v is

$$u = \frac{1}{1 + M}\,v \tag{625}$$

Consider frame $\bar{S}$. After the collision particle 1 makes an angle $\alpha$ with the x-axis and its velocity is $u_1\,(\cos\alpha, \sin\alpha)$. By conservation of momentum, particle 2 has velocity $-u\,(\cos\alpha, \sin\alpha)$. Now transform back to the frame S.

$$v_x = \bar{v}_x + u \quad\text{and}\quad v_y = \bar{v}_y \tag{626}$$

For particle 1

$$v_{1x} = u\,(1 + M\cos\alpha) \quad\text{and}\quad v_{1y} = u\,M\sin\alpha \tag{627}$$

For particle 2

$$v_{2x} = u\,(1 - \cos\alpha) \quad\text{and}\quad v_{2y} = -u\sin\alpha \tag{628}$$

The speeds of the particles are

$$v_1 = \frac{\sqrt{1 + M^2 + 2M\cos\alpha}}{1 + M}\,v \quad\text{and}\quad v_2 = \frac{2}{1 + M}\,v\sin\frac{\alpha}{2} \tag{629}$$

If particle 1 makes an angle (anticlockwise) $\theta_1$ with the x-axis then

$$\tan\theta_1 = \frac{v_{1y}}{v_{1x}} = \frac{M\sin\alpha}{1 + M\cos\alpha} \tag{630}$$

Similarly if particle 2 makes an angle (clockwise) $\theta_2$ with the x-axis then

$$\tan\theta_2 = -\frac{v_{2y}}{v_{2x}} = \frac{\sin\alpha}{1 - \cos\alpha} = \cot\frac{\alpha}{2} \tag{631}$$
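The non-relativistic formulas for unequal masses can be verified against Newtonian conservation of momentum and kinetic energy. The following is a minimal sketch with illustrative names (`nonrel_collision`, and `m1` set to 1 by default) that uses eq.(625), eq.(627) and eq.(628).

```python
import math

def nonrel_collision(v, M, alpha):
    """Final lab velocities (v1, v2) for mass ratio M = m2/m1, non-relativistic."""
    u = v / (1.0 + M)                                  # eq.(625)
    ca, sa = math.cos(alpha), math.sin(alpha)
    v1 = (u * (1.0 + M * ca), u * M * sa)              # eq.(627)
    v2 = (u * (1.0 - ca), -u * sa)                     # eq.(628)
    return v1, v2

v, M, alpha, m1 = 0.8, 2.0, 1.0, 1.0
m2 = M * m1
v1, v2 = nonrel_collision(v, M, alpha)
px = m1 * v1[0] + m2 * v2[0]
py = m1 * v1[1] + m2 * v2[1]
ke = 0.5 * m1 * (v1[0]**2 + v1[1]**2) + 0.5 * m2 * (v2[0]**2 + v2[1]**2)
print(px, py, ke)  # compare with m1*v, 0 and m1*v**2/2
```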

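Relativistic treatment

Again we choose c = 1. In the centre of mass frame we have

$$\gamma_1 m_1 u_1 = \gamma\, m_2 u \quad\longrightarrow\quad \gamma_1 u_1 = M\gamma\, u \tag{632}$$

Using

$$\bar{v}_x = \frac{v_x - u}{1 - u v_x} \tag{633}$$

for particle 1 gives

$$u_1 = \frac{v - u}{1 - uv} \tag{634}$$

We also know that

$$\gamma_{\bar v} = \gamma_v\,\gamma\,(1 - u v_x) \tag{635}$$

For particle 1 this gives

$$\gamma_1 = \gamma_v\,\gamma\,(1 - uv) \quad\longrightarrow\quad \frac{1}{1 - uv} = \frac{\gamma_v\,\gamma}{\gamma_1} \tag{636}$$

Combining this with eq.(634) gives

$$u_1 = \frac{\gamma_v\,\gamma}{\gamma_1}\,(v - u) \quad\longrightarrow\quad u = \frac{\gamma_v\, v}{\gamma_v + M} \tag{637}$$

Consider

$$\frac{1}{\gamma^2} = 1 - u^2 \tag{638}$$

and substitute for u on rhs using eq.(637)

$$\gamma = \frac{M + \gamma_v}{\sqrt{M^2 + 2M\gamma_v + 1}} \tag{639}$$

Using eq.(634) and eq.(637) gives

$$u_1 = \frac{M\gamma_v\, v}{M\gamma_v + 1} \tag{640}$$

Consider frame $\bar{S}$. After the collision particle 1 makes an angle $\alpha$ with the x-axis and its velocity is $u_1\,(\cos\alpha, \sin\alpha)$. By conservation of momentum, particle 2 has velocity $-u\,(\cos\alpha, \sin\alpha)$. Now transform back to the frame S.

$$v_x = \frac{\bar{v}_x + u}{1 + u\bar{v}_x} \quad\text{and}\quad v_y = \frac{\bar{v}_y}{\gamma\,(1 + u\bar{v}_x)} \tag{641}$$

For particle 1

$$v_{1x} = \frac{u + u_1\cos\alpha}{1 + u\,u_1\cos\alpha} \quad\text{and}\quad v_{1y} = \frac{u_1\sin\alpha}{\gamma\,(1 + u\,u_1\cos\alpha)} \tag{642}$$

For particle 2

$$v_{2x} = \frac{u\,(1 - \cos\alpha)}{1 - u^2\cos\alpha} \quad\text{and}\quad v_{2y} = -\frac{u\sin\alpha}{\gamma\,(1 - u^2\cos\alpha)} \tag{643}$$

If particle 1 makes an angle (anticlockwise) $\theta_1$ with the x-axis then

$$\tan\theta_1 = \frac{v_{1y}}{v_{1x}} = \frac{1}{\gamma}\,\frac{u_1\sin\alpha}{u + u_1\cos\alpha} = \frac{1}{\gamma}\,\frac{\sin\alpha}{u/u_1 + \cos\alpha} \tag{644}$$

Similarly if particle 2 makes an angle (clockwise) $\theta_2$ with the x-axis then

$$\tan\theta_2 = -\frac{v_{2y}}{v_{2x}} = \frac{1}{\gamma}\,\frac{\sin\alpha}{1 - \cos\alpha} = \frac{1}{\gamma}\cot\frac{\alpha}{2} \tag{645}$$

The ratio $u/u_1$ is

$$u/u_1 = \frac{M\gamma_v + 1}{M\gamma_v + M^2} \tag{646}$$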
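The relativistic formulas for unequal masses can be checked in the same way. The sketch below uses c = 1 and $m_1 = 1$ (so $m_2 = M$); `unequal_mass_collision` and `lorentz_factor` are illustrative names. It computes u and $u_1$ from eq.(637) and eq.(640), the final laboratory velocities from eq.(642) and eq.(643), and then tests the centre-of-mass condition eq.(632), the ratio eq.(646), and conservation of energy and momentum in S.

```python
import math

def lorentz_factor(speed):
    return 1.0 / math.sqrt(1.0 - speed * speed)

def unequal_mass_collision(v, M, alpha):
    """Return (u, u1, v1, v2) for mass ratio M = m2/m1, with c = 1."""
    gv = lorentz_factor(v)
    u = gv * v / (gv + M)                    # eq.(637)
    u1 = M * gv * v / (M * gv + 1.0)         # eq.(640)
    g = lorentz_factor(u)
    ca, sa = math.cos(alpha), math.sin(alpha)
    d1 = 1.0 + u * u1 * ca
    d2 = 1.0 - u * u * ca
    v1 = ((u + u1 * ca) / d1, u1 * sa / (g * d1))    # eq.(642)
    v2 = (u * (1.0 - ca) / d2, -u * sa / (g * d2))   # eq.(643)
    return u, u1, v1, v2

v, M, alpha = 0.8, 2.0, 1.0
u, u1, v1, v2 = unequal_mass_collision(v, M, alpha)
gv = lorentz_factor(v)
g1 = lorentz_factor(math.hypot(*v1))
g2 = lorentz_factor(math.hypot(*v2))
print(lorentz_factor(u1) * u1, M * lorentz_factor(u) * u)  # eq.(632)
print(g1 + M * g2, gv + M)                                 # total energy
print(g1 * v1[0] + M * g2 * v2[0], gv * v)                 # x-momentum
```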

19.3 Results

Results are illustrated in figures 12–15. The collision is parameterised by the three parameters: v (initial speed of particle 1), M ($= m_2/m_1$, ratio of rest masses) and $\alpha$ (scattering angle in centre of mass frame).

• For equal masses the results are symmetric about $\alpha = \pi/2$.
• The relativistic scattering angles are less than the corresponding non-relativistic ones, $\theta_1 + \theta_2 < \pi/2$ for equal masses. The effect of relativity is to close up the angle between the final directions. This effect increases with v.
• The final relativistic speeds are higher than the corresponding non-relativistic ones. For a large v the relativistic kinetic energy becomes much larger than the non-relativistic kinetic energy. This energy is shared between the two particles after the collision. The relationship between the kinetic energy and speed explains the effect for speeds.
• Notice how the speed of particle 2 increases rapidly with $\alpha$ for the case v/c = .99.
• The symmetry about $\alpha = \pi/2$ is broken for the unequal particles.
• The lighter particle 1 can be back-scattered, i.e. scattering angle $\theta_1 > \pi/2$.

[Figure 12 has three panels for M = 1, v/c = .8: scattering angles, final speeds, and kinetic energy (in units of rest energy of particle 1), each plotted against alpha from 0 to 180 degrees, with curves labelled 1 and 2 for the two particles.]

Figure 12: Collision between equal masses, v/c = .8. Dashed curves are non-relativistic, solid curves are relativistic. Quantities are plotted as a function of α in the range 0 to π.

[Figure 13 has three panels for M = 1, v/c = .99: scattering angles, final speeds, and kinetic energy (in units of rest energy of particle 1), each plotted against alpha from 0 to 180 degrees, with curves labelled 1 and 2 for the two particles.]

Figure 13: Collision between equal masses, v/c = .99. Dashed curves are non-relativistic, solid curves are relativistic. Quantities are plotted as a function of α in the range 0 to π.

[Figure 14 has three panels for M = 2, v/c = .8: scattering angles, final speeds, and kinetic energy (in units of rest energy of particle 1), each plotted against alpha from 0 to 180 degrees, with curves labelled 1 and 2 for the two particles.]

Figure 14: Collision between unequal masses ($m_2/m_1 = 2$), v/c = .8. Dashed curves are non-relativistic, solid curves are relativistic. Quantities are plotted as a function of α in the range 0 to π.

[Figure 15 has three panels for M = 2, v/c = .99: scattering angles, final speeds, and kinetic energy (in units of rest energy of particle 1), each plotted against alpha from 0 to 180 degrees, with curves labelled 1 and 2 for the two particles.]

Figure 15: Collision between unequal masses ($m_2/m_1 = 2$), v/c = .99. Dashed curves are non-relativistic, solid curves are relativistic. Quantities are plotted as a function of α in the range 0 to π.

20 Energy-momentum tensor

20.1 Volume elements

Let $d\omega_o$ be the volume of a small element of mass as measured in an inertial frame $S_o$ relative to which the mass is instantaneously at rest. $S_o$ is the rest frame. The total mass within the element is $\rho_o\, d\omega_o$ where $\rho_o$ is the proper mass density as measured in $S_o$. For another observer moving with speed v relative to $S_o$ the volume is

$$d\omega = \frac{d\omega_o}{\gamma_v} \tag{647}$$

due to Lorentz contraction in the direction of motion.

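20.2 4-force density

The 4-force exerted on the element is

$$F^\mu = \gamma_v\left(\tfrac{1}{c}\,\mathbf{v}\cdot\mathbf{F},\ \mathbf{F}\right) \tag{648}$$

The 4-force density is

$$\begin{aligned}
D^\mu &= F^\mu / d\omega_o \\
&= \gamma_v\left(\tfrac{1}{c}\,\mathbf{v}\cdot\mathbf{F},\ \mathbf{F}\right) / d\omega_o \\
&= \left(\tfrac{1}{c}\,\mathbf{v}\cdot\mathbf{F},\ \mathbf{F}\right) / d\omega \\
&= \left(\tfrac{1}{c}\,\mathbf{v}\cdot\mathbf{D},\ \mathbf{D}\right)
\end{aligned} \tag{649}$$

where $\mathbf{D} = \mathbf{F}/d\omega$ is the 3-force density.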

20.3 General case

Suppose there is a continuous distribution of mass over a region of space. For example, the distribution could be the molecules of an elastic body. Then there will be contributions from

1. inertia of the particles
2. mass of the potential energy of the stress (which arises from the electrostatic interaction of the particles)
3. possible further electromagnetic effects
4. heat energy due to the random motion of the particles

In an inertial frame S let $\rho'$, $\rho''$, etc be the densities of inertial mass at a point due to the various contributions and $\mathbf{v}'$, $\mathbf{v}''$, etc the corresponding velocities, then

$$\mathbf{P} = \rho'\mathbf{v}' + \rho''\mathbf{v}'' + \cdots = \rho\,\mathbf{v} \tag{650}$$

where $\mathbf{P}$ is the net density of linear momentum, $\rho$ is the mean density of mass flow and $\mathbf{v}$ is the mean velocity of mass flow. $\mathbf{P}$ is also the current density for the mass flow so that if a volume $\omega$ is enclosed by surface $\sigma$ then

$$\mathbf{P}\cdot\mathbf{n}\, d\sigma \tag{651}$$

is the rate of flow of mass out through the element of surface $d\sigma$. The outwardly pointing normal is $\mathbf{n}$. Let $\mathbf{P}^{(i)}$ be the current density for the flow of the $x_i$-component of momentum and

$$P_{ij} = P^{(i)}_j \tag{652}$$

is the jth component of this current density. Indices i, j etc. refer to the space-components. Conservation of mass gives

$$\nabla\cdot\mathbf{P} + \partial_t\rho = \frac{1}{c^2}\,(\mathbf{v}\cdot\mathbf{D}) \tag{653}$$

Conservation of momentum gives

$$\partial_j P_{ij} + \partial_t P_i = D_i \tag{654}$$

where $\mathbf{D}$ is an external 3-force density. These equations can be written as

$$\partial_\nu T^{\mu\nu} = D^\mu \tag{655}$$

where the energy-momentum tensor is

$$T^{\mu\nu} = \begin{pmatrix} c^2\rho & c\,\mathbf{P}^T \\ c\,\mathbf{P} & P_{ij} \end{pmatrix} \tag{656}$$

If $T^{\mu\nu}$ is symmetric then we must have $P_{ij}$ symmetric also.

20.4 External force

If the external force can be expressed as

$$D^\mu = -\partial_\nu S^{\mu\nu} \tag{657}$$

then we can write eq.(655) as

$$\partial_\nu\,(T^{\mu\nu} + S^{\mu\nu}) = 0 \tag{658}$$

We can now treat the whole system, including the external force, as being isolated and such that the total energy-momentum has zero 4-divergence. The external force contributes to the energy-momentum. It has

$$\rho = \frac{1}{c^2}\,S^{00} \quad\text{and}\quad P_i = \frac{1}{c}\,S^{i0} \tag{659}$$

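20.5 Incoherent cloud

For a cloud of non-interacting particles (no stress field) we have

$$P_i = \rho\, v_i \quad\text{and}\quad P_{ij} = \rho\, v_i v_j \tag{660}$$

giving

$$T^{\mu\nu} = \rho\begin{pmatrix} c^2 & c\,\mathbf{v}^T \\ c\,\mathbf{v} & v_i v_j \end{pmatrix} \tag{661}$$

Since $v^\mu = \gamma_v\,(c, \mathbf{v})$ we have

$$T^{\mu\nu} = \rho_o\, v^\mu v^\nu \tag{662}$$

where

$$\rho = \gamma_v^2\,\rho_o \tag{663}$$

$\rho_o$ is the proper density of proper mass since

$$\rho = \frac{m}{d\omega} = \frac{\gamma_v\, m_o}{d\omega_o/\gamma_v} = \gamma_v^2\,\frac{m_o}{d\omega_o} = \gamma_v^2\,\rho_o \tag{664}$$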

20.6 Fluids

If $\tau_{ij}$ is the stress tensor then the force density for the stress field is $-\partial_j \tau_{ij}$. Applying conservation of mass and momentum gives

$$P_i = \rho\, v_i + \frac{1}{c^2}\,v_k \tau_{ki} \tag{665}$$

and

$$\begin{aligned}
P_{ij} &= P_i v_j + \tau_{ij} \\
&= \rho\, v_i v_j + \frac{1}{c^2}\,v_k \tau_{ki}\, v_j + \tau_{ij}
\end{aligned} \tag{666}$$

In general if $T^{\mu\nu}$ is symmetric then $\tau_{ij}$ is not. This contrasts with the classical theory. In the classical limit $\tau_{ij}$ is symmetric.

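20.7 Perfect fluids

In a perfect fluid there is no shearing stress in the rest frame of the fluid. Therefore in $S^o$ we have

$$\tau^o_{ij} = p\, \delta_{ij} \tag{667}$$

where p is the pressure. It follows that the force density for the stress field is $\mathbf{D}^o = -\nabla p$ and the force acts from high to low pressure. Therefore

$$T^{o\,\mu\nu} = \begin{pmatrix} c^2 \rho_o & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix} \tag{668}$$

In any frame

$$T^{\mu\nu} = \left(\rho_o + \frac{p}{c^2}\right) v^\mu v^\nu + p\, \eta^{\mu\nu} \tag{669}$$

Therefore using eq.(656) and eq.(669)

$$T^{00} = c^2 \rho = c^2 \gamma_v^2 \left(\rho_o + \frac{p}{c^2}\right) - p \tag{670}$$

giving

$$\rho + \frac{p}{c^2} = \gamma_v^2 \left(\rho_o + \frac{p}{c^2}\right) \tag{671}$$

Also

$$T^{i0} = c\, P_i = c\, \gamma_v^2 \left(\rho_o + \frac{p}{c^2}\right) v_i \tag{672}$$

giving

$$P_i = \gamma_v^2 \left(\rho_o + \frac{p}{c^2}\right) v_i = \left(\rho + \frac{p}{c^2}\right) v_i \tag{673}$$

Comparing this with eq.(665) gives

$$p\, v_i = v_k \tau_{ki} \quad\longrightarrow\quad \tau_{ki} = p\, \delta_{ki} \tag{674}$$

Therefore there is no shearing stress in any frame and the pressure is invariant. Also the stress tensor is symmetric. Also

$$P_{ij} = \left(\rho + \frac{p}{c^2}\right) v_i v_j + p\, \delta_{ij} \tag{675}$$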

Taking p = 0 gives the incoherent cloud.
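The step from the rest-frame tensor eq.(668) to the covariant form eq.(669) can be checked numerically. The following is only a sketch under stated assumptions: units with c = 1, a fluid moving along the x-axis, and illustrative values for the proper density and pressure. The helper names `boost_x` and `transform` are hypothetical and are not part of the notes.

```python
import math

def boost_x(beta):
    """Boost matrix along x for beta = u/c, acting on components (w, x, y, z)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0.0, 0.0],
            [-g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform(L, T):
    """Rank-2 contravariant rule: T'[m][n] = L[m][a] L[n][b] T[a][b]."""
    return [[sum(L[m][a] * L[n][b] * T[a][b] for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

c = 1.0               # assumption: units with c = 1
rho_o, p = 2.0, 0.3   # illustrative proper density and pressure
v = 0.6               # illustrative fluid speed along +x in frame S

# Rest-frame tensor, eq.(668)
T_rest = [[c * c * rho_o, 0.0, 0.0, 0.0],
          [0.0, p, 0.0, 0.0],
          [0.0, 0.0, p, 0.0],
          [0.0, 0.0, 0.0, p]]

# Frame S moves with velocity -v relative to the rest frame, hence beta = -v/c
T_boosted = transform(boost_x(-v / c), T_rest)

# Covariant expression, eq.(669), with v^mu = gamma (c, v, 0, 0)
gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
v4 = [gamma * c, gamma * v, 0.0, 0.0]
eta = [[0.0] * 4 for _ in range(4)]
eta[0][0] = -1.0
eta[1][1] = eta[2][2] = eta[3][3] = 1.0
T_formula = [[(rho_o + p / c**2) * v4[m] * v4[n] + p * eta[m][n]
              for n in range(4)] for m in range(4)]

max_diff = max(abs(T_boosted[m][n] - T_formula[m][n])
               for m in range(4) for n in range(4))

rho = T_boosted[0][0] / c**2   # mass density read off as in eq.(670)
P_x = T_boosted[1][0] / c      # mass current density read off as in eq.(672)

print("largest difference between boosted and covariant forms:", max_diff)
print("eq.(671):", rho + p / c**2, "vs", gamma**2 * (rho_o + p / c**2))
print("eq.(673):", P_x, "vs", (rho + p / c**2) * v)
```

Setting `p = 0.0` in the same script reproduces the incoherent cloud of eq.(662).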


21 Electromagnetism

21.1 Maxwell equations

Introduction

Maxwell unified electricity and magnetism through equations that were published in 1865. These equations are the culmination of a century of experimental work into electric and magnetic phenomena. The theoretical expression of the results in terms of vector calculus is a key feature of Maxwell’s contribution. Using these equations Maxwell predicted that electromagnetic waves travel through space at the same speed as light. He asserted that light was an electromagnetic phenomenon, bringing together electromagnetism and optics. In addition the equations allow waves with wavelengths different from those of visible light. Much of the physics following Maxwell can be viewed as an exploration of this spectrum.

The Maxwell equations are not invariant under Galilean transformations. They are invariant under Lorentz transformations, as discovered by Lorentz before Einstein published his special theory of relativity. Our purpose is to make this invariance explicit by writing the equations in a 4-tensor form.

The simplest result is the Coulomb law which gives the force between two electric charges $q_1$ and $q_2$ at rest i.e.

$$\mathbf{F}_{12} = \frac{q_1 q_2}{r^3}\, \mathbf{r} \tag{676}$$

in gaussian units, where $\mathbf{F}_{12}$ is the force on $q_1$ due to $q_2$ and $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ is the position of $q_1$ relative to $q_2$. If you regard $q_1$ as a test charge then as it is moved from place to place it will experience different forces. The field theory viewpoint is to say that the presence of $q_2$ produces a force-field throughout space called the electric field. The test charge detects this electric field as it is moved. The definition of the electric field $\mathbf{E}$ due to $q_2$ is

$$\mathbf{E}(\mathbf{r}_1) = \frac{1}{q_1}\, \mathbf{F}_{12} = \frac{q_2}{r^3}\, \mathbf{r} \tag{677}$$

The charge $q_2$ is said to be the source of the electric field.

The Biot and Savart law concerns the force between electric currents, a current being charge in motion. A field theory viewpoint sees current as the source of a force-field called the magnetic field$^{27}$ $\mathbf{B}$ and this field can be detected by a test current moved throughout space.

The Maxwell equations govern (classical) electromagnetic phenomena and relate the electromagnetic fields $\mathbf{E}$ and $\mathbf{B}$ in a vacuum (no media). Sources are represented by $\mathbf{j}$, the current density, and ρ, the charge density. These are the microscopic equations. The macroscopic equations imply a medium other than the vacuum. They can be obtained from the microscopic equations by statistical averaging.

27 More correctly called the magnetic induction.

Choice of units

Modern texts on electromagnetism use SI units, while older texts on relativity and quantum theory use gaussian units. We choose the latter.

The equations

Coulomb law:

$$\nabla \cdot \mathbf{E} = 4\pi \rho \tag{678}$$

The charge density is the source of the electric field. By integrating over a volume and using the Gauss theorem you obtain the Gauss law: the total flux of the electric field out of a closed surface is proportional to the total charge enclosed. This result is directly related to the inverse square Coulomb law for the force between two charges. The non-zero divergence of the electric field from a point in space reflects a non-zero charge density at that point. This is consistent with the electric field lines being radial from point charges.

Analogous to the Coulomb law is:

$$\nabla \cdot \mathbf{B} = 0 \tag{679}$$

In this case there is no magnetic charge density. Permanent magnets always have two poles. When a magnet is divided you obtain two magnets, each with two poles. It is not possible to isolate a magnetic pole. A vector field with zero divergence is said to be solenoidal, a name associated with magnetism.

Faraday law:

$$\nabla \times \mathbf{E} + \partial_w \mathbf{B} = 0 \tag{680}$$

where $w = ct$ and $\partial_w = \partial/\partial w$. A time-varying magnetic field produces an electric field.

Ampère law:

$$\nabla \times \mathbf{B} - \partial_w \mathbf{E} = \frac{4\pi}{c}\, \mathbf{j} \tag{681}$$

This is the differential form of the Biot and Savart law for the force between two electric currents. The current density acts as a source for the magnetic field. The term involving the time-varying electric field was added by Maxwell to ensure that the equations are consistent with the continuity equation. We can view the time-varying electric field as a source for the magnetic field; in fact this term is known as the displacement current. This effect was not at first detected because the electric field must be rapidly varying.

There is a remarkable symmetry in the Maxwell equations, a symmetry partly broken by the absence of magnetic charge. This has been the subject of much speculation in theoretical physics. There are two vector equations and two scalar equations. Two equations are source-free while the other two depend on the source through $\mathbf{j}$ and ρ. To apply these equations to practical problems it is necessary to specify associated boundary conditions. We will not treat this.

21.2 Continuity equation

Implicit in the Maxwell equations is the continuity equation for the source.

$$\partial_t \rho + \nabla \cdot \mathbf{j} = 0 \tag{682}$$

This equation expresses conservation of charge. To see this consider a volume Ω and integrate the equation over this volume.

$$\begin{aligned}
\int_\Omega \partial_t \rho \; d\Omega &= -\int_\Omega \nabla \cdot \mathbf{j} \; d\Omega \\
\partial_t \int_\Omega \rho \; d\Omega &= -\int_\Sigma \mathbf{j} \cdot \mathbf{n} \; d\Sigma
\end{aligned} \tag{683}$$

where Σ is the surface enclosing Ω and $\mathbf{n}$ is an outward pointing normal. We can interpret this equation as saying that the rate of increase of the charge in Ω is equal to the total flux of current into the volume.

21.3 Lorentz force

The Lorentz force equation gives the force $\mathbf{F}$ acting on a point charge q moving with velocity $\mathbf{v}$ in the presence of electromagnetic fields.

$$\mathbf{F} = q \left( \mathbf{E} + \frac{\mathbf{v}}{c} \times \mathbf{B} \right) \tag{684}$$

By measuring the motion of a charged particle we can determine the electromagnetic fields.

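21.4 Scalar and vector potentials

In electromagnetic theory $\mathbf{E}$ and $\mathbf{B}$ can be expressed in terms of a vector potential $\mathbf{A}$ and a scalar potential Φ.

$$\nabla \cdot \mathbf{B} = 0 \quad\longrightarrow\quad \mathbf{B} = \nabla \times \mathbf{A} \tag{685}$$

and therefore

$$\nabla \times \mathbf{E} + \partial_w \mathbf{B} = 0 \quad\longrightarrow\quad \mathbf{E} = -\nabla \Phi - \partial_w \mathbf{A} \tag{686}$$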

Using the potentials simplifies electromagnetism because the six components of E and B are replaced by four quantities in the potentials. However it is the electromagnetic fields that are measurable through eq.(684). B is solenoidal since its divergence is zero. E + ∂w A is irrotational since its curl is zero.


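21.5 Gauge transformations

The electromagnetic fields are invariant under the gauge transformation:

$$\left. \begin{array}{c} \Phi \\ \mathbf{A} \end{array} \right\} \longrightarrow \left\{ \begin{array}{l} \Phi - \partial_w \chi \\ \mathbf{A} + \nabla \chi \end{array} \right. \tag{687}$$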

where χ is an arbitrary function of position and time. Thus the potentials are not uniquely specified.
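The invariance claimed for eq.(687) can be verified directly from eq.(685–686). The short check below is a sketch which assumes that χ is smooth enough for mixed partial derivatives to commute; the primes, which label the fields computed from the transformed potentials, are notation introduced here for illustration.

```latex
\begin{align*}
\mathbf{B}' &= \nabla \times (\mathbf{A} + \nabla\chi)
             = \nabla \times \mathbf{A} + \nabla \times \nabla\chi
             = \mathbf{B}, \\
\mathbf{E}' &= -\nabla(\Phi - \partial_w \chi) - \partial_w(\mathbf{A} + \nabla\chi) \\
            &= -\nabla\Phi - \partial_w \mathbf{A}
               + \left( \nabla \partial_w \chi - \partial_w \nabla \chi \right)
             = \mathbf{E}.
\end{align*}
```

The first line uses the fact that the curl of a gradient vanishes, and the second uses the commutation of $\nabla$ and $\partial_w$.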

21.6 Lorentz gauge and wave-equations

We can choose χ such that the potentials satisfy the Lorentz gauge condition

$$\partial_w \Phi + \nabla \cdot \mathbf{A} = 0 \tag{688}$$

This gives the Lorentz gauge. The advantage of the Lorentz gauge is that it allows the relativistic invariance of the equations to appear clearly. In the Lorentz gauge the equations satisfied by the potentials become:

$$\begin{aligned}
\left( \partial_w^2 - \nabla^2 \right) \Phi &= 4\pi \rho \\
\left( \partial_w^2 - \nabla^2 \right) \mathbf{A} &= \frac{4\pi}{c}\, \mathbf{j}
\end{aligned} \tag{689}$$

where $\partial_w^2 = \partial^2/\partial w^2$. Therefore the potentials satisfy inhomogeneous wave-equations and are coupled only by the Lorentz gauge. The waves propagate with the speed of light. It is possible to obtain wave-equations for the electric and magnetic fields directly from the Maxwell equations. These equations for the potentials are equivalent.

These wave-equations plus the Lorentz gauge condition are equivalent to the original Maxwell equations. The symmetry of these equations is immediately suggestive of how they can be written in 4-tensor form.

21.7 Coulomb gauge

The other gauge commonly used is the Coulomb gauge i.e.

$$\nabla \cdot \mathbf{A} = 0 \tag{690}$$

A is then solenoidal. This is also known as the radiation or transverse gauge. The reason is that it gives rise to the transverse electric and magnetic fields involved in radiation problems.


21.8 Exercises

• Use the Maxwell equations to show that the continuity equation holds.

• If Φ and $\mathbf{A}$ are potentials consider the gauge transformation to new potentials Φ′ and $\mathbf{A}'$. Find the equation satisfied by χ such that the new potentials satisfy the Lorentz gauge condition.

• If the potentials Φ and $\mathbf{A}$ satisfy the Lorentz gauge condition, show using the Maxwell equations that the potentials satisfy inhomogeneous wave-equations.

• Find the equations satisfied by the potentials Φ and $\mathbf{A}$ in the Coulomb gauge.

22 Tensor formulation of electromagnetism

Prior to Einstein’s work Lorentz had shown that the Maxwell equations are formally invariant under a Lorentz transformation. To demonstrate this invariance of the Maxwell equations we will rewrite them in 4-tensor form.

22.1 4-current

Let $\rho_o$ be the proper charge density i.e. as measured in the rest frame. It is an invariant. Define the 4-current as

$$j^\mu = \rho_o\, v^\mu = \rho_o \gamma_v (c, \mathbf{v}) = \rho\, (c, \mathbf{v}) = (c \rho, \mathbf{j}) \tag{691}$$

where $\rho = \rho_o \gamma_v$ is the charge density and $\mathbf{j} = \rho \mathbf{v}$ is the current density. $j^\mu$ is a 4-vector.

22.2 Continuity equation

The continuity equation eq.(682) can be written as:

$$\partial_\mu j^\mu = 0 \tag{692}$$

where

$$\partial_\mu = (\partial_w, \nabla) \quad\text{and}\quad \partial^\mu = \eta^{\mu\nu} \partial_\nu = (-\partial_w, \nabla) \tag{693}$$

The 4-current has zero 4-divergence.

22.3 4-potential

Now define the 4-potential

$$A^\mu = (\Phi, \mathbf{A}) \quad\text{and}\quad A_\mu = \eta_{\mu\nu} A^\nu = (-\Phi, \mathbf{A}) \tag{694}$$

Then the Lorentz gauge condition eq.(688) can be written as

$$\partial_\mu A^\mu = 0 \tag{695}$$

i.e. the 4-potential has zero 4-divergence. The wave-equations eq.(689) for the potentials become

$$\partial^\alpha \partial_\alpha A^\mu = -\frac{4\pi}{c}\, j^\mu \tag{696}$$

where the lhs contains the d’Alembertian operator (a scalar) i.e.

$$\partial^\alpha \partial_\alpha = -\partial_w^2 + \nabla^2 \tag{697}$$

From the quotient theorem it follows from eq.(696) that $A^\mu$ is a 4-vector.

22.4 Electromagnetic tensor

We now relate the electromagnetic fields, $\mathbf{E}$ and $\mathbf{B}$, to a 4-tensor. Since there are 6 independent components, this suggests using a skew-symmetric tensor of rank 2. In terms of the 4-potential it is given by:

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = -F_{\nu\mu} \tag{698}$$

This is the covariant electromagnetic tensor. Using eq.(685–686) for $\mathbf{E}$ and $\mathbf{B}$ in terms of $\mathbf{A}$ and Φ you can show that

$$F_{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix} \tag{699}$$

For example

$$F_{01} = \partial_0 A_1 - \partial_1 A_0 = \partial_w A_x + \partial_x \Phi = -E_x \tag{700}$$

and

$$F_{12} = \partial_1 A_2 - \partial_2 A_1 = \partial_x A_y - \partial_y A_x = [\nabla \times \mathbf{A}]_z = B_z \tag{701}$$

The associated contravariant tensor is also skew-symmetric and is given by

$$F^{\mu\nu} = \eta^{\mu\alpha} \eta^{\nu\beta} F_{\alpha\beta} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & B_z & -B_y \\ -E_y & -B_z & 0 & B_x \\ -E_z & B_y & -B_x & 0 \end{pmatrix} \tag{702}$$

22.5 Maxwell equations

The Maxwell equations can be written as tensor equations involving $F^{\mu\nu}$

$$\partial_\mu F_{\nu\zeta} + \partial_\nu F_{\zeta\mu} + \partial_\zeta F_{\mu\nu} = 0 \tag{703}$$

(zero cyclic divergence) and

$$\partial_\nu F^{\mu\nu} = \frac{4\pi}{c}\, j^\mu \tag{704}$$

In eq.(703) the indices (µνζ) can take 4 distinct sets of values (012), (013), (023) and (123). Due to the cyclic nature of the indices, their order in each set does not matter. Other possibilities (i.e. two indices the same) give zero identically on the lhs due to the skew-symmetry of $F_{\mu\nu}$. These correspond to the 4 source-free Maxwell equations. The combinations involving index 0, which contain a time derivative, give the vector Faraday equation, eq.(680). The last combination is the scalar equation, eq.(679). Eq.(704) corresponds to the 4 Maxwell equations which include the source. The case µ = 0 gives eq.(678) (Coulomb law) while µ = 1 . . . 3 gives eq.(681) (Ampère law).

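22.6 Dual electromagnetic tensor density

The dual electromagnetic tensor density is defined by

$$G^{\mu\nu} = \tfrac{1}{2}\, e^{\mu\nu\alpha\beta} F_{\alpha\beta} \tag{705}$$

where $e^{\mu\nu\alpha\beta}$ is the 4 dimensional permutation symbol. Some properties of the symbol are:

• $e^{\mu\nu\alpha\beta}$ is completely skew-symmetric

• $e^{0123} = 1$

• $$e^{\mu\nu\alpha\beta} = \begin{cases} 0 & \text{if any index is repeated} \\ 1 & \text{if } \mu\nu\alpha\beta \text{ is an even permutation of } 0123 \\ -1 & \text{if } \mu\nu\alpha\beta \text{ is an odd permutation of } 0123 \end{cases} \tag{706}$$

• $e^{\mu\nu\alpha\beta}$ is an isotropic, rank 4, tensor density (i.e. relative tensor of weight 1)

$G^{\mu\nu}$ is skew-symmetric. It is a tensor density because it is the inner product of a tensor and a tensor density. The components are easy to obtain, for example

$$G^{01} = \tfrac{1}{2} \left( e^{0123} F_{23} + e^{0132} F_{32} \right) = \tfrac{1}{2} \left( F_{23} - F_{32} \right) = F_{23} = B_x \tag{707}$$

and

$$G^{12} = \tfrac{1}{2} \left( e^{1203} F_{03} + e^{1230} F_{30} \right) = \tfrac{1}{2} \left( F_{03} - F_{30} \right) = F_{03} = -E_z \tag{708}$$

so that we find

$$G^{\mu\nu} = \begin{pmatrix} 0 & B_x & B_y & B_z \\ -B_x & 0 & -E_z & E_y \\ -B_y & E_z & 0 & -E_x \\ -B_z & -E_y & E_x & 0 \end{pmatrix} \quad\text{and}\quad G_{\mu\nu} = \begin{pmatrix} 0 & -B_x & -B_y & -B_z \\ B_x & 0 & -E_z & E_y \\ B_y & E_z & 0 & -E_x \\ B_z & -E_y & E_x & 0 \end{pmatrix} \tag{709}$$

Contrast

$$F^{\mu\nu} = \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & B_z & -B_y \\ -E_y & -B_z & 0 & B_x \\ -E_z & B_y & -B_x & 0 \end{pmatrix} \quad\text{and}\quad F_{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix} \tag{710}$$

• The dual $G^{\mu\nu}$ is obtained from $F^{\mu\nu}$ by the substitutions $\mathbf{E} \longrightarrow \mathbf{B}$ and $\mathbf{B} \longrightarrow -\mathbf{E}$.

• The source-free Maxwell equations can be written as

$$\partial_\nu G^{\mu\nu} = 0 \tag{711}$$

which is equivalent to eq.(703).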

22.7 Invariants

From $F_{\mu\nu}$ and $G_{\mu\nu}$ we can form the following invariants.

• $$F^{\mu\nu} F_{\mu\nu} = -G^{\mu\nu} G_{\mu\nu} = 2\, (B^2 - E^2) \tag{712}$$

It follows that $(B^2 - E^2)$ is a scalar.

• $$F^{\mu\nu} G_{\mu\nu} = -4\, \mathbf{E} \cdot \mathbf{B} \tag{713}$$

It follows that $\mathbf{E} \cdot \mathbf{B}$ is a scalar density. For proper transformations $\mathbf{E} \cdot \mathbf{B}$ is unchanged. But if $\det(\Lambda^\mu{}_\nu) = -1$ then it changes sign. If $\mathbf{E}$ and $\mathbf{B}$ are orthogonal (i.e. $\mathbf{E} \cdot \mathbf{B} = 0$) in one inertial frame then they are orthogonal in all frames.
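The dual tensor of eq.(705) and the invariants eq.(712) and eq.(713) can be checked numerically. The sketch below assumes the signature (−, +, +, +) used in these notes and arbitrary illustrative field values; the function names (`f_lower`, `flip_indices`, `dual_upper`, `contract`) are hypothetical helpers introduced only for this check.

```python
from itertools import permutations

ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta, signature (-, +, +, +)

def f_lower(E, B):
    """Covariant F_{mu nu} as in eq.(699)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, Bz, -By],
            [Ey, -Bz, 0.0, Bx],
            [Ez, By, -Bx, 0.0]]

def flip_indices(T):
    """Raise or lower both indices of a rank-2 tensor with the diagonal metric."""
    return [[ETA[m] * ETA[n] * T[m][n] for n in range(4)] for m in range(4)]

def permutation_sign(perm):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                     if perm[i] > perm[j])
    return -1.0 if inversions % 2 else 1.0

def dual_upper(F_low):
    """G^{mu nu} = (1/2) e^{mu nu alpha beta} F_{alpha beta}, eq.(705)."""
    G = [[0.0] * 4 for _ in range(4)]
    for m, n, a, b in permutations(range(4)):   # repeated indices contribute 0
        G[m][n] += 0.5 * permutation_sign((m, n, a, b)) * F_low[a][b]
    return G

def contract(A_up, B_low):
    """Full contraction A^{mu nu} B_{mu nu}."""
    return sum(A_up[m][n] * B_low[m][n] for m in range(4) for n in range(4))

E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)   # illustrative values only
F_low = f_lower(E, B)
F_up = flip_indices(F_low)
G_up = dual_upper(F_low)
G_low = flip_indices(G_up)

E2 = sum(x * x for x in E)
B2 = sum(x * x for x in B)
EdotB = sum(x * y for x, y in zip(E, B))

print("F.F =", contract(F_up, F_low), " expected", 2 * (B2 - E2))
print("G.G =", contract(G_up, G_low), " expected", -2 * (B2 - E2))
print("F.G =", contract(F_up, G_low), " expected", -4 * EdotB)
```

The same script also reproduces the sample components of eq.(707) and eq.(708): `G_up[0][1]` equals $B_x$ and `G_up[1][2]` equals $-E_z$.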

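22.8 Transformation of fields

We can derive the transformation properties of $\mathbf{E}$ and $\mathbf{B}$ from the way that $F^{\mu\nu}$ transforms i.e.

$$\bar{F}^{\mu\nu} = \Lambda^\mu{}_\alpha\, \Lambda^\nu{}_\beta\, F^{\alpha\beta} \tag{714}$$

where $\Lambda^\mu{}_\alpha$ are coefficients corresponding to the Lorentz transformation. Here and below a bar denotes a quantity measured in the boosted frame $\bar{S}$. This equation can be written in matrix form as

$$\left( \bar{F}^{\mu\nu} \right) = \left( \Lambda^\mu{}_\alpha \right) \left( F^{\alpha\beta} \right) \left( \Lambda^\nu{}_\beta \right)^T \tag{715}$$

and, using the form of $\Lambda^\mu{}_\alpha$ for a boost of speed u in the x–direction, gives

$$\begin{pmatrix} 0 & \bar{E}_x & \bar{E}_y & \bar{E}_z \\ -\bar{E}_x & 0 & \bar{B}_z & -\bar{B}_y \\ -\bar{E}_y & -\bar{B}_z & 0 & \bar{B}_x \\ -\bar{E}_z & \bar{B}_y & -\bar{B}_x & 0 \end{pmatrix} = \begin{pmatrix} \gamma & -u\gamma/c & 0 & 0 \\ -u\gamma/c & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & B_z & -B_y \\ -E_y & -B_z & 0 & B_x \\ -E_z & B_y & -B_x & 0 \end{pmatrix} \begin{pmatrix} \gamma & -u\gamma/c & 0 & 0 \\ -u\gamma/c & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{716}$$

where $\gamma = 1/\sqrt{1 - (u/c)^2}$. After some algebra you obtain the following results

$$\begin{aligned}
\bar{E}_x &= E_x & \bar{E}_y &= \gamma (E_y - u B_z/c) & \bar{E}_z &= \gamma (E_z + u B_y/c) \\
\bar{B}_x &= B_x & \bar{B}_y &= \gamma (B_y + u E_z/c) & \bar{B}_z &= \gamma (B_z - u E_y/c)
\end{aligned} \tag{717}$$

The inverse transformation is obtained by interchanging barred and unbarred quantities, and letting u become −u.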

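We can now obtain the transformation corresponding to a boost $\mathbf{u}$. Consider the electric field

$$\begin{aligned}
\bar{\mathbf{E}} &= \bar{E}_x \mathbf{i} + \bar{E}_y \mathbf{j} + \bar{E}_z \mathbf{k} \\
&= E_x \mathbf{i} + \gamma (E_y \mathbf{j} + E_z \mathbf{k}) - (\gamma u/c)\, (B_z \mathbf{j} - B_y \mathbf{k}) \\
&= \gamma \mathbf{E} - (\gamma - 1) E_x \mathbf{i} - (\gamma u/c)\, (B_z \mathbf{j} - B_y \mathbf{k}) \\
&= \gamma \mathbf{E} - (\gamma - 1) E_x \mathbf{i} + (\gamma u/c)\, \mathbf{i} \times (B_z \mathbf{k} + B_y \mathbf{j}) \\
&= \gamma \mathbf{E} - (\gamma - 1) E_x \mathbf{i} + (\gamma u/c)\, \mathbf{i} \times \mathbf{B}
\end{aligned} \tag{718}$$

Now let $u \mathbf{i} = \mathbf{u}$ to obtain

$$\bar{\mathbf{E}} = \gamma \mathbf{E} - \frac{\gamma - 1}{u^2}\, (\mathbf{u} \cdot \mathbf{E})\, \mathbf{u} + (\gamma/c)\, (\mathbf{u} \times \mathbf{B}) \tag{719}$$

Similarly for $\mathbf{B}$

$$\bar{\mathbf{B}} = \gamma \mathbf{B} - \frac{\gamma - 1}{u^2}\, (\mathbf{u} \cdot \mathbf{B})\, \mathbf{u} - (\gamma/c)\, (\mathbf{u} \times \mathbf{E}) \tag{720}$$

If these equations hold for $\mathbf{u} = u \mathbf{i}$ then, since they are vector equations, they will hold for any $\mathbf{u}$. In general $\mathbf{u} = (u_x, u_y, u_z)$, $u = \sqrt{u_x^2 + u_y^2 + u_z^2}$ and $\gamma = 1/\sqrt{1 - (u/c)^2}$.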
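As a cross-check, the matrix rule eq.(715) and the vector formulae eq.(719–720) can be compared numerically for a boost along the x-axis. This is a sketch only: it assumes units with c = 1 and arbitrary illustrative field values, and the function names are hypothetical.

```python
import math

def f_upper(E, B):
    """Contravariant F^{mu nu} as in eq.(702)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, Ex, Ey, Ez],
            [-Ex, 0.0, Bz, -By],
            [-Ey, -Bz, 0.0, Bx],
            [-Ez, By, -Bx, 0.0]]

def boost_fields_matrix(E, B, beta):
    """Barred fields from eq.(715) for a boost of speed beta = u/c along x."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    L = [[g, -g * beta, 0.0, 0.0],
         [-g * beta, g, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    F = f_upper(E, B)
    Fb = [[sum(L[m][a] * L[n][b] * F[a][b] for a in range(4) for b in range(4))
           for n in range(4)] for m in range(4)]
    # Read the fields back from the positions they occupy in eq.(702)
    return (Fb[0][1], Fb[0][2], Fb[0][3]), (Fb[2][3], Fb[3][1], Fb[1][2])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def boost_fields_vector(E, B, u, c=1.0):
    """Barred fields from the vector formulae eq.(719) and eq.(720)."""
    u2 = dot(u, u)
    g = 1.0 / math.sqrt(1.0 - u2 / c**2)
    k = (g - 1.0) / u2
    uxB, uxE = cross(u, B), cross(u, E)
    uE, uB = dot(u, E), dot(u, B)
    E_bar = tuple(g * E[i] - k * uE * u[i] + (g / c) * uxB[i] for i in range(3))
    B_bar = tuple(g * B[i] - k * uB * u[i] - (g / c) * uxE[i] for i in range(3))
    return E_bar, B_bar

E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)   # illustrative values only
beta = 0.6

Em, Bm = boost_fields_matrix(E, B, beta)
Ev, Bv = boost_fields_vector(E, B, (beta, 0.0, 0.0))

print("matrix rule :", Em, Bm)
print("vector rule :", Ev, Bv)
print("B^2 - E^2   :", dot(B, B) - dot(E, E), "->", dot(Bm, Bm) - dot(Em, Em))
print("E . B       :", dot(E, B), "->", dot(Em, Bm))
```

Because `boost_fields_vector` accepts any velocity with speed below c, it can also be used to explore boosts that are not along a coordinate axis, and the two quantities printed last should remain unchanged.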

• $\mathbf{E}$ and $\mathbf{B}$ are mixed by the Lorentz transformation.

• A pure $\mathbf{E}$ field will have both $\mathbf{E}$ and $\mathbf{B}$ components in another inertial frame of reference. In general pure fields are atypical.

• The general electromagnetic field cannot be transformed into a pure field by a Lorentz transformation. If this could be done it would imply that the invariant $\mathbf{E} \cdot \mathbf{B} = 0$, which is not the case in general.

• A charged particle is at rest in the frame $\bar{S}$. The stationary charge generates an electric field $\bar{\mathbf{E}}$. But the magnetic field $\bar{\mathbf{B}}$ is zero. Therefore $\bar{\mathbf{E}} \cdot \bar{\mathbf{B}} = 0$. If $\bar{S}$ moves relative to frame S with velocity $\mathbf{v}$, then an observer in S will see the charged particle move with velocity $\mathbf{v}$. From the inverse of eq.(719–720) the corresponding fields in S are

$$\mathbf{E} = \gamma \bar{\mathbf{E}} - \frac{\gamma - 1}{v^2}\, (\mathbf{v} \cdot \bar{\mathbf{E}})\, \mathbf{v} \quad\text{and}\quad \mathbf{B} = (\gamma/c)\, (\mathbf{v} \times \bar{\mathbf{E}}) \tag{721}$$

where $\gamma = 1/\sqrt{1 - (v/c)^2}$. From these formulae we can confirm easily that $\mathbf{E} \cdot \mathbf{B} = 0$.

22.9 Fields produced by a moving charge

Consider a frame of reference S, with space coordinates (x, y, z) and time t. A particle P, with charge q, is moving with uniform speed v. We will take the position of the particle to be (vt, b, 0) so that the particle is moving in the x–direction and its distance from the origin O is $r = \sqrt{v^2 t^2 + b^2}$. This distance is a minimum when t = 0 and r = b. The minimum distance b is known as the impact parameter. See figure 16.

Now consider the rest frame of the particle, S′, with space coordinates (x′, y′, z′) and time t′. S′ moves relative to S with a speed v in the x–direction. We assume that the coordinate axes are parallel and that when t = t′ = 0 the origins coincide.

Figure 16: Moving charge in frame S (axes x and y with origin O; the sketch is labelled with r, ψ, v, b and vt)

The coordinates in the two frames are connected by the Lorentz transformation

$$\begin{aligned}
x' &= \gamma\, (x - vt) \\
y' &= y \\
z' &= z \\
t' &= \gamma\, (t - vx/c^2)
\end{aligned} \tag{722}$$

where $\gamma = 1/\sqrt{1 - (v/c)^2}$.

In S′ the position of the particle is (0, b, 0).

The electromagnetic fields due to the particle in its rest frame are simple. In particular they are independent of time. First of all, the electric field is given by the Coulomb law, and secondly, the magnetic field is zero. Therefore

$$\mathbf{E}' = \frac{q}{r'^3}\, \mathbf{r}' \quad\text{and}\quad \mathbf{B}' = 0 \tag{723}$$

where $\mathbf{r}'$ is the position of a point measured from the particle i.e.

$$\mathbf{r}' = (x', y' - b, z') \quad\text{and}\quad r'^2 = x'^2 + (y' - b)^2 + z'^2 \tag{724}$$

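Now use the inverse transformation equations for the electromagnetic fields

$$\begin{aligned}
E_x &= E'_x & E_y &= \gamma (E'_y + v B'_z/c) & E_z &= \gamma (E'_z - v B'_y/c) \\
B_x &= B'_x & B_y &= \gamma (B'_y - v E'_z/c) & B_z &= \gamma (B'_z + v E'_y/c)
\end{aligned} \tag{725}$$

Since $\mathbf{B}' = 0$ we have

$$\begin{aligned}
E_x &= E'_x & E_y &= \gamma E'_y & E_z &= \gamma E'_z \\
B_x &= 0 & B_y &= -v \gamma E'_z/c & B_z &= v \gamma E'_y/c
\end{aligned} \tag{726}$$

We can now substitute for the fields as observed in S′ and also use the Lorentz transformation for the coordinates

$$\begin{aligned}
\mathbf{E}(t, x, y, z) &= \gamma\, \frac{q}{r'^3}\, (x - vt,\; y - b,\; z) \\
\mathbf{B}(t, x, y, z) &= (v/c)\, \gamma\, \frac{q}{r'^3}\, (0,\; -z,\; y - b)
\end{aligned} \tag{727}$$

with

$$r'^2 = \gamma^2 (x - vt)^2 + (y - b)^2 + z^2 \tag{728}$$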

• In the particle’s rest frame

$$\mathbf{E}' = \frac{q}{r'^3}\, \mathbf{r}' \tag{729}$$

and the direction of the electric field is radial from the particle. Thus the field lines will be radial straight lines from the particle. The distribution of the field lines gives a measure of the field strength. When the lines are close together the field is strong. In the particle’s rest frame the distribution of the field lines is isotropic i.e. the same in all directions. This arises from the dependence of the electric field on r′, the distance from the particle to the point of observation. See figure 17.

• In frame S we have

$$\mathbf{E}(t, x, y, z) = \gamma\, \frac{q}{r'^3}\, \mathbf{r} \tag{730}$$

where

$$\mathbf{r} = (x - vt,\; y - b,\; z) \tag{731}$$

is the position vector measured from the particle. Therefore the field lines for the moving particle are again straight lines radial from the particle. However the distribution of the field lines is not isotropic.

Figure 17: Field lines for a charge at rest

To see this we need to express r′ in terms of r. Let $(x - vt) = r \cos\psi$, so that ψ is the angle between the direction of motion and the position vector measured from the particle, then

$$\sqrt{(y - b)^2 + z^2} = r \sin\psi \tag{732}$$

Therefore

$$\begin{aligned}
r'^2 &= \gamma^2 (x - vt)^2 + (y - b)^2 + z^2 \\
&= r^2\, [\gamma^2 \cos^2\psi + \sin^2\psi] \\
&= r^2\, [\gamma^2 (1 - \sin^2\psi) + \sin^2\psi] \\
&= r^2\, [\gamma^2 - (\gamma^2 - 1) \sin^2\psi] \\
&= r^2 \gamma^2\, [1 - (v/c)^2 \sin^2\psi]
\end{aligned} \tag{733}$$

giving

$$\mathbf{E} = \frac{1}{\gamma^2}\, \frac{q}{r^3}\, \frac{\mathbf{r}}{[1 - (v/c)^2 \sin^2\psi]^{3/2}} \tag{734}$$

When the position vector is nearly parallel to the direction of motion we have ψ near 0 or π. In this case there is an outside factor of $1/\gamma^2$ which reduces the strength at large speeds. In contrast when the position vector is at right angles to the direction of motion we have ψ = π/2 and the outside factor becomes γ which increases the strength at large speeds. This anisotropy in the field strength is reflected in field lines that have a whiskbroom appearance. The effect is suggestive of a Lorentz contraction in the direction of motion. See figure 18.

Figure 18: Field lines for a moving charge

• Since $\mathbf{B}' = 0$ we have $\mathbf{E}' \cdot \mathbf{B}' = 0$. The results in frame S also give $\mathbf{E} \cdot \mathbf{B} = 0$, as they must, since this dot product is a scalar density.

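We will evaluate the fields at the point O.

$$\begin{aligned}
\mathbf{E}(t) &= -\gamma\, \frac{q}{r'^3}\, (vt,\; b,\; 0) \\
\mathbf{B}(t) &= -(v/c)\, \gamma\, \frac{q}{r'^3}\, (0,\; 0,\; b)
\end{aligned} \tag{735}$$

with

$$r'^2 = \gamma^2 v^2 t^2 + b^2 = \gamma^2 v^2 (t^2 + T^2) \tag{736}$$

where $T = b/(\gamma v)$. Therefore

$$\begin{aligned}
\mathbf{E}(t) &= -\left[ \frac{q}{\gamma^2 v^2}\, \frac{1}{(t^2 + T^2)^{3/2}} \right] (t,\; \gamma T,\; 0) \\
\mathbf{B}(t) &= -(v/c) \left[ \frac{q}{\gamma^2 v^2}\, \frac{1}{(t^2 + T^2)^{3/2}} \right] (0,\; 0,\; \gamma T)
\end{aligned} \tag{737}$$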

It is clear from these formulae that as v/c approaches 1, the electric field is dominated by the y–component and also that $B_z$ becomes equal to $E_y$. The electric and magnetic fields are orthogonal and equal in magnitude. This is characteristic of a photon i.e. a fast moving charged particle can be modelled by a photon. Define

$$f(t, v, T) = \frac{1}{\gamma^2 v^2}\, \frac{t}{(t^2 + T^2)^{3/2}} \quad\text{and}\quad g(t, v, T) = \frac{1}{\gamma v^2}\, \frac{T}{(t^2 + T^2)^{3/2}} \tag{738}$$

then

$$E_x = -q f \qquad E_y = -q g \qquad B_z = -(v/c)\, q g \tag{739}$$

The functions f and g are plotted in figure 19 and figure 20 for the v/c–values 0.1, 0.5, 0.8 and 0.9. b is set equal to 1 so that the corresponding values of cT are 9.9, 1.7, 0.75, 0.48. T is a time interval over which the field strengths are appreciable. The function f is an odd function of t and its time integral is zero. If the observing detector has appreciable inertia then it will not respond to the x–component of the electric field. The effect as v/c increases is to shorten the time interval in which the field acts but there is no change in the maximum field strength in the x–direction. In contrast the function g has a maximum that rises rapidly with increasing v/c. Again the time interval over which the field acts shortens.

Figure 19: Plot of f against w = ct for b = 1 and a range of v/c–values (curves labelled .1, .5, .8 and .9; w runs from −10 to 10 and f from −0.5 to 0.5)

Figure 20: Plot of g against w = ct for b = 1 and a range of v/c–values (curves labelled .1, .5, .8 and .9; w runs from −10 to 10 and g from 0 to 3)
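The behaviour of f and g described above can be reproduced with a few lines of code. The sketch below assumes units with c = 1 (so that t = w and v = v/c) and b = 1 by default; the function names `f_pulse` and `g_pulse` are hypothetical. It evaluates cT, the peak of g at t = 0 and the peak of f at t = T/√2, where the peak position follows from setting the derivative of eq.(738) to zero.

```python
import math

def lorentz_gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)

def pulse_width(beta, b=1.0):
    """T = b / (gamma v) in units with c = 1."""
    return b / (lorentz_gamma(beta) * beta)

def f_pulse(t, beta, b=1.0):
    """f of eq.(738); E_x = -q f."""
    g = lorentz_gamma(beta)
    T = pulse_width(beta, b)
    return t / (g * g * beta * beta * (t * t + T * T) ** 1.5)

def g_pulse(t, beta, b=1.0):
    """g of eq.(738); E_y = -q g."""
    g = lorentz_gamma(beta)
    T = pulse_width(beta, b)
    return T / (g * beta * beta * (t * t + T * T) ** 1.5)

rows = []
for beta in (0.1, 0.5, 0.8, 0.9):
    T = pulse_width(beta)
    f_peak = f_pulse(T / math.sqrt(2.0), beta)   # maximum of f
    g_peak = g_pulse(0.0, beta)                  # maximum of g
    rows.append((beta, T, f_peak, g_peak))
    print(f"v/c = {beta:.1f}  cT = {T:.3f}  max f = {f_peak:.4f}  max g = {g_peak:.4f}")
```

With b = 1 the peak of f comes out the same for every speed, while the peak of g equals γ, which is consistent with the statement that only the y–component grows as v/c increases.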


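22.10 Lorentz force

The 4-force $F^\mu$ which acts on a charged particle (rest mass m, charge q, velocity $\mathbf{v}$) moving in an electromagnetic field ($F_{\mu\nu}$) is given by

$$F^\mu = (q/c)\, F^{\mu\nu} v_\nu \tag{740}$$

Using $v_\mu = \gamma\, (-c, \mathbf{v})$ with $\gamma = 1/\sqrt{1 - (v/c)^2}$, we can write this equation in matrix form as

$$\begin{pmatrix} F^0 \\ F^1 \\ F^2 \\ F^3 \end{pmatrix} = (q\gamma/c) \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & B_z & -B_y \\ -E_y & -B_z & 0 & B_x \\ -E_z & B_y & -B_x & 0 \end{pmatrix} \begin{pmatrix} -c \\ v_x \\ v_y \\ v_z \end{pmatrix} = (q\gamma/c) \begin{pmatrix} \mathbf{v} \cdot \mathbf{E} \\ c E_x + [\mathbf{v} \times \mathbf{B}]_x \\ c E_y + [\mathbf{v} \times \mathbf{B}]_y \\ c E_z + [\mathbf{v} \times \mathbf{B}]_z \end{pmatrix} \tag{741}$$

The 4-force is related to the total energy $\mathcal{E}$ and 3-momentum $\mathbf{p}$ by

$$F^\mu = \gamma \left( \frac{1}{c} \frac{d\mathcal{E}}{dt},\; \frac{d\mathbf{p}}{dt} \right) \tag{742}$$

where $\mathcal{E} = \gamma m c^2$ and $\mathbf{p} = \gamma m \mathbf{v}$. Therefore

$$\frac{d\mathcal{E}}{dt} = q\, \mathbf{v} \cdot \mathbf{E} \quad\text{and}\quad \frac{d\mathbf{p}}{dt} = q \left( \mathbf{E} + \frac{\mathbf{v}}{c} \times \mathbf{B} \right) \tag{743}$$

22.11 Electromagnetic energy tensor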

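The electromagnetic energy tensor is defined to be

$$\Theta^{\mu\nu} = \frac{1}{4\pi} \left[ \eta_{\alpha\beta}\, F^{\mu\alpha} F^{\beta\nu} + \tfrac{1}{4}\, \eta^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} \right] \tag{744}$$

It has the following properties:

• It is symmetric, $\Theta^{\mu\nu} = \Theta^{\nu\mu}$.

• It can be expressed in partitioned form as

$$(\Theta^{\mu\nu}) = \begin{pmatrix} -U & -\mathbf{N}^T/c \\ -\mathbf{N}/c & P_{ij} \end{pmatrix} \tag{745}$$

where

– Energy density:

$$U = \frac{1}{8\pi}\, (E^2 + B^2) \tag{746}$$

– Poynting vector:

$$\mathbf{N} = \frac{c}{4\pi}\, (\mathbf{E} \times \mathbf{B}) \tag{747}$$

– Maxwell tensor:

$$P_{ij} = \frac{1}{4\pi} \left[ E_i E_j + B_i B_j - \tfrac{1}{2}\, \delta_{ij} (E^2 + B^2) \right] \quad\text{where}\quad i, j = 1, 2, 3 \tag{748}$$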

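• It is traceless, $\Theta^\mu{}_\mu = 0$.

• The divergence of $\Theta^{\mu\nu}$ is

$$\partial_\nu \Theta^{\mu\nu} = \frac{1}{c}\, F^{\mu\nu} j_\nu \tag{749}$$

In matrix form

$$\begin{pmatrix} \partial_\nu \Theta^{0\nu} \\ \partial_\nu \Theta^{1\nu} \\ \partial_\nu \Theta^{2\nu} \\ \partial_\nu \Theta^{3\nu} \end{pmatrix} = (1/c) \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & B_z & -B_y \\ -E_y & -B_z & 0 & B_x \\ -E_z & B_y & -B_x & 0 \end{pmatrix} \begin{pmatrix} -c \rho \\ j_x \\ j_y \\ j_z \end{pmatrix} = \begin{pmatrix} (1/c)\, \mathbf{j} \cdot \mathbf{E} \\ \rho E_x + (1/c)\, [\mathbf{j} \times \mathbf{B}]_x \\ \rho E_y + (1/c)\, [\mathbf{j} \times \mathbf{B}]_y \\ \rho E_z + (1/c)\, [\mathbf{j} \times \mathbf{B}]_z \end{pmatrix} \tag{750}$$

• Taking µ = 0 gives the Poynting theorem:

$$-\partial_t U - \nabla \cdot \mathbf{N} = \mathbf{j} \cdot \mathbf{E} \tag{751}$$

• Taking µ = 1 gives:

$$-\frac{1}{c^2}\, \partial_t N_x + \partial_x P_{xx} + \partial_y P_{xy} + \partial_z P_{xz} = [\rho\, \mathbf{E} + (1/c)\, \mathbf{j} \times \mathbf{B}]_x \tag{752}$$

with similar equations for the y– and z–components.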
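The algebraic properties listed above (symmetry, zero trace and the partitioned form eq.(745)) can be spot-checked numerically from the definition eq.(744). This is a numerical sanity check rather than a proof; it assumes units with c = 1, the signature (−, +, +, +) and arbitrary illustrative field values, and the helper names are hypothetical.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta, signature (-, +, +, +)

def f_upper(E, B):
    """Contravariant F^{mu nu} as in eq.(702)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, Ex, Ey, Ez],
            [-Ex, 0.0, Bz, -By],
            [-Ey, -Bz, 0.0, Bx],
            [-Ez, By, -Bx, 0.0]]

def energy_tensor(E, B):
    """Theta^{mu nu} from eq.(744)."""
    F = f_upper(E, B)
    # F_{ab} F^{ab}: lowering both indices multiplies by ETA[a] * ETA[b]
    invariant = sum(ETA[a] * ETA[b] * F[a][b] * F[a][b]
                    for a in range(4) for b in range(4))
    theta = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            first = sum(ETA[a] * F[m][a] * F[a][n] for a in range(4))
            second = 0.25 * invariant * (ETA[m] if m == n else 0.0)
            theta[m][n] = (first + second) / (4.0 * math.pi)
    return theta

E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)   # illustrative values only
theta = energy_tensor(E, B)

E2 = sum(x * x for x in E)
B2 = sum(x * x for x in B)
U = (E2 + B2) / (8.0 * math.pi)                        # eq.(746)
ExB = (E[1] * B[2] - E[2] * B[1],
       E[2] * B[0] - E[0] * B[2],
       E[0] * B[1] - E[1] * B[0])
N = tuple(x / (4.0 * math.pi) for x in ExB)            # eq.(747) with c = 1

trace = sum(ETA[m] * theta[m][m] for m in range(4))    # Theta^mu_mu
asymmetry = max(abs(theta[m][n] - theta[n][m])
                for m in range(4) for n in range(4))

print("Theta^00 =", theta[0][0], " -U =", -U)
print("Theta^0i =", theta[0][1:], " -N/c =", tuple(-x for x in N))
print("trace =", trace, " largest asymmetry =", asymmetry)
```

The spatial block `theta[i][j]` for i, j = 1, 2, 3 can be compared with the Maxwell tensor of eq.(748) in the same way.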

The Poynting theorem expresses energy conservation. To see this consider the Lorentz force on a charge

$$F^\mu = (q/c)\, F^{\mu\nu} v_\nu \tag{753}$$

This leads to

$$\partial_t \mathcal{E} = q\, \mathbf{v} \cdot \mathbf{E} \tag{754}$$

where $\mathcal{E}$ is the particle energy.

Now consider a charge density giving the force density

$$D^\mu = \frac{1}{c}\, F^{\mu\nu} j_\nu \tag{755}$$

leading to

$$\partial_t \mathcal{E} = \mathbf{j} \cdot \mathbf{E} \tag{756}$$

where $\mathcal{E}$ is now the energy density. Using this we can express the Poynting theorem as

$$\partial_t (U + \mathcal{E}) + \nabla \cdot \mathbf{N} = 0 \tag{757}$$

We recognise this as an expression for energy conservation, where U is an energy density associated with the electromagnetic field, $\mathcal{E}$ is associated with the particles and $\mathbf{N}$ is a current density associated with the energy of the electromagnetic field.

Consider a closed system consisting of a fluid of charged particles. There will be an energy-momentum tensor $T^{\mu\nu}$ associated with the mass flow. In addition there will be the electromagnetic field which exerts an external force on the particles. This is related to the energy-momentum tensor as follows

$$D^\mu = \partial_\nu T^{\mu\nu} \tag{758}$$

Since in this case the external force density is

$$D^\mu = \partial_\nu \Theta^{\mu\nu} \tag{759}$$

we can combine the equations as

$$\partial_\nu (T^{\mu\nu} - \Theta^{\mu\nu}) = 0 \tag{760}$$

The electromagnetic field behaves as though it has a mass density

$$\rho = U/c^2 \tag{761}$$

and a mass current density

$$\mathbf{P} = \mathbf{N}/c^2 \tag{762}$$

22.12 Exercises

• Prove that $\Theta^{\mu\nu}$ is symmetric.

• Prove that $\Theta^{\mu\nu}$ is traceless.

• Show that $\Theta^{00} = -U$.

• Show that $\Theta^{0i} = -N_i/c$ where i = 1, 2, 3.

• Prove that the divergence of $\Theta^{\mu\nu}$ is

$$\partial_\nu \Theta^{\mu\nu} = \frac{1}{c}\, F^{\mu\nu} j_\nu$$

[For this you will need to use the covariant Maxwell equations.]

23 Wave motion

23.1 Wave-equation and solution

Consider the following wave-equation for light

$$\partial^\nu \partial_\nu A^\mu = 0 \tag{763}$$

where $A^\mu$ is the 4-potential and the 4-current $j^\mu = 0$. The solution is given by

$$A^\mu = \Pi^\mu \exp(i\, k_\nu x^\nu) \tag{764}$$

with $k^\mu k_\mu = 0$ from the wave-equation. Here $\Pi^\mu$ is a fixed 4-vector known as the 4-polarisation. Let

$$k^\mu = \left( \frac{\omega}{c},\; \mathbf{k} \right) \tag{765}$$

This is known as the 4-propagation. It is a 4-vector from the quotient rule since $k^\mu x_\mu$ is a scalar. Then

$$k_\mu x^\mu = -\omega t + \mathbf{k} \cdot \mathbf{r} \tag{766}$$

Since $k^\nu k_\nu = 0$ it follows that

$$k = |\mathbf{k}| = \frac{\omega}{c} = \frac{2\pi}{\lambda} = \frac{2\pi\nu}{c} \tag{767}$$

where λ, the wavelength, and ν, the frequency, satisfy c = λν. These relations must hold in all inertial frames. In the Lorentz gauge we have

$$0 = \partial_\mu A^\mu = i\, k_\mu \Pi^\mu \exp(i\, k_\nu x^\nu) \tag{768}$$

so that $k_\mu \Pi^\mu = 0$.

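23.2 Transformation of 4-propagation

Suppose that we consider a boost along the x-axis then

$$\bar{k}^\mu = \Lambda^\mu{}_\nu\, k^\nu \tag{769}$$

where

$$(\Lambda^\mu{}_\nu) = \begin{pmatrix} \gamma & -\frac{u}{c}\gamma & 0 & 0 \\ -\frac{u}{c}\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{770}$$

It follows that

$$\begin{aligned}
\bar{k}_x &= \gamma \left( k_x - \frac{u\omega}{c^2} \right) \\
\bar{k}_y &= k_y \\
\bar{k}_z &= k_z \\
\bar{\omega} &= \gamma\, (\omega - u\, k_x)
\end{aligned} \tag{771}$$

Note the similarity between 4-propagation

$$k^\mu = \left( \frac{\omega}{c},\; \mathbf{k} \right) \tag{772}$$

and 4-momentum

$$p^\mu = \left( \frac{E}{c},\; \mathbf{p} \right) \tag{773}$$

We can use the transformation formulae for 4-momentum with the substitutions E → ω and $\mathbf{p} \rightarrow \mathbf{k}$.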

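We now consider the case $\mathbf{k} = -k\, (\cos\alpha, \sin\alpha, 0)$ which is an incident wave making an angle α with the x-axis. This gives

$$\begin{aligned}
\bar{k} \cos\bar{\alpha} &= \gamma \left( k \cos\alpha + \frac{u\omega}{c^2} \right) \\
\bar{k} \sin\bar{\alpha} &= k \sin\alpha \\
\bar{\omega} &= \gamma\, (\omega + u k \cos\alpha)
\end{aligned} \tag{774}$$

From these we obtain the Doppler formula

$$\bar{\omega} = \omega\, \gamma \left( 1 + \frac{u}{c} \cos\alpha \right) \tag{775}$$

and the aberration formulae

$$\begin{aligned}
\bar{\omega} \cos\bar{\alpha} &= \gamma\, \omega \left( \cos\alpha + \frac{u}{c} \right) \\
\bar{\omega} \sin\bar{\alpha} &= \omega \sin\alpha
\end{aligned} \tag{776}$$

which give

$$\cos\bar{\alpha} = \frac{\cos\alpha + \frac{u}{c}}{1 + \frac{u}{c} \cos\alpha} \quad\text{and}\quad \sin\bar{\alpha} = \frac{\sin\alpha}{\gamma \left( 1 + \frac{u}{c} \cos\alpha \right)} \tag{777}$$

and

$$\begin{aligned}
\tan\bar{\alpha} &= \frac{\sin\alpha}{\gamma \left( \cos\alpha + \frac{u}{c} \right)} \\
\tan\tfrac{1}{2}\bar{\alpha} &= \frac{\sin\bar{\alpha}}{1 + \cos\bar{\alpha}} = \sqrt{\frac{c - u}{c + u}}\; \tan\tfrac{1}{2}\alpha
\end{aligned} \tag{778}$$

The aberration formulae describe the visual appearance of moving objects. Clearly their appearance is distorted by the motion and this is expressed through the change from α to $\bar{\alpha}$.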
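The aberration formulae eq.(777) and the half-angle form eq.(778) are easy to check against each other numerically. The following sketch assumes speeds expressed as β = u/c and angles in radians; the function name `aberration` and the sample values are illustrative choices, not part of the notes.

```python
import math

def aberration(alpha, beta):
    """Return (cos, sin) of the barred angle from eq.(777), with beta = u/c."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    d = 1.0 + beta * math.cos(alpha)
    cos_bar = (math.cos(alpha) + beta) / d
    sin_bar = math.sin(alpha) / (g * d)
    return cos_bar, sin_bar

alpha, beta = 1.0, 0.5   # illustrative angle (radians) and speed
cos_bar, sin_bar = aberration(alpha, beta)
alpha_bar = math.atan2(sin_bar, cos_bar)

# Half-angle form, eq.(778): sqrt((c - u)/(c + u)) = sqrt((1 - beta)/(1 + beta))
half_angle_rhs = math.sqrt((1.0 - beta) / (1.0 + beta)) * math.tan(0.5 * alpha)

print("alpha =", alpha, " barred alpha =", alpha_bar)
print("cos^2 + sin^2 =", cos_bar**2 + sin_bar**2)
print("tan(barred alpha / 2) =", math.tan(0.5 * alpha_bar), " eq.(778):", half_angle_rhs)
```

For positive β the barred angle comes out smaller than α, which illustrates how incoming rays are crowded towards the direction of motion.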


23.3

Doppler effect

We can express the Doppler effect in terms of frequency and wavelength in the following ways u c u c

ν = νγ 1+ λ = λγ 1+

cos α

cos α

(779)

For pure radial motion ( i.e. α = 0) suppose that the light source is at rest in S. Then λ = λo is the proper wavelength of the light. An observer at rest in S sees the light source receding with speed u. Then λ = λo γ 1 + = λo > λo

q

c+u c−u

u c

(780)

The wavelength appears increased and the frequency is decreased. This is known as a red-shift. The wavelength of visible light is conveniently measured in terms of Ångstroms (1 Å = 10⁻¹⁰ m) and the spectrum is

red (7600 Å) ... blue (4000 Å)

In general define $u_r = u\cos\alpha$ to be the radial speed of the source relative to frame S. Then

\[ \lambda = \lambda_o\,\gamma\left(1 + \frac{u_r}{c}\right) \tag{781} \]

This is the pre-relativistic result except for the γ factor. This provides an example of the relativity principle. It does not matter whether you consider the observer in motion or the source. For transverse motion $u_r = 0$ (i.e. α = π/2) giving

\[ \lambda = \lambda_o\,\gamma \tag{782} \]

This is the transverse Doppler effect. It arises from time dilation i.e. the source acts as a clock which runs slower. Suppose $n = \nu_o\,\Delta\tau$ is the number of waves emitted in the rest frame of the source in the proper time interval ∆τ and $\nu_o$ is the proper frequency. In the frame of the observer we see n waves arrive in a time ∆t indicating a frequency ν given by n = ν ∆t. Since ∆t = γ ∆τ we have

\[ \frac{\nu_o}{\nu} = \frac{\Delta t}{\Delta\tau} = \gamma \tag{783} \]

or

\[ \lambda = \lambda_o\,\gamma \tag{784} \]

From the general formula

\[ \lambda = \lambda_o\,\gamma\left(1 + \frac{u_r}{c}\right) = \lambda_o\left(1 + \frac{u_r}{c} + \frac{u^2}{2c^2} + \dots\right) \tag{785} \]

the first two terms on the right are the pre-relativistic result while the third is due to time dilation.
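The size of the neglected terms in eq.(785) can be gauged with a short Python sketch that compares the exact ratio λ/λo with the three-term expansion. The function names and the sample speed u/c = 0.01 are assumptions made for illustration.

```python
import math

def wavelength_ratio(beta, alpha):
    """Exact lambda/lambda_o = gamma (1 + u_r/c), with u_r = u cos(alpha) and beta = u/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (1.0 + beta * math.cos(alpha))

def second_order(beta, alpha):
    """Expansion of eq. (785) truncated after the u^2/(2 c^2) term."""
    return 1.0 + beta * math.cos(alpha) + 0.5 * beta * beta

# Illustrative values (assumed): u/c = 0.01, alpha = 0.3 rad
exact = wavelength_ratio(0.01, 0.3)
approx = second_order(0.01, 0.3)
print(exact, approx, exact - approx)   # the difference is of third order in u/c
```

For transverse motion (α = π/2) the exact ratio reduces to γ, in line with eq.(782).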

24 Quantum theory

24.1 de Broglie waves

In considering the theory of black-body radiation Planck [28] in 1900 suggested that light is emitted in definite ‘quanta’ of energy i.e.

\[ E = h\nu = \hbar\omega \tag{786} \]

where h is Planck’s constant and $\hbar = h/2\pi$.

Albert Einstein in 1905 explained the photo-electric effect by extending Planck’s quantum idea. He said that light behaves as though it is a particle known as a photon. The photon has energy E = ħω as before and momentum

\[ p = \frac{E}{c} = \frac{\hbar\omega}{c} = \hbar k \tag{787} \]

This agrees with the formula $E^2 = p^2c^2 + m_o^2c^4$ if we set $m_o = 0$. The photon can be considered as a limiting particle travelling at speed c such that the inertial mass $m = \gamma_v m_o = E/c^2$ remains finite while $\gamma_v \to \infty$ and $m_o \to 0$. This leads us to associate the particle properties of a photon to the wave properties in the following way

\[ p^\mu = \hbar\,k^\mu \tag{788} \]

In 1923 Louis de Broglie proposed that waves could be associated with any particle using this expression. The idea arose entirely from the beauty of the mathematics when one set up the equations in relativistic form. The experiment of Davisson and Germer in 1927 observed electron diffraction which confirmed that a particle can have wave properties. Some consequences of this association are

1. The direction of motion of the wave is the same as the velocity of the particle.
2. $p = mv = \hbar k = h\nu/V$ where V is the speed of the wave. Note that only photons travel at the speed of light.
3. $E = mc^2 = \hbar\omega = h\nu$

It follows that $vV = c^2$ so that V > c when v < c. This suggests a contradiction with relativity but in fact the wave cannot carry information at the speed V. The relevant speed is the group velocity which is less than c.

[28] Max Planck (1858–1947), German physicist, originated quantum theory, making 1900 the transition between classical and modern physics
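The relation vV = c² and the size of a typical de Broglie wavelength can be illustrated with a small Python sketch. The constants are rounded textbook values and the 54 eV electron is an assumed example (an energy of the kind used in low-energy electron diffraction); the function name is hypothetical.

```python
import math

H = 6.626e-34     # Planck's constant in J s (rounded)
C = 2.998e8       # speed of light in m/s (rounded)
M_E = 9.109e-31   # electron rest mass in kg (rounded)
EV = 1.602e-19    # one electron-volt in joules (rounded)

def de_broglie(kinetic_energy, rest_mass):
    """Return (particle speed v, wave speed V, wavelength) using p = hbar k and E = hbar omega."""
    rest_energy = rest_mass * C**2
    total_energy = kinetic_energy + rest_energy
    # p c = sqrt(E^2 - (m_o c^2)^2), written as K (K + 2 m_o c^2) to avoid cancellation
    p = math.sqrt(kinetic_energy * (kinetic_energy + 2.0 * rest_energy)) / C
    v = p * C**2 / total_energy   # from p = m v with m = E / c^2
    V = total_energy / p          # wave speed omega / k = E / p
    return v, V, H / p

v, V, lam = de_broglie(54.0 * EV, M_E)
print(v, V, v * V / C**2, lam)
```

The wave speed V comes out far larger than c while the particle speed v stays below c, and the wavelength is of the order of an Ångstrom, comparable with crystal lattice spacings.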

24.2 The quantum recipe

Suppose that a free particle can be represented by a plane-wave ‘in some way’ [29]. Then

\[ \Psi(x) = \exp(i\,k_\mu x^\mu) \tag{789} \]

where Ψ(x), the wave-function, is a function of $x^\mu$. Then

\[ \partial_\mu \Psi(x) = i\,k_\mu \Psi(x) = \frac{i}{\hbar}\,p_\mu \Psi(x) \tag{790} \]

so that $-i\hbar\,\partial_\mu$ operating on the wave-function is equivalent to the 4-momentum $p_\mu$ i.e.

\[ p_\mu \to -i\hbar\,\partial_\mu \tag{791} \]

or when resolved into space and time components

\[ (-E/c,\ \mathbf{p}) \to -i\hbar\,(\partial_t/c,\ \nabla) \tag{792} \]

giving

\[ E \to i\hbar\,\partial_t \quad\text{and}\quad \mathbf{p} \to -i\hbar\,\nabla \tag{793} \]

We can apply this recipe to energy-momentum equations and thus obtain corresponding wave-equations.

24.3 Schrödinger equation

Consider the non-relativistic free-particle equation

\[ \frac{p^2}{2m_o} = E - m_o c^2 \tag{794} \]

This becomes the Schrödinger [30] wave-equation

\[ -\frac{\hbar^2}{2m_o}\nabla^2\Psi = i\hbar\,\frac{\partial\Psi}{\partial t} \tag{795} \]

where the energy is now measured less the rest energy.

[29] The interpretation of the wave is a thorny question that is still perplexing physicists.
[30] Erwin Schrödinger (1887–1961), Austrian physicist, the founder of wave mechanics

24.4 Klein-Gordon equation

To get a relativistic wave-equation for a free-particle we can consider

\[ p^\mu p_\mu + m_o^2 c^2 = 0 \tag{796} \]

which gives

\[ \left[\partial^\mu\partial_\mu - \kappa^2\right]\Psi(x) = 0 \tag{797} \]

where $\kappa = m_o c/\hbar$.

This equation was first given by Schrödinger in 1926 at the same time as the non-relativistic equation. He had originally applied this equation to the hydrogen atom but obtained the incorrect energy spectrum. He did not publish the relativistic theory because it disagreed with experiment. Later it was found to describe pions (π mesons) and kaons. It is now known as the Klein-Gordon equation and describes a spin 0 particle.
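As a consistency check, the plane wave of eq.(789) can be substituted into eq.(797). The short derivation below is a sketch that assumes the metric signature (−,+,+,+) used for the Minkowski metric elsewhere in these notes.

```latex
% Substituting \Psi = \exp(i k_\mu x^\mu) into the Klein-Gordon equation (797)
\begin{align*}
\partial^\mu \partial_\mu \Psi &= (i k^\mu)(i k_\mu)\,\Psi = -k^\mu k_\mu\,\Psi \\
\left[\partial^\mu \partial_\mu - \kappa^2\right]\Psi = 0
  &\;\Longrightarrow\; k^\mu k_\mu + \kappa^2 = 0 \\
k^\mu = \left(\frac{\omega}{c},\, \mathbf{k}\right)
  &\;\Longrightarrow\; -\frac{\omega^2}{c^2} + k^2 + \frac{m_o^2 c^2}{\hbar^2} = 0 \\
\text{multiplying by } \hbar^2 c^2:\quad
  E^2 &= p^2 c^2 + m_o^2 c^4
\end{align*}
```

So every plane-wave solution carries the relativistic energy-momentum relation, with both signs of E allowed.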

24.5 Dirac equation

The Dirac [31] equation, published in 1928, can be viewed as a factorisation of the Klein-Gordon equation. The resulting wave-equation is linear in both space and time derivatives so that space and time are treated in the same way. This gives a wave-equation for a spin ½ particle. The Dirac equation for a free particle has the form

\[ \left[\gamma^\mu\partial_\mu + \kappa\right]\Psi(x) = 0 \tag{798} \]

where $\gamma^\mu$ are 4 × 4 matrices that satisfy

\[ \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\eta^{\mu\nu} \tag{799} \]

The wave-function Ψ(x) has 4 components $\Psi^{(\alpha)}(x)$ for α = 1, 2, 3, 4 i.e. it is a column matrix that is multiplied from the left by the matrices $\gamma^\mu$ in eq.(798). However Ψ is not a 4-vector. If we multiply the Dirac equation by $\gamma^\nu\partial_\nu - \kappa$ from the left then, providing the conditions eq.(799) are satisfied, we obtain

\[ \left[\partial^\mu\partial_\mu - \kappa^2\right]\Psi(x) = 0 \tag{800} \]

so that the Dirac wave-function also satisfies the Klein-Gordon equation. The $\gamma^\mu$ anticommute e.g.

\[ \gamma^0\gamma^1 = -\gamma^1\gamma^0 \tag{801} \]

and

\[ \gamma^0\gamma^0 = -1 \quad\text{and}\quad \gamma^1\gamma^1 = \gamma^2\gamma^2 = \gamma^3\gamma^3 = 1 \tag{802} \]

[31] Paul Adrien Maurice Dirac (1902–1984), British theoretical physicist, major contributor to quantum mechanics, predicted existence of the positron and of other antiparticles.

Note that $E = \pm\sqrt{p^2c^2 + m_o^2c^4}$. In classical mechanics we neglect the negative energy case but in quantum mechanics we can have Ψ for E < 0. Quantum mechanics allows transitions between the positive and negative energy states. These solutions correspond to antiparticles e.g. the positron is the antiparticle of the electron. Based on his wave-equation Dirac suggested the existence of these particles and this was confirmed experimentally by Anderson in 1933. The standard (Dirac-Pauli) representation of the Dirac equation involves choosing

\[ \gamma^\mu = (i\beta,\ -i\beta\boldsymbol{\alpha}) \tag{803} \]

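where α and β are 4 × 4 matrices. These are the matrices originally introduced by Dirac. They take the following partitioned form

\[ \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \quad\text{and}\quad \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \tag{804} \]

where I is the 2 × 2 unit matrix and σ are the 2 × 2 Pauli matrices

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{805} \]

The Pauli matrices satisfy

\[ \sigma_i\sigma_j = -\sigma_j\sigma_i = i\,\sigma_k \tag{806} \]

where i, j and k are cyclic in 1, 2 and 3. Also their squares are the unit matrix. The properties of β and α are that they anticommute in pairs and their squares are unity. The Dirac equation in the standard representation becomes

\[ \left[c\,\boldsymbol{\alpha}\cdot\mathbf{p} + m_o c^2\beta\right]\Psi(x) = E\,\Psi(x) \tag{807} \]

where p and E are operators.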
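The anticommutation conditions of eq.(799) can be checked directly for the standard representation of eqs.(803) to (805). The following Python sketch uses plain nested lists of complex numbers; the helper names (`matmul`, `block`, `scale`, `clifford_ok`) are hypothetical and introduced only for this illustration, and η is taken as diag(−1, 1, 1, 1) in line with eq.(802).

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(s, m):
    """Multiply every entry of the matrix m by the scalar s."""
    return [[s * x for x in row] for row in m]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]                                                   # Pauli matrices, eq. (805)
beta = block(I2, Z2, Z2, scale(-1, I2))             # eq. (804)
alpha = [block(Z2, s, s, Z2) for s in sigma]        # eq. (804)

# gamma^mu = (i beta, -i beta alpha), eq. (803)
gamma = [scale(1j, beta)] + [scale(-1j, matmul(beta, a)) for a in alpha]
eta = [-1, 1, 1, 1]                                 # diagonal of eta^{mu nu}

def clifford_ok(tol=1e-12):
    """True if gamma^mu gamma^nu + gamma^nu gamma^mu = 2 eta^{mu nu} times the unit matrix."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(gamma[mu], gamma[nu])
            ba = matmul(gamma[nu], gamma[mu])
            for i in range(4):
                for j in range(4):
                    expected = 2 * eta[mu] if (mu == nu and i == j) else 0
                    if abs(ab[i][j] + ba[i][j] - expected) > tol:
                        return False
    return True

print(clifford_ok())
```

The same helpers can be used to confirm the Pauli relation of eq.(806), for example that the product of σ1 and σ2 equals i σ3.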

24.6 Charged particles

For a charged particle, with charge q, moving in an electromagnetic field described by a 4-potential, $A_\mu = (-\Phi, \mathbf{A})$, the Hamiltonian formulation of classical mechanics indicates that the 4-momentum $p_\mu$ should be replaced by $\pi_\mu$ i.e.

\[ p_\mu \to \pi_\mu \quad\text{where}\quad \pi_\mu = p_\mu - \frac{q}{c}A_\mu \tag{808} \]

or when resolved into space and time components

\[ E \to E - q\Phi \quad\text{and}\quad \mathbf{p} \to \mathbf{p} - \frac{q}{c}\mathbf{A} \tag{809} \]

Then the Klein-Gordon equation for a charged particle becomes

\[ \left[\pi^\mu\pi_\mu + m_o^2c^2\right]\Psi(x) = 0 \tag{810} \]

where $\pi_\mu$ is now an operator given by

\[ \pi_\mu = -i\hbar\,\partial_\mu - \frac{q}{c}A_\mu = -i\hbar\left(\partial_\mu - \frac{iq}{\hbar c}A_\mu\right) \tag{811} \]

For a charged particle the Dirac equation becomes

\[ \left(\gamma^\mu\pi_\mu - i\,m_o c\right)\Psi(x) = 0 \tag{812} \]

giving

\[ \left[c\,\boldsymbol{\alpha}\cdot\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right) + m_o c^2\beta\right]\Psi(x) = (E - q\Phi)\,\Psi(x) \tag{813} \]

25 Introduction to General Relativity

25.1 Absolute space

Newtonian mechanics does not distinguish between inertial frames. Any of the infinite set of inertial frames in uniform motion can be used. According to Newton a particle does not resist uniform motion. However it does resist any change in its velocity. This is expressed by Newton’s second law in which the inertial mass is a measure of the resistance to a change of motion. A natural question to ask is: acceleration relative to what? The practical answer is acceleration relative to any of the inertial frames. This is unsatisfactory. There seems no reason to select inertial frames as standards of non-acceleration.

Newton answered this by postulating absolute space which is supposed to interact with every particle so as to resist its motion. Absolute space was taken as the frame of the fixed stars. This is an inertial frame. The main objections to absolute space are: it is ad hoc and explains nothing, it cannot be located within the class of inertial frames, it acts but cannot be acted on.

While Special Relativity abolished the ether concept it still left the inertial frames of Newtonian mechanics. There is no reason why these frames should constitute a privileged class such that they serve as a standard of non-acceleration and the laws of physics take their simplest form in them. General Relativity is the modern theory of gravitation. Einstein was led to General Relativity by his desire to abolish the role of absolute space from physics. In this he was guided by Mach [32].

25.2 Mach’s Principle

Newton carried out his famous pail experiment with the idea of showing that the concept of absolute space is correct. He suspended a pail containing water by means of a twisted rope. When the rope began to unwind the water first remained stationary, then began to rotate with its surface having the shape of a paraboloid of revolution, and retained this motion for some time after the pail had stopped rotating. He concluded that the rotation of the water relative to absolute space produces a centrifugal force which gives rise to the paraboloidal surface. However Mach asserted that the motion of the water was relative to all the rest of the matter of the universe and that this matter exerted a force on the rotating water which produced the paraboloidal shape of its surface. The hypothesis that the inertia of a body arises from the presence of all the rest of the matter of the universe is known as Mach’s Principle (1883). Mach’s ideas on inertia can be summarised as:

• space is not a ‘thing’ in its own right
• a particle’s inertia results from its interaction with all the other masses in the universe
• all that matters in mechanics is the relative motion of all the masses.

It should be noted that Mach’s Principle is rooted in classical physics and ignores ‘fields’ (i.e. quantum effects).

Mach’s Principle and our knowledge of the universe suggest that the centre of each galaxy provides a local standard of non-acceleration, and the lines of sight to other galaxies provide a local standard of non-rotation. Together this defines a local inertial frame of reference. Extended inertial frames of reference (as in Special Relativity) do not exist. The same conclusion can be reached using the Equivalence Principle. Of course at each point there are an infinite number of local inertial frames in uniform relative motion.

[32] See Appendix B ‘Odds and ends’ note 18.

25.3 Equivalence Principle

Einstein discussed a gedanken (thought) experiment in which there is imagined an observer in a closed box who is subjected to a downward acceleration proportional to mass. Einstein asserted that it is impossible for the observer to decide whether the downward acceleration is due to

1. a mass attached below the box whose gravitational attraction produces the acceleration, or
2. the box being accelerated upwards by the pull of a rope, as in a lift.

This implies that gravity can be transformed away by a change of reference frame like an inertial force. The hypothesis that gravity and inertia are indistinguishable is called the Equivalence Principle and leads to the equivalence of gravitational mass and inertial mass. This has been verified experimentally by Eötvös [33] (1889 and 1922) to within an accuracy of 1 part in 10⁸ and by Dicke [34] (1961) with an accuracy 1000 times better. In Newton’s theory of gravitation the inertial and gravitational masses are in the same proportion and usually made equal by a choice of units. This is known as the Weak Equivalence Principle. It then follows that all particles experience the same acceleration in a given gravitational field i.e. the path followed is independent of the particle. This is known as Galileo’s Principle.

25.4 General Relativity

Einstein’s primary goal was to produce a gravitation theory that incorporated the Equivalence Principle and Special Relativity in a natural way. Thus he proposed that all the matter in the universe produces a geometry of space-time in which particles move along curved paths which are geodesics in the space-time continuum. This can be achieved by using a Riemannian geometry, and indeed Riemann conjectured that the particular choice of geometry depends on the distribution of matter. The motion of particles is determined by the geometry of the space-time. The theory then contains Galileo’s Principle as a primary ingredient.

Einstein extended his special principle of relativity to all frames of reference and stated that the laws of physics must apply to systems of reference in any kind of motion. Einstein’s general principle of relativity postulates that the general laws of nature are to be expressed by equations which hold good for all systems of coordinates, i.e. they are covariant with respect to any substitutions whatever. The requirement of covariance led Einstein to the use of general tensors in General Relativity.

While Mach’s Principle led to the idea that space plays no role in physics and thus does not exist, in General Relativity space in the form of space-time does play a role. Space-time acts on mass (as a guiding field) and is acted upon by mass (suffering curvature). Mach believed that inertial forces are gravitational (due to the other matter in the universe) while Einstein showed that gravitational forces are inertial (space guided).

[33] See Appendix B ‘Odds and ends’ note 19.
[34] See Appendix B ‘Odds and ends’ note 20.

25.5 Weak Equivalence Principle

If an uncharged test body is placed at an initial event in space-time and given an initial velocity, then its subsequent motion is independent of its internal structure and composition.

25.6 Einstein Equivalence Principle

1. The Weak Equivalence Principle is valid.
2. The outcome of any local nongravitational test experiment is independent of the velocity of the (free falling) apparatus.
3. The outcome of any local nongravitational test experiment is independent of where and when in the universe it is performed.

25.7 Metric theory of gravitation

It can be argued that the Einstein Equivalence Principle implies a metric theory of gravitation. The properties of such a theory are

1. Space-time is endowed with a metric $g_{\mu\nu}$.
2. The world-lines of test particles are geodesics of that metric.
3. In local freely falling frames, known as local Lorentz frames, the nongravitational laws of physics are those of Special Relativity.

A procedure for implementing the Einstein Equivalence Principle is: put the local Special Relativity laws into a frame invariant form using Lorentz invariant tensors and make the substitutions

\[ \eta_{\mu\nu} \longrightarrow g_{\mu\nu}, \qquad \partial_\mu \longrightarrow \nabla_\mu \tag{814} \]

However this can be ambiguous.

For example, covariant differentiation does not commute while partial differentiation does. Therefore the order in which differential operators occur in an equation can be ambiguous. Also we can add terms to the equations and these terms may be vanishingly small in the limit of a flat space. Ultimately, while we have powerful indicators of how the equations may look, only experiment can decide the correct equations.

25.8 Tests of General Relativity

A viable theory for gravitation must

1. be complete (capable of treating all experiments)
2. be self-consistent (predictions are unique)
3. be relativistic (when gravity turns off, results agree with special relativity)
4. have a Newtonian limit (agree with Newtonian theory in the weak-field limit)

The following (first 3 are termed ‘classical’) tests have been performed

1. anomalous perihelion shift of Mercury (moderate accuracy, disputed in 1960s)
2. deflection of light by the sun (low accuracy)
3. gravitational red shift of light (test of Einstein Equivalence Principle rather than General Relativity)
4. time-delay of light (proposed in 1960s)
5. cosmology (inconclusive)

Since 1960 there has been an enormous amount of work done theoretically and experimentally. Quoting from the text by Clifford Will [35]:

‘In 1992 we find that General Relativity has continued to hold up under extensive experimental scrutiny. The question then arises, why bother to test it further? One reason is that gravity is a fundamental interaction of nature, and as such requires the most solid empirical underpinning we can provide. Another is that all attempts to quantise gravity and to unify it with the other forces suggest that gravity stands apart from the other interactions in many ways, thus the more deeply we understand gravity and its observational implications, the better we may be able to mesh it with the other forces. Finally, and most importantly, the predictions of General Relativity are fixed; the theory contains no adjustable constants so nothing can be changed. Thus every test of the theory is potentially a deadly test. A verified discrepancy between observation and prediction would kill the theory, and another would have to be substituted in its place. Although it is remarkable that this theory, born 77 years ago out of almost pure thought, has managed to survive every test, the possibility of suddenly finding a discrepancy will continue to drive experiments for years to come.’

[35] Theory and experiment in gravitational physics, revised edition, 1993, Cambridge University Press

26 Equations of General Relativity

26.1 Newtonian gravitation

Field due to a single mass

Two masses $M_g$ and $m_g$ exert an attractive force on each other of magnitude

\[ F = G\,\frac{M_g m_g}{r^2} \tag{815} \]

where r is the separation and G = 6.67 × 10⁻¹¹ m³ kg⁻¹ s⁻² is the gravitational constant [36]. The masses are known as gravitational masses. We use the subscript g to distinguish the gravitational mass from the mass that occurs in the law of motion. Newton’s second law states that

\[ \mathbf{F} = m_i\,\mathbf{f} \tag{816} \]

where $m_i$ is now the inertial mass. The Weak Equivalence Principle states that the inertial and gravitational masses are the same,

\[ m_i = m_g \tag{817} \]

so that we can drop the subscripts. Thus if r is the position of m relative to M then the acceleration on m due to M is

\[ \mathbf{f} = -G\,\frac{M}{r^3}\,\mathbf{r} = -\nabla\varphi \tag{818} \]

where we have introduced the gravitational potential ϕ due to the mass M

\[ \varphi = -\frac{GM}{r} \tag{819} \]

[36] Of all the fundamental constants in physics this is the most difficult to measure with high precision. Which is interesting ...
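The statement f = −∇ϕ in eq.(818) can be checked numerically for a single mass. The Python sketch below compares the closed form of eq.(818) with a central-difference gradient of the potential of eq.(819). The sample mass (roughly that of the Earth), the field point and the function names are assumptions chosen for illustration.

```python
import math

G = 6.67e-11  # gravitational constant in m^3 kg^-1 s^-2, as quoted in the text

def potential(r, M):
    """phi = -G M / r, eq. (819); r is the position of the test mass relative to M."""
    return -G * M / math.sqrt(sum(x * x for x in r))

def acceleration(r, M):
    """f = -G M r / r^3, eq. (818), returned as a list of three components."""
    d = math.sqrt(sum(x * x for x in r))
    return [-G * M * x / d**3 for x in r]

def minus_gradient(r, M, h=10.0):
    """Central-difference estimate of -grad(phi); h is the step in metres."""
    out = []
    for i in range(3):
        up, dn = list(r), list(r)
        up[i] += h
        dn[i] -= h
        out.append(-(potential(up, M) - potential(dn, M)) / (2.0 * h))
    return out

# Illustrative values (assumed): M = 5.97e24 kg at the origin, field point a few thousand km away
r = [7.0e6, 1.0e6, 2.0e6]
print(acceleration(r, 5.97e24))
print(minus_gradient(r, 5.97e24))
```

The two printed vectors should agree closely, and both point back towards the mass.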

Field due to a mass distribution

Suppose that there are n masses. Mass k has position $\mathbf{r}_k$ and mass $M_k$. For a particle at position r the acceleration due to the other masses is

\[ \mathbf{f}(\mathbf{r}) = -G\sum_{k=1}^{n} M_k\,\frac{\mathbf{r}-\mathbf{r}_k}{|\mathbf{r}-\mathbf{r}_k|^3} = -\nabla\varphi \tag{820} \]

where

\[ \varphi(\mathbf{r}) = -G\sum_{k=1}^{n}\frac{M_k}{|\mathbf{r}-\mathbf{r}_k|} \tag{821} \]

Replacing the discrete masses $M_k$ by a mass density ρ gives

\[ \mathbf{f}(\mathbf{r}) = -G\int \rho(\mathbf{r}')\,\frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|^3}\,d\mathbf{r}' = -\nabla\varphi \tag{822} \]

where

\[ \varphi(\mathbf{r}) = -G\int \frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}' \tag{823} \]

Gauss’s law gives

\[ \oint_S \mathbf{f}\cdot d\mathbf{S} = -4\pi G\int_V \rho(\mathbf{r}')\,d\mathbf{r}' \tag{824} \]

where dS is an outwardly pointing surface element of the volume V. In differential form

\[ \nabla\cdot\mathbf{f} = -4\pi G\rho \tag{825} \]

Therefore the gravitational potential ϕ satisfies Poisson’s equation

\[ \nabla^2\varphi = 4\pi G\rho \tag{826} \]

When ρ is zero at a point, then we obtain Laplace’s equation

\[ \nabla^2\varphi = 0 \tag{827} \]

26.2 Vacuum field equations of General Relativity

• Space-time is taken to be a Riemannian space of dimension 4.
• A particle moving in this space will follow a geodesic path.
• At any point in space-time it is possible to find a coordinate transformation such that the metric reduces to pseudo-Euclidean form. This is a local Lorentz frame, valid in a small region near the point.

The curvature of the space is determined by the metric tensor $g_{\mu\nu}$. In order to determine the metric tensor we require field equations. The vacuum field equations are given by

\[ R_{\mu\nu} = 0 \tag{828} \]

where $R_{\mu\nu}$ is the Ricci tensor. If the space is flat, i.e. $g_{\mu\nu}$ is a constant throughout the space, then this equation is satisfied. Thus the pseudo-Euclidean metric of Special Relativity

\[ (ds)^2 = -c^2(dt)^2 + (dx)^2 + (dy)^2 + (dz)^2 \tag{829} \]

is a solution of the equations as it must be.

• Since $R_{\mu\nu}$ is symmetric there are 10 equations.
• These are 2nd order, non-linear differential equations for the components of the metric tensor.
• It is essential that the equations are non-linear because the gravitational field itself has energy (mass) and can gravitate. This can only be described by non-linear equations.
• The 10 equations with appropriate boundary conditions determine the ten components of the metric tensor. This would appear to exclude the possibility of making arbitrary coordinate transformations. However this is still allowed because the equations satisfy 4 differential relationships. Since $R^\mu{}_\mu = 0$ it follows that the Einstein tensor is zero

\[ G_{\mu\nu} = 0 \tag{830} \]

and we know that

\[ \nabla_\mu G^{\mu\nu} = 0 \tag{831} \]

The 10 independent components of the metric tensor are analogues of the Newtonian gravitational potential ϕ. This indicates the degree of complexity involved in General Relativity. Many results (e.g. gravity waves) can be obtained by considering a linearised theory which in some respects is similar to Maxwell’s theory of electromagnetism.

26.3 Newtonian limit

Newtonian gravitation is very successful at predicting planetary motion in the solar system. The solar field is weak and we expect the vacuum field equations to reduce to Laplace’s equation in a weak field limit.

\[ \nabla^2\varphi = 0 \quad\longrightarrow\quad R_{\mu\nu} = 0 \tag{832} \]

The weak field metric is a perturbation of the inertial frame metric. It will take the form

\[ (ds)^2 = -(1+\Phi)\,c^2(dt)^2 + (1-\Phi)\left[(dx)^2 + (dy)^2 + (dz)^2\right] \tag{833} \]

where Φ = Φ(x, y, z) and does not depend on time. We assume that Φ ≪ 1. Taking coordinates (ct, x, y, z) the metric tensor becomes

\[ g_{\mu\nu} = \begin{pmatrix} -(1+\Phi) & 0 & 0 & 0 \\ 0 & (1-\Phi) & 0 & 0 \\ 0 & 0 & (1-\Phi) & 0 \\ 0 & 0 & 0 & (1-\Phi) \end{pmatrix} \tag{834} \]

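Geodesic equations

The non-zero Christoffel symbols are

\[ \begin{Bmatrix} 0 \\ 0\ 1 \end{Bmatrix} = \frac{\partial_x\Phi}{2(1+\Phi)} \qquad \begin{Bmatrix} 0 \\ 0\ 2 \end{Bmatrix} = \frac{\partial_y\Phi}{2(1+\Phi)} \qquad \begin{Bmatrix} 0 \\ 0\ 3 \end{Bmatrix} = \frac{\partial_z\Phi}{2(1+\Phi)} \tag{835} \]

\[ \begin{Bmatrix} 1 \\ 0\ 0 \end{Bmatrix} = \begin{Bmatrix} 1 \\ 2\ 2 \end{Bmatrix} = \begin{Bmatrix} 1 \\ 3\ 3 \end{Bmatrix} = \frac{\partial_x\Phi}{2(1-\Phi)} \tag{836} \]

\[ \begin{Bmatrix} 1 \\ 1\ 1 \end{Bmatrix} = -\frac{\partial_x\Phi}{2(1-\Phi)} \qquad \begin{Bmatrix} 1 \\ 1\ 2 \end{Bmatrix} = -\frac{\partial_y\Phi}{2(1-\Phi)} \qquad \begin{Bmatrix} 1 \\ 1\ 3 \end{Bmatrix} = -\frac{\partial_z\Phi}{2(1-\Phi)} \tag{837} \]

where $\partial_x\Phi = \partial\Phi/\partial x$, etc. The symbols with upper index 2 and 3 can be obtained by analogy with those with upper index 1. The geodesic equations are

\[ \ddot{x}^\mu + \begin{Bmatrix} \mu \\ \alpha\ \beta \end{Bmatrix}\dot{x}^\alpha\dot{x}^\beta = 0 \tag{838} \]

where $\dot{x}^\mu = dx^\mu/d\lambda$.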
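The Christoffel symbols of eqs.(835) to (837) can be cross-checked numerically from the general definition in terms of metric derivatives, specialised to a diagonal metric. The Python sketch below is a finite-difference check under stated assumptions: the potential `phi` is a hypothetical smooth function chosen only for testing, and the function names are not from the notes.

```python
import math

def phi(x, y, z):
    """Hypothetical small dimensionless potential, used only for illustration."""
    return -1.0e-3 / math.sqrt(x * x + y * y + z * z)

def metric(p):
    """Diagonal entries of the weak-field metric of eq. (834) at p = (ct, x, y, z)."""
    f = phi(p[1], p[2], p[3])
    return [-(1.0 + f), 1.0 - f, 1.0 - f, 1.0 - f]

def d_metric(p, k, h=1.0e-5):
    """Central-difference derivative of the diagonal metric entries along coordinate k."""
    up, dn = list(p), list(p)
    up[k] += h
    dn[k] -= h
    return [(a - b) / (2.0 * h) for a, b in zip(metric(up), metric(dn))]

def christoffel(p, mu, a, b):
    """Symbol with upper index mu and lower indices a, b for a diagonal metric:
    (1 / (2 g_mumu)) * (d_a g_{mu b} + d_b g_{mu a} - d_mu g_{ab})."""
    g = metric(p)
    dg = [d_metric(p, k) for k in range(4)]   # dg[k][n] = d_k g_nn
    term = 0.0
    if mu == b:
        term += dg[a][mu]
    if mu == a:
        term += dg[b][mu]
    if a == b:
        term -= dg[mu][a]
    return 0.5 * term / g[mu]

p = [0.0, 1.0, 2.0, 0.5]
f = phi(1.0, 2.0, 0.5)
dfx = 1.0e-3 * 1.0 / (1.0 + 4.0 + 0.25) ** 1.5   # analytic d(phi)/dx at p
print(christoffel(p, 1, 0, 0), dfx / (2.0 * (1.0 - f)))   # compare with eq. (836)
print(christoffel(p, 0, 0, 1), dfx / (2.0 * (1.0 + f)))   # compare with eq. (835)
```

Each printed pair should agree to within the finite-difference error, and symbols not listed in eqs.(835) to (837), such as the one with upper index 1 and lower indices 2, 3, evaluate to zero.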

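Therefore for µ = 1 we obtain

\[ \begin{aligned} \frac{d^2x}{d\lambda^2} = {} & -\frac{\partial_x\Phi}{2(1-\Phi)}\left[\left(\frac{d(ct)}{d\lambda}\right)^2 + \left(\frac{dy}{d\lambda}\right)^2 + \left(\frac{dz}{d\lambda}\right)^2\right] + \frac{\partial_x\Phi}{2(1-\Phi)}\left(\frac{dx}{d\lambda}\right)^2 \\ & + \frac{\partial_y\Phi}{(1-\Phi)}\,\frac{dx}{d\lambda}\frac{dy}{d\lambda} + \frac{\partial_z\Phi}{(1-\Phi)}\,\frac{dx}{d\lambda}\frac{dz}{d\lambda} \end{aligned} \tag{839} \]

We will parameterise the motion using the proper time τ and assume that the particle is moving slowly so that

\[ d\lambda = d\tau \approx dt \quad\text{since}\quad \gamma_v \longrightarrow 1 \tag{840} \]

The dominant term on the rhs contains c² and taking Φ ≪ 1 gives

\[ \frac{d^2x}{dt^2} = -\frac{c^2}{2}\,\frac{\partial\Phi}{\partial x} \tag{841} \]

Similar equations are obtained for y and z. Compare this with Newtonian gravitation

\[ \frac{d^2x}{dt^2} = -\frac{\partial\varphi}{\partial x} \tag{842} \]

and we obtain

\[ \Phi = \frac{2\varphi}{c^2} \tag{843} \]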

With this choice for Φ we can reproduce the Newtonian theory.

Field equations

The Ricci tensor has non-zero components

\[ R_{00} = -\frac12\,\frac{1}{(1-\Phi)(1-\Phi^2)}\left[(1-\Phi^2)\,\nabla^2\Phi - \nabla\Phi\cdot\nabla\Phi\right] \tag{844} \]

\[ R_{11} = -\frac12\,\frac{1}{(1-\Phi^2)^2}\Big[(1+\Phi)(1-\Phi^2)\,\nabla^2\Phi + (1+\Phi)\,(\nabla\Phi\cdot\nabla\Phi) + 2\Phi(1-\Phi^2)\,\partial_{xx}\Phi + (1+2\Phi+3\Phi^2)\,(\partial_x\Phi)^2\Big] \tag{845} \]

\[ R_{12} = -\frac12\,\frac{1}{(1-\Phi^2)^2}\left[2\Phi(1-\Phi^2)\,\partial_{xy}\Phi + (1+2\Phi+3\Phi^2)\,(\partial_x\Phi\,\partial_y\Phi)\right] \tag{846} \]

where $\partial_{xx}\Phi = \partial^2\Phi/\partial x^2$, etc. The other space-space components can be found by analogy. Taking Φ (weak field) and its partial derivatives (slowly moving particles) to be small compared to one, we see that to first order in Φ

\[ R_{00} \approx -\tfrac12\nabla^2\Phi \quad\text{and}\quad R_{11} \approx -\tfrac12\nabla^2\Phi \tag{847} \]

with the off-diagonal components zero. The field equations $R_{\mu\nu} = 0$ in the weak field limit for slowly moving particles give Laplace’s equation

\[ \nabla^2\Phi = 0 \tag{848} \]

We have assumed a weak field and slow moving particles. This is appropriate for planetary motion in the solar system since

\[ |\Phi| < 5\times10^{-6} \tag{849} \]

throughout the external field of the sun and

\[ \left(\frac{v}{c}\right)^2 < 3\times10^{-8} \tag{850} \]

for planetary motion.
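The bounds of eqs.(849) and (850) can be reproduced with rounded astronomical values. In the Python sketch below the solar mass, solar radius and the mean orbital speed of Mercury are assumed reference figures that do not appear in the notes, and the function names are illustrative.

```python
G = 6.67e-11        # gravitational constant in m^3 kg^-1 s^-2, as quoted in the text
C = 2.998e8         # speed of light in m/s (rounded)
M_SUN = 1.989e30    # solar mass in kg (assumed rounded value)
R_SUN = 6.96e8      # solar radius in m (assumed rounded value)
V_MERCURY = 4.74e4  # mean orbital speed of Mercury in m/s (assumed rounded value)

def weak_field_phi(mass, r):
    """Magnitude of Phi = 2 phi / c^2 from eq. (843), with phi = -G M / r."""
    return 2.0 * G * mass / (r * C**2)

def speed_ratio_squared(v):
    """(v / c)^2 for a body moving at speed v."""
    return (v / C) ** 2

print(weak_field_phi(M_SUN, R_SUN))      # largest at the solar surface
print(speed_ratio_squared(V_MERCURY))    # Mercury is the fastest planet
```

Since |Φ| falls off as 1/r outside the sun, its surface value bounds the whole external field.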

26.4 Gravitational red-shift

We can use the weak field metric

\[ (ds)^2 = -(1+\Phi)\,c^2(dt)^2 + (1-\Phi)\left[(dx)^2 + (dy)^2 + (dz)^2\right] \tag{851} \]

to look at the gravitational red-shift.

In the field the ‘proper time’ interval is given by

\[ (d\tau)^2 = -\frac{(ds)^2}{c^2} \quad\text{with}\quad dx = dy = dz = 0 \tag{852} \]

so that

\[ d\tau = \sqrt{1+\Phi}\;dt = \sqrt{1 + \frac{2\varphi}{c^2}}\;dt \tag{853} \]

At position A

\[ d\tau_A = \sqrt{1 + \frac{2\varphi_A}{c^2}}\;dt \tag{854} \]

and at B

\[ d\tau_B = \sqrt{1 + \frac{2\varphi_B}{c^2}}\;dt \tag{855} \]

The coordinate time interval dt is the same throughout space. Suppose n waves of frequency $\nu_A$ are emitted at A in the proper time $\Delta\tau_A$, i.e.,

\[ n = \nu_A\,\Delta\tau_A \tag{856} \]

The frequency observed at B is $\nu_B$ so that

\[ n = \nu_A\,\Delta\tau_A = \nu_B\,\Delta\tau_B \tag{857} \]

Therefore

\[ \nu_B = \nu_A\,\frac{\Delta\tau_A}{\Delta\tau_B} = \nu_A\sqrt{\frac{1 + \frac{2\varphi_A}{c^2}}{1 + \frac{2\varphi_B}{c^2}}} \simeq \nu_A\left(1 + \frac{\varphi_A - \varphi_B}{c^2}\right) \tag{858} \]

This can be written as

\[ \frac{\Delta\nu}{\nu_A} = -\frac{\Delta\varphi}{c^2} \tag{859} \]

with $\Delta\nu = \nu_B - \nu_A$ and $\Delta\varphi = \varphi_B - \varphi_A$. In terms of the wavelength λ where c = λν

\[ \frac{\Delta\lambda}{\lambda_A} = \frac{\Delta\varphi}{c^2} \tag{860} \]

with $\Delta\lambda = \lambda_B - \lambda_A$. Remembering that ϕ = −GM/r, the potential varies from zero far from the object (weak field) to a large negative value near the object (strong field). If the source A is in a stronger field than the observer B then we have ∆ϕ > 0 giving an increase in the observed wavelength, ∆λ > 0.

26.5 Examples of gravitational red-shift

Three examples are:

• If A is the sun and B is the earth, then light emitted from the sun will increase in wavelength, i.e., be red-shifted. The wavelength of visible light is measured in units of Ångstroms (1 Å = 10⁻¹⁰ m) and the spectrum is

red-light 7600 Å ... blue-light 4000 Å

An increase in wavelength moves the visible light towards the red end of the spectrum. This has been observed for the D1 lines of sodium but it is a small effect.

\[ \frac{|\Delta\varphi|}{c^2} \simeq 2.1\times10^{-6} \tag{861} \]

The accuracy is limited because you must separate out the effects of the conventional Doppler effect due to the motion of the source.

• Pound and Rebka (1960) and later Pound and Snider (1965) used the Mössbauer effect to measure the shift of the γ-ray line from ⁵⁷Fe in the Earth’s gravitational field. The γ-rays travelled from the basement to the roof of the Jefferson Physical Laboratory at Harvard, a height of about 20 m. An accuracy of 1% was achieved.

• If we regard vibrating (light-emitting) atoms as clocks, then the rate at which clocks tick depends on their position in a gravitational field. The Global Positioning System (GPS) relies on synchronisation between the clocks on a satellite and at the surface of the Earth. GPS is discussed in Appendix G.
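The sizes of the first two examples can be estimated from eq.(860). The Python sketch below uses rounded values for the solar mass, the solar radius, the Earth-sun distance and the surface gravity of the Earth; these figures and the function names are assumptions, while the 20 m height is the approximate value quoted above.

```python
G = 6.67e-11      # gravitational constant in m^3 kg^-1 s^-2, as quoted in the text
C = 2.998e8       # speed of light in m/s (rounded)
M_SUN = 1.989e30  # solar mass in kg (assumed rounded value)
R_SUN = 6.96e8    # solar radius in m (assumed rounded value)
AU = 1.496e11     # mean Earth-sun distance in m (assumed rounded value)

def solar_redshift():
    """Delta(lambda)/lambda = Delta(phi)/c^2 for light going from the solar surface to the Earth's orbit."""
    delta_phi = G * M_SUN * (1.0 / R_SUN - 1.0 / AU)   # phi_B - phi_A with phi = -G M / r
    return delta_phi / C**2

def tower_shift(height, g=9.81):
    """Fractional shift g z / c^2 over a vertical height z near the Earth's surface."""
    return g * height / C**2

print(solar_redshift())    # of order 2e-6
print(tower_shift(20.0))   # of order 2e-15
```

The tower value shows why a line as sharp as the Mössbauer γ-ray line was needed: the fractional shift over the height of a building is roughly nine orders of magnitude smaller than the solar red-shift.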

26.6 Gravitational red-shift using Einstein Equivalence Principle

The gravitational red-shift can be shown to follow from the Einstein Equivalence Principle. Consider two boxes separated by a vertical distance z and both subject to a uniform upward acceleration f. The lower box emits a photon of wavelength $\lambda_o$. After a time ∆t = z/c the photon is detected by the upper box. However in this time there is an additional speed given by ∆v = f ∆t = f z/c. From the viewpoint of the observer, the source of the photon appears to be moving away with this speed. Therefore the conventional Doppler effect predicts a shift in wavelength $\Delta\lambda = \lambda - \lambda_o$ given by

\[ \frac{\Delta\lambda}{\lambda_o} = \frac{\Delta v}{c} = \frac{f z}{c^2} \tag{862} \]

By the Einstein Equivalence Principle we cannot distinguish this scenario from the case of a constant downward gravitational field. Then f is the acceleration due to gravity. The change in potential is ∆ϕ = f z giving

\[ \frac{\Delta\lambda}{\lambda_o} = \frac{\Delta\varphi}{c^2} \tag{863} \]

as before.

26.7 Energy-momentum tensor

• Consider a field of non-interacting matter. No forces act. Then

\[ T^{\mu\nu} = \rho\,v^\mu v^\nu \tag{864} \]

is the energy-momentum tensor of the field. It is symmetric. Here ρ is the scalar proper density i.e. the density which an observer moving with the field would observe and $v^\mu = dx^\mu/d\tau$ is the 4-velocity field. Using Minkowski coordinates $x^\mu = (ct, \mathbf{r})$ we have $v^\mu = \gamma_v\,(c, \mathbf{v})$ where $\gamma_v$ is the Lorentz factor. Then we have

\[ T^{\mu\nu} = \gamma_v^2\,\rho\,c^2 \begin{pmatrix} 1 & \frac{v_x}{c} & \frac{v_y}{c} & \frac{v_z}{c} \\ \frac{v_x}{c} & \frac{v_x^2}{c^2} & \frac{v_x v_y}{c^2} & \frac{v_x v_z}{c^2} \\ \frac{v_y}{c} & \frac{v_x v_y}{c^2} & \frac{v_y^2}{c^2} & \frac{v_y v_z}{c^2} \\ \frac{v_z}{c} & \frac{v_x v_z}{c^2} & \frac{v_y v_z}{c^2} & \frac{v_z^2}{c^2} \end{pmatrix} \tag{865} \]

Thus $T^{00} = \gamma_v^2\,\rho\,c^2$. The mass of moving matter increases by a factor $\gamma_v$ and the volume element is decreased, due to Lorentz contraction, by a factor $\gamma_v$. Therefore $\gamma_v^2\rho$ is the density observed by a fixed observer. Thus we see that $T^{00}$ is the relativistic energy density. In the limit of small velocities ($\gamma_v \approx 1$) the dominant term is $T^{00} \approx \rho c^2$. We find that

\[ \partial_\nu T^{0\nu} = c\left[\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v})\right] = 0 \tag{866} \]

This is the continuity equation from hydrodynamics. It expresses conservation of matter (i.e. energy). Also

\[ \partial_\nu T^{1\nu} = \rho\left[\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla\right]v_x = 0 \tag{867} \]

which is the equation for force-free hydrodynamic flow. It expresses conservation of momentum.

• If the matter field has an internal pressure then

\[ T^{\mu\nu} = \left(\rho + \frac{p}{c^2}\right)v^\mu v^\nu + p\,g^{\mu\nu} \tag{868} \]

where p is the scalar pressure field. A perfect fluid is characterised by two scalar fields, the density ρ and the pressure p. For slowly moving fluids and small pressures the equations

\[ \partial_\nu T^{\mu\nu} = 0 \tag{869} \]

reduce to the continuity equation

\[ \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{870} \]

and Euler’s equation of motion:

\[ \rho\left[\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla\right]\mathbf{v} = -\nabla p \tag{871} \]

Thus we have the relations

\[ \partial_\nu T^{\mu\nu} = 0 \tag{872} \]

which express conservation of mass and momentum. On covariance grounds we generalise to

\[ \nabla_\nu T^{\mu\nu} = 0 \tag{873} \]

It appears that physical quantities interact in such a way that the divergence of $T^{\mu\nu}$ is zero i.e. total energy and momentum are conserved.
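The structure of eq.(865) can be illustrated by building the tensor of eq.(864) for a uniform stream of non-interacting matter. In the Python sketch below the density, the velocity and the function name are assumed values chosen only to show the pattern of the components.

```python
import math

C = 2.998e8  # speed of light in m/s (rounded)

def dust_tensor(rho, v):
    """Return (T, gamma_v) with T^{mu nu} = rho v^mu v^nu and v^mu = gamma_v (c, v), eq. (864)."""
    speed2 = sum(x * x for x in v)
    gamma = 1.0 / math.sqrt(1.0 - speed2 / C**2)
    four_velocity = [gamma * C] + [gamma * x for x in v]
    T = [[rho * a * b for b in four_velocity] for a in four_velocity]
    return T, gamma

# Illustrative values (assumed): proper density 1000 kg/m^3 moving at 0.6 c along x
T, gamma = dust_tensor(1.0e3, (0.6 * C, 0.0, 0.0))
print(gamma)                 # Lorentz factor, 1.25 for v = 0.6 c
print(T[0][0])               # energy density gamma_v^2 rho c^2
print(T[0][1] / T[0][0])     # ratio v_x / c, as in the first row of eq. (865)
```

Dividing every component by T[0][0] reproduces the dimensionless matrix in eq.(865).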

26.8 Field equations of General Relativity

We are looking for an equation such that

\[ \begin{pmatrix} \text{tensor representing} \\ \text{geometry of space} \end{pmatrix} = \begin{pmatrix} \text{tensor representing} \\ \text{energy content} \end{pmatrix} \tag{874} \]

It must

1. reduce to the free-space (vacuum) field equations when the energy content is zero
2. reduce to Poisson’s equation for a weak field and low velocities i.e.

\[ \nabla^2\varphi = 4\pi\rho\,G \tag{875} \]

In forming the field equations the tensor which represents the geometry of space-time must have zero divergence. A tensor which satisfies this is the Einstein tensor. The field equations become

\[ G^{\mu\nu} = -\kappa\,T^{\mu\nu} \tag{876} \]

where κ is a constant. In terms of the Ricci tensor we have

\[ R^{\mu\nu} - \tfrac12\,g^{\mu\nu}R^\alpha{}_\alpha = -\kappa\,T^{\mu\nu} \tag{877} \]

26.9 Alternative forms for the field equations

This can also be written as

\[ R^\mu{}_\nu - \tfrac12\,\delta^\mu_\nu R^\alpha{}_\alpha = -\kappa\,T^\mu{}_\nu \tag{878} \]

and

\[ R_{\mu\nu} - \tfrac12\,g_{\mu\nu}R^\alpha{}_\alpha = -\kappa\,T_{\mu\nu} \tag{879} \]

Contracting the mixed version of the field equation eq.(878) gives

\[ R^\alpha{}_\alpha = \kappa\,T^\alpha{}_\alpha \tag{880} \]

so that the field equations eq.(877) can be rewritten as

\[ R^{\mu\nu} = -\kappa\left(T^{\mu\nu} - \tfrac12\,g^{\mu\nu}T^\alpha{}_\alpha\right) \tag{881} \]

Clearly this reduces to $R^{\mu\nu} = 0$ when $T^{\mu\nu} = 0$.

26.10 Identification of the constant κ

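κ is identified by reducing the field equation to Poisson’s equation in the limit of a weak field and low velocities. In the weak field limit the dominant equation has µ = ν = 0

\[ \begin{aligned}
R^{00} &= -\kappa \left( T^{00} - \tfrac{1}{2}\, g^{00} T^{\alpha}{}_{\alpha} \right) \\
&\approx -\kappa \left( T^{00} - \tfrac{1}{2}\, g^{00} g_{00} T^{00} \right) \\
&\approx -\tfrac{1}{2}\, \kappa\, T^{00} \\
&\approx -\tfrac{1}{2}\, \kappa\, \rho\, c^2
\end{aligned} \tag{882} \]

Since $R^{00} \approx -\frac{1}{c^2} \nabla^2 \varphi$ we have

\[ \nabla^2 \varphi = \tfrac{1}{2}\, \kappa\, \rho\, c^4 \tag{883} \]

If this is to agree with Poisson’s equation $\nabla^2 \varphi = 4\pi G\rho$ then we must have

\[ \kappa = \frac{8\pi G}{c^4} \tag{884} \]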
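As a quick arithmetic check of eqs.(883) and (884), the sketch below evaluates κ in SI units and confirms that ½κρc⁴ reproduces the Newtonian source term 4πGρ. The numerical values of G and c and the sample density are assumptions made for illustration; they are not taken from these notes.

```python
import math

G = 6.674e-11   # gravitational constant in m^3 kg^-1 s^-2 (assumed SI value)
C = 2.998e8     # speed of light in m s^-1 (assumed SI value)


def kappa(g=G, c=C):
    """Coupling constant of eq.(884): kappa = 8 pi G / c^4."""
    return 8.0 * math.pi * g / c**4


def source_from_field_equation(rho, g=G, c=C):
    """Right-hand side of eq.(883): (1/2) kappa rho c^4."""
    return 0.5 * kappa(g, c) * rho * c**4


def source_from_poisson(rho, g=G):
    """Right-hand side of Poisson's equation: 4 pi G rho."""
    return 4.0 * math.pi * g * rho


if __name__ == "__main__":
    rho = 1000.0  # sample density in kg m^-3, chosen only for illustration
    print("kappa =", kappa())
    print(source_from_field_equation(rho), source_from_poisson(rho))
```

The two printed source terms agree to rounding error, which is exactly the condition used to fix κ.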

27

Black-holes

The black holes of nature are the most perfect macroscopic objects there are in the universe: the only elements in their construction are our concepts of space and time. And since the general theory of relativity provides only a single unique family of solutions for their descriptions, they are the simplest objects as well. The unique two-parameter family of solutions which describes the space-time around black holes is the Kerr family discovered by Roy Patrick Kerr in July, 1963. The two parameters are the mass of the black hole and the angular momentum of the black hole. The static solution, with zero angular momentum, was discovered by Karl Schwarzschild in December, 1915. A study of the black holes of nature is then a study of these solutions. It is to this study that this book is devoted.

from ‘The Mathematical Theory of Black Holes’ by S Chandrasekhar, Oxford University Press, 1982.

27.1

Schwarzschild solution

The most important solution to the vacuum field equations

\[ R_{\mu\nu} = 0 \tag{885} \]

was obtained by Schwarzschild³⁷ in 1916. He considered the case of a static, spherically symmetric field. Static means independent of time with motions reversible in time. Birkhoff (1923) showed that it was unnecessary to assume the field is static. The only criteria are spherical symmetry and that the field vanishes at infinity, i.e. it is asymptotically flat. Thus the metric also describes the space-time surrounding a pulsating spherical mass. The Schwarzschild metric is given by

\[ \begin{aligned}
(ds)^2 &= -(1+\Phi)\, c^2 (dt)^2 + \frac{(dr)^2}{1+\Phi} + r^2 (d\theta)^2 + r^2 \sin^2\theta\, (d\phi)^2 \\
&= -(1+\Phi)\, c^2 (dt)^2 + \frac{(dr)^2}{1+\Phi} + r^2 (d\Omega)^2
\end{aligned} \tag{886} \]

with

\[ \Phi = -\frac{2m}{r} \quad\text{and}\quad m = \frac{GM}{c^2} \tag{887} \]
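To get a feel for the size of the constant m in eq.(887), the following sketch evaluates the length 2m = 2GM/c² for the Sun and the Earth. The values of GM for the two bodies and of c are assumed standard values, not quantities quoted in these notes.

```python
C = 2.99792458e8        # speed of light in m s^-1
GM_SUN = 1.32712e20     # G*M for the Sun in m^3 s^-2 (assumed value)
GM_EARTH = 3.98600e14   # G*M for the Earth in m^3 s^-2 (assumed value)


def schwarzschild_length(gm, c=C):
    """Return 2m = 2GM/c^2 in metres, with m defined as in eq.(887)."""
    return 2.0 * gm / c**2


if __name__ == "__main__":
    print("Sun:   2m = %.2f km" % (schwarzschild_length(GM_SUN) / 1.0e3))
    print("Earth: 2m = %.2f cm" % (schwarzschild_length(GM_EARTH) * 1.0e2))
```

Both lengths are tiny compared with the radii of the bodies themselves, which is why Φ is so small throughout the solar system.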

1. The space part of the metric is expressed in spherical polar coordinates (r, θ, φ) which reflect the spherical symmetry of the problem. The function Φ depends only on r. The angular part of the space-metric is unchanged from the Euclidean form, only the radial part has been modified.

2. The weak-field metric is obtained by taking Φ ≈ 0 in the second term on the rhs.

3. The constant m has the dimension of distance.

4. As r → ∞ then Φ → 0 and the metric reduces to the inertial frame metric, i.e. the solution is asymptotically flat.

\[ (ds)^2 \longrightarrow -c^2 (dt)^2 + (dr)^2 + r^2 (d\theta)^2 + r^2 \sin^2\theta\, (d\phi)^2 \tag{888} \]

5. The radius r = 2m is known as the Schwarzschild radius or event horizon. At this radius Φ = −1 and there appears to be a singularity in the metric. However this is due to the choice of coordinates. It is possible to choose coordinates in which there is no singularity at this radius. At this radius the gravitational field is very strong. The coordinates in eq.(886) are appropriate when the gravitational field is weaker.

6. For most ordinary bodies the Schwarzschild radius lies well inside the surface of the body. The Schwarzschild metric is a solution only to the vacuum field equations. The metric does not apply within the body. In the following table $r_o$ is the radius of the body. A black-hole corresponds to the case $r_o < 2m$. For a neutron star the value of 2m is getting close to $r_o$ and we can easily conceive of bodies with a higher density than a neutron star.

                  2m                          2m/r_o
proton            $2.47 \times 10^{-52}$ cm   $1.75 \times 10^{-39}$
earth             0.89 cm                     $1.4 \times 10^{-9}$
sun               2.95 km                     $4.2 \times 10^{-6}$
neutron star      4.5 km                      0.57
galaxy            $10^{12}$ km                $10^{-6}$
universe                                      ≈ 1

³⁷ See Appendix B ‘Odds and ends’ note 21.

27.2

Time-like geodesics

The Euler-Lagrange equations are

\[ \frac{d}{d\lambda} \left( \frac{\partial L}{\partial \dot{x}^\mu} \right) - \frac{\partial L}{\partial x^\mu} = 0 \tag{889} \]

where $L = L(\dot{x}^\mu, x^\mu, \lambda)$ is the Lagrangian, λ is a parameter describing the curve $x^\mu(\lambda)$ and $\dot{x}^\mu = dx^\mu/d\lambda$. For a time-like geodesic, such as is followed by a particle in General Relativity, the Lagrangian is

\[ L = -\left( \frac{ds}{d\lambda} \right)^2 \tag{890} \]

Along the time-like geodesic we must have $(ds)^2 < 0$.

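For the particular case of the Schwarzschild metric

\[ L = (1+\Phi)\, c^2\, \dot{t}^2 - \frac{\dot{r}^2}{1+\Phi} - r^2 \dot{\theta}^2 - r^2 \sin^2\theta\, \dot{\phi}^2 \tag{891} \]

where Φ = −2m/r.

The equation for t gives:

\[ \frac{d}{d\lambda} \left( 2(1+\Phi)\, c^2\, \dot{t} \right) = 0 \quad\longrightarrow\quad (1+\Phi)\, \dot{t} = a \quad\text{where } a \text{ is a constant} \tag{892} \]

The equation for φ gives:

\[ \frac{d}{d\lambda} \left( -2 r^2 \sin^2\theta\, \dot{\phi} \right) = 0 \quad\longrightarrow\quad r^2 \sin^2\theta\, \dot{\phi} = b \quad\text{where } b \text{ is a constant} \tag{893} \]

The equation for θ gives:

\[ \frac{d}{d\lambda} \left( -2 r^2 \dot{\theta} \right) = -2 r^2 \sin\theta \cos\theta\, \dot{\phi}^2 \tag{894} \]

This becomes

\[ \ddot{\theta} + \frac{2}{r}\, \dot{r}\, \dot{\theta} - \sin\theta \cos\theta\, \dot{\phi}^2 = 0 \tag{895} \]

If we choose initial conditions $\theta = \frac{\pi}{2}$ and $\dot{\theta} = 0$ then initially and always $\ddot{\theta} = 0$. The particle will continue to move in the plane $\theta = \frac{\pi}{2}$. We make this choice and it follows that

\[ \dot{t} = \frac{a}{1+\Phi} \quad\text{and}\quad \dot{\phi} = \frac{b}{r^2} \tag{896} \]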

Although we could determine the Euler-Lagrange equation for r it is easier to obtain a radial equation by considering the line-element directly. This gives a first order differential equation, in effect the first integral of the second order Euler-Lagrange equation. The metric is eq.(886)

\[ (ds)^2 = -(1+\Phi)\, c^2 (dt)^2 + \frac{(dr)^2}{1+\Phi} + r^2 (d\theta)^2 + r^2 \sin^2\theta\, (d\phi)^2 \tag{897} \]

For a time-like geodesic we can choose the proper time τ as the parameter describing the curve, i.e. let λ = τ.

\[ c^2 (d\tau)^2 = -(ds)^2 \tag{898} \]

\[ c^2 = (1+\Phi)\, c^2\, \dot{t}^2 - \frac{\dot{r}^2}{1+\Phi} - r^2 \left( \dot{\theta}^2 + \sin^2\theta\, \dot{\phi}^2 \right) \tag{899} \]

where now $\dot{t} = dt/d\tau$ etc. This is an equation that must be satisfied by the path of the particle. Choosing $\theta = \frac{\pi}{2}$ gives

\[ c^2 = (1+\Phi)\, c^2\, \dot{t}^2 - \frac{\dot{r}^2}{1+\Phi} - r^2 \dot{\phi}^2 \tag{900} \]

Using eq.(896) and rearranging gives

\[ \dot{r}^2 = c^2 a^2 - (1+\Phi) \left( c^2 + \frac{b^2}{r^2} \right) \tag{901} \]
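The rearrangement leading to eq.(901) can be checked numerically. The sketch below evaluates ṙ² from eq.(901), takes ṫ and φ̇ from eq.(896), and substitutes them back into the right-hand side of eq.(900), which should return c². Units with c = 1 and the sample values of r, a, b and m are assumptions chosen only for this illustration.

```python
def one_plus_phi(r, m):
    """Metric function 1 + Phi with Phi = -2m/r."""
    return 1.0 - 2.0 * m / r


def r_dot_squared(r, a, b, m, c=1.0):
    """Radial first integral, eq.(901)."""
    return c**2 * a**2 - one_plus_phi(r, m) * (c**2 + b**2 / r**2)


def rhs_of_eq_900(r, a, b, m, c=1.0):
    """Right-hand side of eq.(900), built from eqs.(896) and (901)."""
    f = one_plus_phi(r, m)
    t_dot = a / f            # eq.(896)
    phi_dot = b / r**2       # eq.(896) with theta = pi/2
    return f * c**2 * t_dot**2 - r_dot_squared(r, a, b, m, c) / f - r**2 * phi_dot**2


if __name__ == "__main__":
    # Illustrative values only: r = 10, a = 1, b = 4, m = 1 in units with c = 1.
    print(r_dot_squared(10.0, 1.0, 4.0, 1.0))
    print(rhs_of_eq_900(10.0, 1.0, 4.0, 1.0))
```

For any r > 2m and any constants a and b the second function returns c², because eq.(901) is an exact rearrangement of eq.(900).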

27.3

Orbit equation

We want to express eq.(901) in terms of u = 1/r as a function of φ. This allows easy comparison with the corresponding orbit equation from Newtonian theory. Consider

\[ \dot{r} = \frac{dr}{d\tau} = \frac{dr}{du} \frac{du}{d\tau} = \frac{dr}{du} \frac{du}{d\phi} \frac{d\phi}{d\tau} \tag{902} \]

Using

\[ \frac{dr}{du} = -\frac{1}{u^2} \quad\text{and}\quad \frac{du}{d\phi} = u' \quad\text{and}\quad \frac{d\phi}{d\tau} = b u^2 \tag{903} \]

gives

\[ \dot{r} = -b u' \tag{904} \]

Substituting into eq.(901) and using Φ = −2mu gives

\[ (u')^2 = \frac{c^2 a^2 - c^2}{b^2} + \frac{2mc^2}{b^2}\, u - u^2 + 2m u^3 \tag{905} \]

Now differentiate with respect to φ

\[ 2 u' u'' = \left( \frac{2mc^2}{b^2} - 2u + 6m u^2 \right) u' \tag{906} \]

The case u′ = 0 corresponds to a circular orbit. Excluding this case we can divide by u′ and obtain the orbit equation

\[ u'' + u = \frac{mc^2}{b^2} + 3m u^2 \tag{907} \]

The Newtonian orbit equation is

\[ u'' + u = \frac{GM}{h^2} \tag{908} \]

where $h = r^2\, d\phi/dt$ is the angular momentum per unit mass.

1. In the weak field limit, the particle moves slowly, $v \ll c$, and then dτ ≈ dt. Therefore

\[ b = r^2 \frac{d\phi}{d\tau} \approx r^2 \frac{d\phi}{dt} = h \tag{909} \]

and

\[ \frac{mc^2}{b^2} \approx \frac{GM}{h^2} \tag{910} \]

which agrees with the Newtonian orbit equation.

2. The extra term on the rhs of eq.(907) is $3mu^2$. Comparing this to the constant term $mc^2/b^2$ gives the ratio

\[ \frac{3mu^2}{mc^2/b^2} = \frac{3 b^2 u^2}{c^2} \approx \frac{3 h^2 u^2}{c^2} \tag{911} \]

This quantity is $\approx 3 \times 10^{-8}$ for the orbit of the earth. For weak fields, such as the solar field, the term $3mu^2$ is a small correction to the Newtonian orbit equation.
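The size of the ratio in eq.(911) is easy to reproduce. For a nearly circular orbit hu = r² (dφ/dt)/r is just the orbital speed v, so the ratio is about 3v²/c². The sketch below evaluates it for the Earth; the mean orbital speed used is an assumed standard value, not a number from these notes.

```python
C = 2.99792458e8              # speed of light in m s^-1
EARTH_ORBITAL_SPEED = 2.978e4  # mean orbital speed of the Earth in m s^-1 (assumed)


def relativistic_to_newtonian_ratio(orbital_speed, c=C):
    """Ratio of eq.(911), 3 h^2 u^2 / c^2, using h*u = v for a near-circular orbit."""
    return 3.0 * (orbital_speed / c) ** 2


if __name__ == "__main__":
    print(relativistic_to_newtonian_ratio(EARTH_ORBITAL_SPEED))
```

The result is of order 10⁻⁸, in line with the figure quoted for the Earth.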

27.4

Advance in the perihelion of planets

Although the term $3mu^2$ is a small correction in the solar system its effect has been observed. We will treat it as a small perturbation. Let $u_0$ be a solution of the Newtonian orbit equation

\[ u'' + u = \frac{GM}{b^2} \tag{912} \]

This is well-known to be

\[ u_0 = \frac{GM}{b^2} (1 + e \cos\phi) \tag{913} \]

i.e. a conic section with focus at r = 0 and eccentricity e. To solve the orbit equation

\[ u'' + u = \frac{GM}{b^2} + \frac{3GM}{c^2}\, u^2 \tag{914} \]

let $u = u_0 + u_1$. Since the extra term is small we expect $u_1$ to be small compared to $u_0$. Therefore

\[ u_0'' + u_0 + u_1'' + u_1 = \frac{GM}{b^2} + \frac{3GM}{c^2} (u_0 + u_1)^2 \tag{915} \]

Since $u_0$ is a solution of eq.(912) and $u_1 \ll u_0$

\[ \begin{aligned}
u_1'' + u_1 &= \frac{3GM}{c^2} (u_0 + u_1)^2 \\
&\approx \frac{3GM}{c^2}\, u_0^2
\end{aligned} \tag{916} \]

Therefore $u_1$ is a solution of

\[ u'' + u = \frac{3(GM)^3}{c^2 b^4} \left( 1 + 2e \cos\phi + e^2 \cos^2\phi \right) \tag{917} \]

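To complete the solution you need the following particular integrals where A is a constant

\[ u'' + u = \begin{cases}
A & \longrightarrow\quad u = A \\
A \cos\phi & \longrightarrow\quad u = \tfrac{1}{2} A \phi \sin\phi \\
A \cos^2\phi & \longrightarrow\quad u = \tfrac{1}{2} A \left( 1 - \tfrac{1}{3} \cos 2\phi \right)
\end{cases} \tag{918} \]

Therefore

\[ u_1 = \frac{3(GM)^3}{c^2 b^4} \left[ 1 + e \phi \sin\phi + \tfrac{1}{2} e^2 \left( 1 - \tfrac{1}{3} \cos 2\phi \right) \right] \tag{919} \]

Although all terms are small it is not possible to ignore the second one i.e.

\[ \frac{3(GM)^3}{c^2 b^4}\, e \phi \sin\phi \tag{920} \]

This has a continuously increasing effect, due to φ, that will be eventually noticed. Therefore we drop all terms in $u_1$ except this one and obtain

\[ \begin{aligned}
u &= \frac{GM}{b^2} (1 + e \cos\phi) + \frac{3(GM)^3}{c^2 b^4}\, e \phi \sin\phi \\
&= \frac{GM}{b^2} \left( 1 + e \cos\phi + \frac{3(GM)^2}{c^2 b^2}\, e \phi \sin\phi \right) \\
&\approx \frac{GM}{b^2} \left( 1 + e \cos\left[ \left( 1 - \frac{3(GM)^2}{c^2 b^2} \right) \phi \right] \right)
\end{aligned} \tag{921} \]

The last step uses $\cos(\phi - \epsilon\phi) \approx \cos\phi + \epsilon\phi \sin\phi$ for small $\epsilon\phi$, with $\epsilon = 3(GM)^2/(c^2 b^2)$. Thus u is a periodic function of φ with period

\[ \frac{2\pi}{1 - \dfrac{3(GM)^2}{c^2 b^2}} > 2\pi \tag{922} \]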

The effect of the term $3mu^2$ is to increase the period. The amount of this increase is

\[ \begin{aligned}
\Delta &= \frac{2\pi}{1 - \dfrac{3(GM)^2}{c^2 b^2}} - 2\pi \\
&\approx 6\pi\, \frac{(GM)^2}{c^2 b^2}
\end{aligned} \tag{923} \]

The orbit is an ellipse which precesses (rotates) about a focus by an amount ∆. We can relate ∆ to the eccentricity e and the length of the semi-major axis a of the ellipse. Consider

\[ u_0 = \frac{GM}{b^2} (1 + e \cos\phi) \tag{924} \]

The length of the major axis is 2a. The distance between the foci is 2ae. Then

\[ \frac{1}{r_1} = \frac{GM}{b^2} (1 + e) \quad\text{when}\quad \phi = 0 \ \text{(at perihelion)} \tag{925} \]

and

\[ \frac{1}{r_2} = \frac{GM}{b^2} (1 - e) \quad\text{when}\quad \phi = \pi \ \text{(at aphelion)} \tag{926} \]

Adding these equations gives

\[ \begin{aligned}
\frac{2GM}{b^2} &= \frac{1}{r_1} + \frac{1}{r_2} \\
&= \frac{1}{a(1-e)} + \frac{1}{a(1+e)} \\
&= \frac{2}{a(1-e^2)}
\end{aligned} \tag{927} \]

Therefore

\[ \Delta \approx \frac{6\pi GM}{c^2\, a\, (1 - e^2)} \tag{928} \]
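The perturbation result can be compared with a direct numerical integration of the orbit equation. Writing w = u b²/(GM) and ε = 3(GM)²/(c²b²), eq.(914) becomes w″ + w = 1 + εw², and eq.(923) predicts an advance of about 2πε per revolution. The sketch below integrates this equation with a classical fourth-order Runge-Kutta step from one perihelion to the next. The dimensionless form, the step size and the sample values of ε and e are choices made for this illustration rather than anything prescribed in these notes.

```python
import math


def perihelion_advance(eps, ecc, step=1.0e-3):
    """Angle between successive perihelia, minus 2*pi, for w'' + w = 1 + eps*w^2.

    w is the dimensionless inverse radius u*b^2/(GM); the orbit starts at
    perihelion with w = 1 + ecc and w' = 0.
    """
    def deriv(w, wp):
        # First-order system: (w, w')' = (w', 1 + eps*w^2 - w)
        return wp, 1.0 + eps * w * w - w

    phi, w, wp = 0.0, 1.0 + ecc, 0.0
    while True:
        k1w, k1p = deriv(w, wp)
        k2w, k2p = deriv(w + 0.5 * step * k1w, wp + 0.5 * step * k1p)
        k3w, k3p = deriv(w + 0.5 * step * k2w, wp + 0.5 * step * k2p)
        k4w, k4p = deriv(w + step * k3w, wp + step * k3p)
        w_new = w + step * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        wp_new = wp + step * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        # The next perihelion is the next maximum of w: w' changes from + to -.
        if wp > 0.0 and wp_new <= 0.0:
            fraction = wp / (wp - wp_new)   # linear interpolation of the zero of w'
            return phi + fraction * step - 2.0 * math.pi
        phi, w, wp = phi + step, w_new, wp_new


if __name__ == "__main__":
    eps, ecc = 1.0e-3, 0.2                  # illustrative values only
    print("numerical advance :", perihelion_advance(eps, ecc))
    print("first-order result:", 2.0 * math.pi * eps)
```

For small ε the two numbers agree closely; the remaining difference is of second order in ε, which the perturbation treatment neglects.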

The following table compares the precession of planetary objects calculated using this formula with the observed precessions. The data is taken from the text by Adler, Bazin & Schiffer. The units are seconds of arc per century which illustrates the smallness of the effects. Planetary objects have a much larger precession than is quoted here. Most of this precession can be explained by Newtonian perturbation theory, arising say from the effects of a nearby planetary object. The figures quoted are the unexplained precession, e.g. the observed precession for Mercury is 5600 arc seconds per century of which all but 43 arc seconds could be explained. Einstein’s General Relativity accounted for the outstanding amount.

planet               calculated shift    observed shift
Mercury              43.03               43.11 ± 0.45
Venus                8.6                 8.4 ± 4.8
Earth                3.8                 5.0 ± 1.2
Icarus (asteroid)    10.3                9.8 ± 0.8

Mercury lies closest to the sun and so experiences the strongest gravitational field. The effect is thus largest for it. The effect is also large for Icarus because its eccentricity is close to 1. The planets generally have small eccentricities, i.e. nearly circular orbits.
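The calculated shift for Mercury can be reproduced from eq.(928) in a few lines. The sketch converts the advance per orbit into seconds of arc per century. The orbital elements of Mercury, the value of GM for the Sun and the length of a century in days are assumed standard values, not data from the table above.

```python
import math

C = 2.99792458e8                              # speed of light in m s^-1
GM_SUN = 1.32712e20                           # G*M for the Sun in m^3 s^-2 (assumed)
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi
DAYS_PER_CENTURY = 36525.0


def advance_per_orbit(gm, semi_major_axis, ecc, c=C):
    """Perihelion advance per revolution in radians, eq.(928)."""
    return 6.0 * math.pi * gm / (c**2 * semi_major_axis * (1.0 - ecc**2))


def advance_per_century(gm, semi_major_axis, ecc, period_days):
    """Perihelion advance in seconds of arc per century."""
    orbits = DAYS_PER_CENTURY / period_days
    return advance_per_orbit(gm, semi_major_axis, ecc) * orbits * ARCSEC_PER_RADIAN


if __name__ == "__main__":
    # Assumed elements for Mercury: a = 5.791e10 m, e = 0.2056, period = 87.969 days.
    print(advance_per_century(GM_SUN, 5.791e10, 0.2056, 87.969))
```

With these inputs the result is close to 43 seconds of arc per century, matching the order of the tabulated value for Mercury.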

27.5

Null geodesics

The path of a light ray is a null geodesic so that $(ds)^2 = 0$. We cannot use τ as a parameter, so just leave the parameter as λ. Starting from the metric we have the radial equation

\[ 0 = (1+\Phi)\, c^2\, \dot{t}^2 - \frac{\dot{r}^2}{1+\Phi} - r^2 \left( \dot{\theta}^2 + \sin^2\theta\, \dot{\phi}^2 \right) \tag{929} \]

where now $\dot{t} = dt/d\lambda$ etc. Choosing $\theta = \frac{\pi}{2}$ gives

\[ 0 = (1+\Phi)\, c^2\, \dot{t}^2 - \frac{\dot{r}^2}{1+\Phi} - r^2 \dot{\phi}^2 \tag{930} \]

The analysis follows through in the same way except that the $c^2$ on the lhs of eq.(900) is replaced by 0. The resulting orbit equation is

\[ u'' + u = 3m u^2 \tag{931} \]

27.6

Deflection of a light ray near the sun

The Newtonian equation is

\[ u'' + u = 0 \tag{932} \]

For weak fields, such as the solar field, the term $3mu^2$ is a small correction to the Newtonian orbit. Let $u_0$ be the solution of eq.(932) then

\[ u_0 = \frac{1}{r_0} \sin(\phi + \delta) \tag{933} \]

where $r_0$ and δ are constants. We will choose δ = 0 so that the solution can be written as $r \sin\phi = r_0$. This is the equation of a straight line whose perpendicular distance from the origin is $r_0$. Using perturbation theory as before we seek a solution $u_1$ to the equation

\[ u'' + u = \frac{3m}{r_0^2} \sin^2\phi \tag{934} \]

To complete the solution you need the following particular integral where A is a constant

\[ u'' + u = A \sin^2\phi \quad\longrightarrow\quad u = \tfrac{1}{2} A \left( 1 + \tfrac{1}{3} \cos 2\phi \right) \tag{935} \]

Then

\[ u = \frac{1}{r_0} \sin\phi + \frac{3m}{2 r_0^2} \left( 1 + \tfrac{1}{3} \cos 2\phi \right) \tag{936} \]

When r becomes large, then u → 0 and since φ is small we obtain

\[ 0 \approx \frac{\phi}{r_0} + \frac{3m}{2 r_0^2} \left( 1 + \tfrac{1}{3} \right) \tag{937} \]

This gives

\[ \phi \approx -\frac{2m}{r_0} \tag{938} \]

Owing to symmetry the total deflection becomes $4m/r_0$. For a light ray grazing the sun this amounts to 1″.75, a value in excellent agreement with experiment³⁸. The light ray as it grazes the sun is deflected towards the sun, much like the planets are attracted. We can easily conceive of the mass being so large that the light ray could be bent into a circle around the object. A circular orbit is u = constant so that using

\[ u'' + u = 3m u^2 \tag{939} \]

we have

\[ r = 1/u = 3m \tag{940} \]

If k = 2m is the Schwarzschild radius then the light will orbit at a radius of 3k/2.

³⁸ See Appendix B ‘Odds and ends’ note 22.
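The two results of this section, the total deflection 4m/r₀ and the circular photon orbit at r = 3m, can be evaluated directly. The sketch below does this for a ray grazing the Sun; the solar radius and GM for the Sun are assumed standard values, not numbers taken from these notes.

```python
import math

C = 2.99792458e8                              # speed of light in m s^-1
GM_SUN = 1.32712e20                           # G*M for the Sun in m^3 s^-2 (assumed)
R_SUN = 6.957e8                               # solar radius in m (assumed)
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def total_deflection(gm, closest_approach, c=C):
    """Total deflection 4m/r0 in radians, with m = GM/c^2."""
    return 4.0 * gm / (c**2 * closest_approach)


def photon_orbit_radius(gm, c=C):
    """Radius r = 3m of the circular light orbit, eq.(940)."""
    return 3.0 * gm / c**2


if __name__ == "__main__":
    print("deflection at the solar limb (arcsec):",
          total_deflection(GM_SUN, R_SUN) * ARCSEC_PER_RADIAN)
    print("photon orbit radius for a solar mass (km):",
          photon_orbit_radius(GM_SUN) / 1.0e3)
```

The deflection comes out at about 1.75 seconds of arc, and the photon orbit radius is one and a half times the Schwarzschild radius.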

27.7

Black-holes

We have considered planetary motion in the weak field of the solar system. This corresponds to $r \gg 2m$. Near r = 2m the Schwarzschild solution exhibits a number of interesting features. The Schwarzschild metric is given by

\[ (ds)^2 = -(1+\Phi)\, c^2 (dt)^2 + \frac{(dr)^2}{1+\Phi} + r^2 (d\theta)^2 + r^2 \sin^2\theta\, (d\phi)^2 \tag{941} \]

where

\[ \Phi = -\frac{2m}{r} \quad\text{and}\quad m = \frac{GM}{c^2} \tag{942} \]

Consider two observers, A and B. A is at rest far away, i.e. r → ∞. The proper time interval, $d\tau_A$, is then

\[ d\tau_A = dt \tag{943} \]

The coordinate time, t, is the proper time for an observer at rest far from the black-hole. Suppose B is at rest just outside the radius r = 2m. Then

\[ \begin{aligned}
d\tau_B &= \left( 1 - \frac{2m}{r_B} \right)^{1/2} dt \\
&= \left( 1 - \frac{2m}{r_B} \right)^{1/2} d\tau_A
\end{aligned} \tag{944} \]

Since $r_B \approx 2m$, we have $d\tau_B \ll d\tau_A$. The clock at infinity must run for a long time before the clock near r = 2m shows any significant change. If B were at r = 2m then observer A would see B’s clock stand still forever. Remembering the relationship between proper time and the gravitational red-shift, this is clearly the ultimate example of red-shifting.

Within the Schwarzschild radius (r < 2m) there is a change in the character of the coordinates, i.e.

\[ (ds)^2 = -\frac{(dr)^2}{\frac{2m}{r} - 1} + \left( \frac{2m}{r} - 1 \right) c^2 (dt)^2 + r^2 (d\theta)^2 + r^2 \sin^2\theta\, (d\phi)^2 \tag{945} \]

The metric coefficient of $(dr)^2$ is negative while that of $(dt)^2$ is positive. It looks as if the roles of r and t have been interchanged. For a particle or photon we must have $(ds)^2 \le 0$. Clearly r = constant (dr = 0) is not possible for a particle or photon within the Schwarzschild radius. Since $dr \ne 0$, r must either always increase or always decrease. For a black-hole formed by collapsing matter, acting as a boundary condition, we can only have the case that r always decreases. Therefore nothing can escape (neither particle nor photon) from within r = 2m. Hence the name ‘black-hole’ for this region of space-time. The Schwarzschild radius r = 2m is known as the ‘event horizon’. If r behaves like ‘time’ then the metric is no longer static, rather the coefficients are ‘time-dependent’.
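The slowing of B’s clock in eq.(944) can be tabulated as a function of how close B sits to r = 2m. The sketch below returns the ratio dτ_B/dτ_A for a given value of r_B/(2m); the sample radii are arbitrary illustrative choices.

```python
import math


def proper_time_ratio(r_over_2m):
    """Ratio d(tau_B)/d(tau_A) = (1 - 2m/r_B)^(1/2) from eq.(944).

    r_over_2m is r_B measured in units of the Schwarzschild radius and must
    exceed 1, since a static observer exists only outside r = 2m.
    """
    if r_over_2m <= 1.0:
        raise ValueError("a static observer requires r_B > 2m")
    return math.sqrt(1.0 - 1.0 / r_over_2m)


if __name__ == "__main__":
    for x in (100.0, 2.0, 1.1, 1.0001):   # illustrative radii only
        print("r_B/2m = %-8g  dtau_B/dtau_A = %.5f" % (x, proper_time_ratio(x)))
```

The ratio tends to 1 far from the hole and to 0 as r_B approaches 2m, which is the statement that B’s clock appears to stand still.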

27.8

Eddington form

The standard form of the Schwarzschild metric has a singularity at r = 2m in the space-metric. This is not a singularity in the space-time but is due to the choice of coordinates. We can transform to a non-singular metric. The transformation is due to Eddington (1924). It introduces a new time coordinate $\bar{t}$ defined by

\[ c\,\bar{t} = c\,t + 2m \ln\left| \frac{r}{2m} - 1 \right| \tag{946} \]

The modulus allows us to deal with the two cases, r > 2m and r < 2m. Then

\[ c\, d\bar{t} = c\, dt + \frac{dr}{\frac{r}{2m} - 1} \tag{947} \]

The Eddington form of the metric is

\[ (ds)^2 = -c^2 (d\bar{t})^2 + (dr)^2 + r^2 (d\theta)^2 + r^2 \sin^2\theta\, (d\phi)^2 + \frac{2m}{r} \left( c\, d\bar{t} + dr \right)^2 \tag{948} \]

The first four terms on the rhs are the inertial frame metric. There is now no singularity at r = 2m but the last term contains a cross term that makes the metric difficult to interpret.

27.9

Radial motion

For radial motion we have dθ = dφ = 0 and the Eddington form becomes

\[ \begin{aligned}
(ds)^2 &= -c^2 (d\bar{t})^2 + (dr)^2 + \frac{2m}{r} \left( c\, d\bar{t} + dr \right)^2 \\
&= -\left( 1 - \frac{2m}{r} \right) c^2 (d\bar{t})^2 + \left( 1 + \frac{2m}{r} \right) (dr)^2 + \frac{4mc}{r}\, (d\bar{t})(dr)
\end{aligned} \tag{949} \]

We can use this to obtain the slope of infinitesimal light cones. These satisfy $(ds)^2 = 0$. Therefore we have a quadratic in $d\bar{t}$. This can be solved to give the two solutions

\[ \frac{c\, (d\bar{t})}{dr} = -1 \quad\text{or}\quad \frac{r + 2m}{r - 2m} \tag{950} \]

Ingoing light rays have speed c. Outgoing light rays have a speed that varies from c at r = ∞ to 0 at r = 2m. Within r = 2m only ingoing light rays are possible.

We can use these results to construct the pattern of light cones in a Schwarzschild metric. They have different degrees of opening and angles of tilt to the time axis. This reflects the curvature of space-time. For large values of r the light cones agree with the results of Special Relativity. However as r decreases towards the event horizon r = 2m the deviation becomes more marked. At r = 2m the outside edge of the light-cone is parallel to the time-axis.

A material particle must move within the light-cone (on a time-like world-line) while a photon moves on the light-cone. The pattern of light cones then allows us to construct the possible particle and photon paths. It is clear that only a photon can be at rest on the event horizon. Also within the event horizon any particle or photon must move with decreasing radial coordinate r. Clearly no information can be sent out from within the event horizon.³⁹

³⁹ Stephen Hawking has predicted that photons can escape from a black-hole (Hawking radiation) due to quantum mechanical effects.
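The light-cone slopes for radial motion follow from setting (ds)² = 0 in the radial Eddington form, which is a quadratic in x = c dt/dr, where t here denotes the Eddington time coordinate. The sketch below solves that quadratic numerically so the roots can be compared with the closed forms −1 and (r + 2m)/(r − 2m). The function name and the sample radii are choices made for this illustration.

```python
import math


def light_cone_slopes(r, m):
    """Roots x = c*dt/dr of the null condition for radial motion.

    The quadratic is -(1 - 2m/r) x^2 + (4m/r) x + (1 + 2m/r) = 0, obtained
    by setting (ds)^2 = 0 in the radial Eddington form of the metric.
    Returns the roots in increasing order.
    """
    qa = -(1.0 - 2.0 * m / r)
    qb = 4.0 * m / r
    qc = 1.0 + 2.0 * m / r
    if qa == 0.0:
        # At r = 2m the quadratic degenerates to a linear equation.
        return (-qc / qb,)
    root_disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    return tuple(sorted(((-qb + root_disc) / (2.0 * qa),
                         (-qb - root_disc) / (2.0 * qa))))


if __name__ == "__main__":
    m = 1.0
    for r in (100.0, 4.0, 2.5, 1.0):      # illustrative radii, in units of m
        print("r = %-6g slopes = %s" % (r, light_cone_slopes(r, m)))
```

Outside r = 2m one root is negative (ingoing) and one positive (outgoing). Inside r = 2m both roots are negative, so dr/dt is negative on both edges of the cone and only inward motion is possible.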

28

Cosmology

28.1

Introduction

Gravity is recognised as the fundamental force that binds together the solar system and galaxies. It is natural to assume that the large scale motion of the Universe is controlled primarily by gravity and to attempt to understand this using general relativity. As early as 1917 Einstein proposed a model for the Universe. This described an isotropic, homogeneous, unbounded but spatially finite static Universe. At this time the Universe was supposed to consist of our galaxy and presumably a void beyond⁴⁰. The Andromeda Nebula had not yet been certified to lie beyond the Milky Way. The model is static because no large-scale galactic motions were yet known to exist.

28.2

A little astronomy

The main source of visible light in the Universe is nuclear fusion within stars. The Sun is a typical star and has a mass of about $2 \times 10^{30}$ kg. This is known as a solar mass $M_\odot$. The nearest stars to us are a few light years away. Stellar distances are measured in light years (ly) or parsecs (pc) with

\[ 1\ \text{pc} = 3.09 \times 10^{16}\ \text{m} = 3.26\ \text{ly} \tag{951} \]

and

\[ 1\ \text{ly} = 9.46 \times 10^{15}\ \text{m} = 0.307\ \text{pc} \tag{952} \]

In cosmology the smallest considered unit is the conglomeration of stars known as a galaxy. Our solar system lies about 8 kpc from the centre of a disk structure called the Milky Way galaxy. It contains about $10^{11}$ stars with masses ranging from 0.1 to 10 $M_\odot$. The structure consists of a central bulge surrounded by a disk of radius 12.5 kpc and thickness 0.3 kpc. The disk rotates slowly and differentially, faster near the centre and slower at the outer edge. At our radius the period is 200 Myr. The disk structure is surrounded by smaller collections of stars, known as globular clusters. These are distributed symmetrically about the bulge at distances of 5–30 kpc. They each contain about $10^6$ stars.

⁴⁰ The idea that nebulae are other ‘Milky Ways’ goes back to the very first suggestions that the Milky Way itself is a system of stars in which our solar system is embedded. Both ideas were first put forward by Thomas Wright of Durham, in his An Original Theory of the Universe (1750), although his models of the structure of the Milky Way were very different from our present one, and strongly influenced by his religious ideas. Immanuel Kant, the famous philosopher, read a slightly misleading review of Wright’s work and this inspired him to produce a model very similar to our present idea of the universe in his Universal Natural History and Theory of the Heavens (1755). Independently, Johann Lambert suggested a similar theory in 1761. These ideas were debated by astronomers for over 170 years; Hubble’s achievement in 1924 was to decisively end the debate by measuring the distance to the Andromeda Nebula and hence showing that it and the other spirals were outside our own Milky Way.


Taken together the disk structure and globular clusters are embedded in a larger spherical structure known as the galactic halo. In cosmology the detailed structure of a galaxy is ignored and galaxies are treated as point-like objects emitting light.

The Milky Way is one of the largest galaxies in a concentrated group of galaxies called the local group. The nearest galaxy is a small irregular galaxy known as the Large Magellanic Cloud which is 50 kpc distant from the Sun. The nearest large galaxy is the Andromeda Nebula at a distance of 770 kpc. A galaxy group occupies a volume of a few cubic Mpc. The Mpc is the unit of choice for measuring distances in cosmology. It is roughly the separation between neighbouring galaxies.

Over a scale of 100 Mpc, one sees a variety of large-scale structures. This includes galaxy clusters, grouped into superclusters, perhaps joined by filaments and walls of galaxies. Most galaxies, sometimes called field galaxies, are not part of a cluster. Between this ‘foamlike’ structure lie large voids, up to 50 Mpc across. Only at the scale of hundreds of Mpc does the Universe begin to appear smooth. There seem to be no huge structures beyond galaxy clusters. This smoothness underpins modern cosmology.

28.3

Copernican principle

The Copernican (or cosmological) principle states that the Universe is pretty much the same everywhere. There is no privileged position. In building a simple model of space-time it is assumed that space is homogeneous and isotropic.

Homogeneity

Homogeneity means that the metric is the same throughout space. Mathematically it means invariance under translations. The distribution of galaxies appears to be uniform over large regions of space. This appears to be true over distances of the order of hundreds of Mpc. The assumption of homogeneity is of course greatly simplifying in any model. Anything which holds locally will be representative of the whole.

Isotropy

Isotropy means that space looks the same in all directions. Mathematically it means invariance under rotations. The number of galaxies per unit solid angle appears to be the same in all directions. Isotropy is also supported by the 3 K microwave background radiation, with deviations of the order of $10^{-5}$ or less.

Isotropy is not as crucial as homogeneity in simplifying models. It is possible to assume different expansion rates in different directions and this may be required in the early stages of a ‘big-bang’ model.

Isotropy everywhere implies homogeneity, while isotropy at one point together with homogeneity implies isotropy everywhere. Observational data suggests isotropy about the Earth and the Copernican principle implies that we have no privileged position. Combining these gives isotropy and homogeneity throughout space. This applies only on the very largest scales; local variations are essentially averaged over.

28.4

Red-shift

There is a red-shift

\[ z = \frac{\Delta\lambda}{\lambda_o} = \frac{\lambda - \lambda_o}{\lambda_o} \tag{953} \]

for the wavelength of light emitted by galaxies. Here $\lambda_o$ is the wavelength at source while λ is the wavelength observed on Earth. An increase in wavelength gives z > 0 and corresponds to a shift towards the red end of the spectrum. This is interpreted as motion of the source away from the observer. The Doppler shift⁴¹ is given by

\[ z = \frac{v}{c} \tag{954} \]

for $v \ll c$ where v is the speed of recession.

Hubble⁴² in the late 1920s discovered a linear relationship between the red-shift z and the distance d

\[ z = H_o\, \frac{d}{c} \tag{955} \]

where the constant of proportionality, $H_o$, is Hubble’s constant. This implies the simple law that the speed of recession is proportional to distance so that

\[ v = H_o\, d \tag{956} \]

In the models which we will discuss the Hubble constant is related to the expansion rate of the Universe. Hubble’s measurements of $H_o$ began at 550 km s⁻¹ Mpc⁻¹. A number of systematic errors were identified and by the 1960s $H_o$ had dropped to 100 km s⁻¹ Mpc⁻¹. Over the last two decades controversy surrounded $H_o$, with measurements clustered around 50 km s⁻¹ Mpc⁻¹ and 90 km s⁻¹ Mpc⁻¹. Recent progress due to the calibration of standard candles by the Hubble Space Telescope gives a general consensus that $H_o$ is (67 ± 10) km s⁻¹ Mpc⁻¹, where the error is both statistical and systematic. The inverse of the Hubble constant, the Hubble time, sets a timescale for the age of the Universe: $H_o^{-1}$ = (15 ± 2) Gyr. This value implies a current expansion of the radial distances by 1% every 150 Myr.

⁴¹ ‘To get the velocity of recession of a galaxy from its redshift, you should use the formula given by special relativity.’ This is frequently claimed, but it makes no sense. The special relativistic Doppler formula allows you to calculate the velocity in your local inertial frame of a moving source of radiation. But in cosmology, you cannot extend a local inertial frame from the observer to a distant galaxy: the whole point of curved space-time is that different local frames are needed around each event. One has to be quite careful about how to define ‘velocity’ in this case, but there is a perfectly sensible definition using the so-called metric distance, which is the distance you would read from a tape measure running between our Galaxy and the other. The relation between rate of change of metric distance and redshift depends on how the expansion of the universe is accelerating, but one thing to note is that the metric distance to very distant galaxies can certainly increase faster than the speed of light, whereas the special relativistic formula will never give this result.

⁴² Edwin Powell Hubble (1889–1953), American astronomer and cosmologist, discovered expansion of the Universe, and measured its size and age. The first person to measure the Doppler shift of a galaxy was Vesto M. Slipher, an astronomer working for Percival Lowell at the observatory in Flagstaff, Arizona which Lowell had set up mainly to observe the ‘canals’ on Mars. The first galaxy Slipher observed, in 1913, was the Andromeda Nebula (as it was then known), which turned out to have a huge blueshift of 300 km s⁻¹ (at that time, the largest speed ever measured). Over the next few years Slipher measured many more spiral nebulae and found that they were nearly all redshifted. At this time the debate about the nature of spiral nebulae was in full swing and Slipher’s data was enough to convince some (e.g. Arthur Eddington) that the nebulae were other galaxies. Hubble himself hardly ever measured redshifts; he relied on the results of Slipher and of Hubble’s colleague at the Mt. Wilson Observatory, Milton Humason. Hubble’s contribution was to find ways of measuring distances to the spirals that did not depend on the redshift, which allowed him to show that the redshift did increase linearly with distance (Hubble’s law).
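The Hubble time quoted above is simply the reciprocal of the Hubble constant after a change of units. The sketch below converts a value of the Hubble constant in km s⁻¹ Mpc⁻¹ into a time in Gyr, and also gives the time over which distances grow by 1%. The conversion factors for the megaparsec and the year are assumed standard values.

```python
KM_PER_MPC = 3.0857e19            # kilometres in one megaparsec (assumed value)
SECONDS_PER_GYR = 3.156e16        # seconds in 10^9 years (assumed value)


def hubble_time_gyr(h0_km_s_mpc):
    """Return 1/H_o in Gyr for H_o given in km s^-1 Mpc^-1."""
    seconds = KM_PER_MPC / h0_km_s_mpc
    return seconds / SECONDS_PER_GYR


def one_percent_expansion_time_myr(h0_km_s_mpc):
    """Time in Myr for distances to grow by 1% at the current rate v = H_o d."""
    return 0.01 * hubble_time_gyr(h0_km_s_mpc) * 1000.0


if __name__ == "__main__":
    print("Hubble time for H_o = 67:", hubble_time_gyr(67.0), "Gyr")
    print("1% expansion time       :", one_percent_expansion_time_myr(67.0), "Myr")
```

For 67 km s⁻¹ Mpc⁻¹ this gives a Hubble time a little under 15 Gyr, and a 1% expansion time close to 150 Myr.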

28.5

Background microwave radiation

In 1965 Penzias and Wilson accidentally discovered isotropic radiation corresponding to black-body radiation of about 2.7 K. Such a background radiation had been predicted by Alpher and Herman in 1948 on the basis of Gamow’s ‘big-bang’ model⁴³.

28.6

Age of Universe

Radioactive dating gives the age of meteoric matter as at least 4 Gyr and terrestrial matter at about (4.5 ± 0.3) Gyr. The oldest globular star clusters have an age of 11.5 Gyr plus 1–2 Gyr to allow for their formation. The abundance ratios of radioactive isotopes produced in stellar explosions provide a lower limit to the age of the Galaxy of 10 Gyr.

⁴³ George Gamow (1904–1968), Soviet–American physicist, explained helium abundance in Universe, suggested DNA code of protein synthesis. Gamow contributed a great deal to our understanding of the ‘big bang’; in fact he was the first to seriously try to calculate the physics of the early universe. But the first prediction that there should be relic radiation left over from the ‘big bang’ was made by Ralph Alpher and Robert Herman, in 1948. At the time, Alpher was Gamow’s Ph.D. student, and Herman was a close collaborator of the two of them, so most people assume that Gamow’s name was on the paper, but it wasn’t.

28.7

Robertson-Walker metric

We will look at models based on the Robertson-Walker metric. This metric complies with the Copernican principle, i.e. it is spatially homogeneous and isotropic. Observations appear to support the ‘big-bang’ model of an expanding Universe and as we will see this is embodied in the Robertson-Walker metric. These models were first studied by Friedmann⁴⁴ in 1922 and 1924, and independently by Lemaître⁴⁵ in 1927. It was not until 1935 that A.G. Walker and H.P. Robertson independently proved that this metric is the only one consistent with a homogeneous, isotropic Universe. The metric is given by

\[ (ds)^2 = -(dt)^2 + a(t)^2 \left[ \frac{(dr)^2}{1 - k r^2} + r^2 \left( (d\theta)^2 + \sin^2\theta\, (d\phi)^2 \right) \right] \tag{957} \]

a(t) is a scale factor depending only on the time t. It has the dimensions of length. The parameter k is either 0, 1 or −1. Note that we have chosen units in which c = 1. The radial coordinate r is dimensionless while (θ, φ) are the angular spherical polar coordinates.

28.8

Spatial geometry

Consider the spatial geometry which is given by the metric (i.e. considering a fixed time t)

\[ (dl)^2 = a^2 \left[ \frac{(dr)^2}{1 - k r^2} + r^2 \left( (d\theta)^2 + \sin^2\theta\, (d\phi)^2 \right) \right] \tag{958} \]

This is the metric for a 3-dimensional Riemannian space of constant curvature, which fulfils the isotropy and homogeneity conditions. In a Riemannian space of dimension N > 2 the curvature tensor takes the following form for a space of constant curvature

\[ R_{ijkl} = K \left( g_{ik}\, g_{jl} - g_{il}\, g_{jk} \right) \tag{959} \]

where K is the Riemann scalar which is a constant for the space. From this it is straightforward to show that

\[ R_{ij} = -(N-1)\, K\, g_{ij} \quad\text{and}\quad R = -N(N-1)\, K \tag{960} \]

where $R_{ij}$ is the Ricci tensor and R is the Ricci scalar. When we evaluate the curvature tensor for the spatial part of the Robertson-Walker metric then we obtain $K = k/a^2$.

⁴⁴ Aleksandr Aleksandrovich Friedmann (1888–1925), Soviet cosmologist, developed mathematical model of expanding Universe

⁴⁵ (Abbé) Georges Edouard Lemaître (1894–1966), Belgian astronomer and cosmologist, originator of the ‘big-bang’ theory for the origin of the Universe

• The case k = 0 gives the Euclidean metric for a flat space.

• The case k = 1 corresponds to a space of constant positive curvature, e.g. a hypersphere. This gives rise to spherical geometry. This space is closed in the sense that it has a finite volume.

• The case k = −1 corresponds to a space of constant negative curvature. This gives rise to hyperbolic geometry. This space is open in the sense that it has an infinite volume.

28.9

Scale factor

The scale factor a(t) simply blows up these spaces in a uniform manner so that they expand or contract as $\dot{a} = da/dt$ is positive or negative. The scale factor behaves like the radius of the Universe with a dependence on time.

The coordinates are comoving and are defined on the particles. Thus the particles are always at rest in this coordinate system. Also t is a commonly measured time sometimes known as the cosmic time. This follows from the requirement of homogeneity. By homogeneity we mean that the totality of observations which any fundamental observer can make on the Universe is the same as that of any other observer. This implies the existence of a common time, i.e. an absolute Universe-wide sequence of moments.

We have specified the geometry of space-time except for the scale factor a(t). The behaviour of a(t) is determined by the field equations.

28.10

Field equations of General Relativity

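The field equations are

$$G_{\mu\nu} = -\kappa\,T_{\mu\nu} \tag{961}$$

where $\kappa = 8\pi G$ is a constant. In terms of the Ricci tensor we have

$$R_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}\,R = -\kappa\,T_{\mu\nu} \tag{962}$$

Using $R = \kappa\,T$

$$R_{\mu\nu} = -\kappa\left(T_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}\,T\right) \tag{963}$$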


28.11

Energy-momentum tensor

We model the matter and energy in the Universe by a perfect fluid. For example, the fluid may be composed of an isotropic, homogeneous dust cloud. We think of the dust particles as the centres of mass of galaxies. A perfect fluid is characterised by two scalar fields, the density $\rho$ and the pressure $p$. These fields depend on time but not on position. The energy-momentum tensor is

$$T_{\mu\nu} = (\rho + p)\,v_\mu v_\nu + p\,g_{\mu\nu} \tag{964}$$

where $v_\mu$ is the 4-velocity. There is a simplification in the energy-momentum tensor because the coordinates are comoving, i.e. particles are at rest. A perfect fluid is one that is isotropic in its rest frame. If the field equations are to give a solution that is isotropic then the fluid must be at rest. Therefore $v^\mu = \gamma_v\,(c, \mathbf{v}) = (1, 0, 0, 0)$ since we have chosen $c = 1$. The covariant tensor is

$$(T_{\mu\nu}) = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & p\,g_{11} & 0 & 0 \\ 0 & 0 & p\,g_{22} & 0 \\ 0 & 0 & 0 & p\,g_{33} \end{pmatrix} \tag{965}$$

The mixed tensor is

$$(T^{\mu}{}_{\nu}) = \begin{pmatrix} -\rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix} \tag{966}$$

giving

$$T = T^{\mu}{}_{\mu} = -\rho + 3p \tag{967}$$

and

$$T_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}\,T = \begin{pmatrix} \tfrac{1}{2}(\rho + 3p) & 0 & 0 & 0 \\ 0 & \tfrac{1}{2}\,g_{11}(\rho - p) & 0 & 0 \\ 0 & 0 & \tfrac{1}{2}\,g_{22}(\rho - p) & 0 \\ 0 & 0 & 0 & \tfrac{1}{2}\,g_{33}(\rho - p) \end{pmatrix} \tag{968}$$
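The diagonal entries above can be spot-checked numerically. The following is only a sketch: the function name `perfect_fluid_source` and the sample metric values are illustrative assumptions, and the metric is taken to be diagonal with $g_{00} = -1$ as in the text.

```python
def perfect_fluid_source(rho, p, g_diag):
    """Diagonals of T_{mu nu} and of T_{mu nu} - (1/2) g_{mu nu} T.

    g_diag holds (g_00, g_11, g_22, g_33) of a diagonal metric (assumed form).
    """
    v_up = (1.0, 0.0, 0.0, 0.0)                      # comoving 4-velocity, c = 1
    v_down = [g * v for g, v in zip(g_diag, v_up)]   # lower the index
    # T_{mu mu} = (rho + p) v_mu v_mu + p g_{mu mu}
    T = [(rho + p) * v * v + p * g for v, g in zip(v_down, g_diag)]
    # T = g^{mu mu} T_{mu mu}; for a diagonal metric g^{mu mu} = 1 / g_{mu mu}
    trace = sum(t / g for t, g in zip(T, g_diag))
    source = [t - 0.5 * g * trace for t, g in zip(T, g_diag)]
    return T, trace, source


# Sample values (arbitrary): rho = 2, p = 0.5, spatial metric entries 5, 1, 0.7
T, trace, source = perfect_fluid_source(2.0, 0.5, (-1.0, 5.0, 1.0, 0.7))
print(T, trace, source)
```

With these inputs the trace should equal $-\rho + 3p$ and the first entry of `source` should equal $\tfrac12(\rho + 3p)$, matching (967) and (968).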


28.12

Evaluation of the Ricci tensor

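For the Robertson-Walker metric the non-zero Christoffel symbols are

$$\Gamma^0_{11} = \frac{a\dot a}{1 - kr^2} \qquad \Gamma^0_{22} = a\dot a\,r^2 \qquad \Gamma^0_{33} = a\dot a\,r^2\sin^2\theta$$

$$\Gamma^1_{01} = \Gamma^1_{10} = \Gamma^2_{02} = \Gamma^2_{20} = \Gamma^3_{03} = \Gamma^3_{30} = \frac{\dot a}{a}$$

$$\Gamma^1_{11} = \frac{kr}{1 - kr^2} \qquad \Gamma^1_{22} = -r\,(1 - kr^2) \qquad \Gamma^1_{33} = -r\,(1 - kr^2)\sin^2\theta \tag{969}$$

$$\Gamma^2_{12} = \Gamma^2_{21} = \Gamma^3_{13} = \Gamma^3_{31} = \frac{1}{r} \qquad \Gamma^2_{33} = -\sin\theta\cos\theta \qquad \Gamma^3_{23} = \Gamma^3_{32} = \cot\theta$$

The non-zero components of the Ricci tensor are

$$R_{00} = \frac{3\ddot a}{a} \qquad R_{11} = -\frac{a\ddot a + 2\dot a^2 + 2k}{1 - kr^2}$$

$$R_{22} = -(a\ddot a + 2\dot a^2 + 2k)\,r^2 \qquad R_{33} = -(a\ddot a + 2\dot a^2 + 2k)\,r^2\sin^2\theta \tag{970}$$

The Ricci scalar is then

$$\begin{aligned} R = R^{\mu}{}_{\mu} &= g^{\alpha\mu} R_{\mu\alpha} \\ &= g^{00} R_{00} + g^{11} R_{11} + g^{22} R_{22} + g^{33} R_{33} \\ &= -\frac{3\ddot a}{a} - \frac{3}{a^2}\,(a\ddot a + 2\dot a^2 + 2k) \\ &= -\frac{6}{a^2}\,(a\ddot a + \dot a^2 + k) \end{aligned} \tag{971}$$

28.13

Conservation of energy-momentum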

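The energy-momentum tensor obeys

$$\nabla_\mu T^{\mu}{}_{\nu} = 0 \tag{972}$$

This becomes

$$\partial_\mu T^{\mu}{}_{\nu} + \Gamma^{\mu}_{\alpha\mu}\,T^{\alpha}{}_{\nu} - \Gamma^{\alpha}_{\nu\mu}\,T^{\mu}{}_{\alpha} = 0 \tag{973}$$

Let $\nu = 0$, giving the continuity equation for energy,

$$\partial_\mu T^{\mu}{}_{0} + \Gamma^{\mu}_{\alpha\mu}\,T^{\alpha}{}_{0} - \Gamma^{\alpha}_{0\mu}\,T^{\mu}{}_{\alpha} = 0 \tag{974}$$

giving, since the tensor is diagonal,

$$\partial_0 T^{0}{}_{0} + \Gamma^{\mu}_{0\mu}\,T^{0}{}_{0} - \Gamma^{0}_{00}\,T^{0}{}_{0} - \Gamma^{1}_{01}\,T^{1}{}_{1} - \Gamma^{2}_{02}\,T^{2}{}_{2} - \Gamma^{3}_{03}\,T^{3}{}_{3} = 0 \tag{975}$$

Now substitute the Christoffel symbols

$$-\dot\rho - 3\,\frac{\dot a}{a}\,\rho - 3\,\frac{\dot a}{a}\,p = 0 \tag{976}$$

to give

$$\dot\rho = -3\,\frac{\dot a}{a}\,(\rho + p) \tag{977}$$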

28.14

Equation of state

We need an equation of state, i.e. a relationship between $p$ and $\rho$. The cases we will consider are of the form

$$p = \omega\,\rho \tag{978}$$

where $\omega$ is a constant independent of time. Then

$$\dot\rho = -3\,\frac{\dot a}{a}\,(1 + \omega)\,\rho \tag{979}$$

giving

$$\rho \propto a^{-3(1+\omega)} \tag{980}$$

Consider three cases.

Matter dominated

Dust is collisionless, non-relativistic matter having $\omega = 0$. Examples include stars and galaxies, in which $p$ is negligible compared to $\rho$. Then

$$\rho \propto a^{-3} \tag{981}$$

$\rho$ is proportional to the number density, since it is dominated by the rest energy.

Radiation dominated

In this case we have either electromagnetic radiation or massive particles moving at close to the speed of light (since they behave like photons). The electromagnetic energy-momentum tensor has zero trace. Therefore

$$T = -\rho + 3p = 0 \tag{982}$$

This gives $\rho = 3p$ or $\omega = 1/3$. Therefore

$$\rho \propto a^{-4} \tag{983}$$

In this case $\rho$ falls off faster with $a$ than in a matter-dominated Universe. Photons lose energy as $1/a$ as they red-shift. Massive but relativistic particles lose energy as they slow down in comoving coordinates.

Vacuum dominated

The field equations are modified by the introduction of the cosmological constant $\Lambda$.

$$G_{\mu\nu} - \Lambda\,g_{\mu\nu} = -\kappa\,T_{\mu\nu} \tag{984}$$

We can bring $\Lambda$ to the rhs and interpret it as an energy density.

$$G_{\mu\nu} = -\kappa\,T_{\mu\nu} + \Lambda\,g_{\mu\nu} = -\kappa\left(T_{\mu\nu} + T^{(\mathrm{vac})}_{\mu\nu}\right) \tag{985}$$

where

$$T^{(\mathrm{vac})}_{\mu\nu} = -\frac{\Lambda}{\kappa}\,g_{\mu\nu} \tag{986}$$

Comparing with

$$T_{\mu\nu} = (\rho + p)\,v_\mu v_\nu + p\,g_{\mu\nu} \tag{987}$$

gives

$$p = -\rho = -\frac{\Lambda}{\kappa} \tag{988}$$

so that $\omega = -1$ and $\rho$ is independent of $a$, a behaviour expected for the vacuum.

If the Universe expands forever then a non-zero vacuum energy density will eventually dominate. Today $\rho_{\mathrm{mat}}/\rho_{\mathrm{rad}} \approx 10^6$, so that the present Universe is matter dominated. However, at earlier times of a ‘big bang’ model the energy density would be dominated by radiation.
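The scaling laws for the three cases can be checked by integrating the continuity equation numerically. The sketch below rewrites (979) as $d\rho/da = -3(1+\omega)\rho/a$ and integrates it with a classical Runge-Kutta step; the function name `density_ratio`, the normalisation $\rho = 1$ at $a = 1$, and the step count are illustrative assumptions.

```python
def density_ratio(w, a_end, steps=10_000):
    """Integrate d(rho)/da = -3 (1 + w) rho / a from a = 1 (rho = 1) to a_end.

    w is the equation-of-state constant (omega in the text).
    Uses the classical fourth-order Runge-Kutta method.
    """
    def slope(a, rho):
        return -3.0 * (1.0 + w) * rho / a

    a, rho = 1.0, 1.0
    h = (a_end - 1.0) / steps
    for _ in range(steps):
        k1 = slope(a, rho)
        k2 = slope(a + h / 2, rho + h * k1 / 2)
        k3 = slope(a + h / 2, rho + h * k2 / 2)
        k4 = slope(a + h, rho + h * k3)
        rho += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        a += h
    return rho


# Compare with the closed form a^{-3(1+w)} when the scale factor doubles.
for label, w in (("matter", 0.0), ("radiation", 1.0 / 3.0), ("vacuum", -1.0)):
    print(label, density_ratio(w, 2.0), 2.0 ** (-3.0 * (1.0 + w)))
```

Doubling the scale factor should reduce the density by a factor of 8 for matter and 16 for radiation, and leave it unchanged for the vacuum.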

28.15

Friedmann equation

The following equations result from the field equations. Taking $(\mu\nu) = (00)$

$$\frac{3\ddot a}{a} = -\frac{\kappa}{2}\,(\rho + 3p) \tag{989}$$

and taking $(\mu\nu) = (11)$ etc. (they all give the same result due to isotropy)

$$-\frac{1}{a^2}\,(a\ddot a + 2\dot a^2 + 2k) = -\frac{\kappa}{2}\,(\rho - p) \tag{990}$$

The first equation

$$\frac{\ddot a}{a} = -\frac{\kappa}{6}\,(\rho + 3p) \tag{991}$$

is the acceleration equation. In the second equation we can use the acceleration equation to eliminate the second derivative. This gives the Friedmann equation

$$\left(\frac{\dot a}{a}\right)^2 = \frac{\kappa}{3}\,\rho - \frac{k}{a^2} \tag{992}$$

which is independent of pressure. These are differential equations in $t$.
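As a consistency check, the acceleration equation should follow from the Friedmann equation once the density obeys the continuity equation. The sketch below tests this numerically using $\ddot a = \tfrac12\,d(\dot a^2)/da$; the units ($\kappa = 1$), the reference density `RHO_1` and the function names are assumptions made purely for illustration.

```python
import math

KAPPA = 1.0   # kappa = 8 pi G in arbitrary units (assumption)
RHO_1 = 30.0  # density at a = 1, large enough that a_dot is real for k = +1 (assumption)


def rho(a, w):
    """Density from the continuity equation with p = w * rho."""
    return RHO_1 * a ** (-3.0 * (1.0 + w))


def a_dot(a, w, k):
    """Expansion rate from the Friedmann equation (expanding branch)."""
    return math.sqrt(KAPPA * rho(a, w) * a * a / 3.0 - k)


def a_ddot_from_friedmann(a, w, k, h=1e-5):
    """a'' = (d a'/da) * a', with d a'/da from a central difference."""
    slope = (a_dot(a + h, w, k) - a_dot(a - h, w, k)) / (2.0 * h)
    return slope * a_dot(a, w, k)


def a_ddot_from_acceleration(a, w):
    """a'' from the acceleration equation."""
    density = rho(a, w)
    pressure = w * density
    return -KAPPA * (density + 3.0 * pressure) * a / 6.0


for w in (0.0, 1.0 / 3.0, -1.0):
    for k in (-1, 0, 1):
        lhs = a_ddot_from_friedmann(0.5, w, k)
        rhs = a_ddot_from_acceleration(0.5, w)
        print(f"w={w:+.3f} k={k:+d}  {lhs:.6f}  {rhs:.6f}")
```

The two columns should agree for every combination of $\omega$ and $k$, reflecting the fact that the curvature term drops out when $\dot a^2$ is differentiated with respect to $a$.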

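28.16

Summary of formulae

Continuity equation:

$$\dot\rho = -3\,\frac{\dot a}{a}\,(\rho + p) \tag{993}$$

Equation of state:

$$p = \omega\,\rho \quad\longrightarrow\quad \rho \propto a^{-3(1+\omega)} \tag{994}$$

Acceleration equation:

$$\frac{\ddot a}{a} = -\frac{\kappa}{6}\,(\rho + 3p) \tag{995}$$

Friedmann equation:

$$\left(\frac{\dot a}{a}\right)^2 = \frac{\kappa}{3}\,\rho - \frac{k}{a^2} \tag{996}$$

28.17

Model parameters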

The models can be conveniently described by using the following parameters.

• The Hubble parameter is given by

$$H = \frac{\dot a}{a} \tag{997}$$

This measures the rate of expansion. It depends on time. The Friedmann equation can be written in terms of $H$ as

$$H^2 = \frac{\kappa}{3}\,\rho - \frac{k}{a^2} \tag{998}$$

If the present time is $t = t_o$ then we have

$$H_o^2 = \frac{\kappa}{3}\,\rho_o - \frac{k}{a_o^2} \tag{999}$$

where the subscript $o$ denotes quantities evaluated at $t = t_o$. $H_o$ is the Hubble constant.

• The critical density is

$$\rho_c = \frac{3H^2}{\kappa} \tag{1000}$$

• The density parameter is

$$\Omega = \frac{\rho}{\rho_c} \tag{1001}$$

The Friedmann equation gives

$$1 = \Omega - \frac{k}{H^2 a^2} \tag{1002}$$

or

$$\Omega - 1 = \frac{k}{H^2 a^2} \tag{1003}$$

Therefore $\Omega$ determines the geometry

$$\begin{array}{llll} \rho < \rho_c & \Omega < 1 & k = -1 & \text{open} \\ \rho = \rho_c & \Omega = 1 & k = 0 & \text{flat} \\ \rho > \rho_c & \Omega > 1 & k = 1 & \text{closed} \end{array} \tag{1004}$$

To measure $\Omega$ requires a measurement of $H_o$ (for $\rho_c$) and a measurement of $\rho_o$ (the current density). The latter measurement is particularly difficult.

• The deceleration parameter is given by

$$q = -\frac{a\ddot a}{\dot a^2} \tag{1005}$$

This measures the rate of change of the rate of expansion.

$q$ can be related to $\Omega$.

$$\begin{aligned} q &= -\frac{a\ddot a}{\dot a^2} \\ &= -\frac{\ddot a}{H^2 a} \\ &= \frac{\kappa\,(\rho + 3p)}{6H^2} \\ &= \frac{\kappa\,(1 + 3\omega)\,\rho}{6H^2} \\ &= \frac{1 + 3\omega}{2}\,\Omega \end{aligned} \tag{1006}$$

Therefore if we know $\omega$ (what the Universe is made of) then we can determine $\Omega$ by measuring $q$. Unfortunately, we are not completely confident that we know $\omega$, and $q$ is itself hard to measure. The deceleration parameter appears as a second order term in Hubble’s law

$$d_L = H_o^{-1}\left[\,z + \tfrac{1}{2}\,(1 - q_o)\,z^2 + \cdots\right] \tag{1007}$$

where $d_L$ is the luminosity distance.
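The parameter definitions above translate directly into a few helper functions. This is a minimal sketch: the function names, the choice $\kappa = 1$ and the sample values of $\rho$, $a$ and $k$ are illustrative assumptions, with `w` standing for $\omega$ and `Omega` for the density parameter.

```python
import math


def hubble_squared(rho, a, k, kappa=1.0):
    """H^2 from the Friedmann equation."""
    return kappa * rho / 3.0 - k / (a * a)


def critical_density(H, kappa=1.0):
    """rho_c = 3 H^2 / kappa."""
    return 3.0 * H * H / kappa


def density_parameter(rho, H, kappa=1.0):
    """Omega = rho / rho_c."""
    return rho / critical_density(H, kappa)


def geometry(Omega, tol=1e-12):
    """Return (k, name) according to whether Omega is above, at or below 1."""
    if Omega > 1.0 + tol:
        return 1, "closed"
    if Omega < 1.0 - tol:
        return -1, "open"
    return 0, "flat"


def deceleration_parameter(Omega, w):
    """q = (1 + 3 w) Omega / 2 for the equation of state p = w rho."""
    return 0.5 * (1.0 + 3.0 * w) * Omega


# Sample closed model (arbitrary numbers): rho = 3, a = 2, k = +1, kappa = 1
H2 = hubble_squared(3.0, 2.0, 1)
Omega = density_parameter(3.0, math.sqrt(H2))
print(Omega - 1.0, 1 / (H2 * 2.0 ** 2))   # both sides of Omega - 1 = k / (H^2 a^2)
print(geometry(Omega), deceleration_parameter(Omega, 0.0))
```

The first printed line compares the two sides of (1003) for the sample values, and the second classifies the geometry and evaluates $q$ for dust.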

28.18

‘Big bang’ model

Consider the case of $\rho > 0$ (positive energy density) and $p \ge 0$ (non-negative pressure). Then the acceleration equation gives

$$\ddot a < 0 \tag{1008}$$

From observation we know that $\dot a > 0$, i.e. the Universe is expanding. Therefore the expansion of the Universe is decelerating, which we understand as the gravitational attraction opposing the expansion. This deceleration implies that in the past the expansion was faster, and going back in time there is an initial time at which $a = 0$, a singularity. This is the ‘big bang’ model. If we assume that $\ddot a = 0$ then the solution is $a = \alpha t$ where $\alpha$ is a constant and the Hubble parameter is

$$H = \frac{\dot a}{a} = \frac{1}{t} \tag{1009}$$

giving $t_o = 1/H_o$ as an estimate of the age of the Universe. In fact since $\ddot a < 0$ the age is less than this. The future evolution of the Universe depends on $k$. For open and flat cases ($k \le 0$) the Friedmann equation gives

$$\dot a^2 = \frac{\kappa}{3}\,\rho a^2 + |k| \tag{1010}$$

With the assumption that $\rho > 0$, the rhs is strictly positive so that $\dot a$ never passes through zero. Since we know that today $\dot a > 0$ it follows that $\dot a > 0$ for all time. Thus the open and flat models expand forever. Next consider

$$\begin{aligned} \frac{d}{dt}(\rho a^3) &= \dot\rho\,a^3 + 3\rho a^2 \dot a \\ &= -3\dot a a^2\,(\rho + p) + 3\rho a^2 \dot a \\ &= -3\dot a a^2\,p \end{aligned} \tag{1011}$$

using the continuity equation. Since $p \ge 0$ we have

$$\frac{d}{dt}(\rho a^3) \le 0 \tag{1012}$$

Since $\rho a^3$ does not increase, $\rho a^2$ falls at least as fast as $1/a$, so $\rho a^2$ must go to zero as $a \to \infty$. Therefore

$$\dot a^2 \longrightarrow |k| \tag{1013}$$

so that for a flat model $\dot a \to 0$, zero expansion, while for the open model $\dot a \to 1$, a constant expansion. For the closed model ($k = 1$) we have

$$\dot a^2 = \frac{\kappa}{3}\,\rho a^2 - 1 \tag{1014}$$

$\rho a^2$ must still go to zero as $a \to \infty$. However this would give $\dot a^2 < 0$, which is not allowed. Therefore there is a maximum expansion $a_{\max}$. From the acceleration equation we know that $\ddot a < 0$, so the model will start to contract, continuing to zero, the ‘big-crunch’.

28.19

Matter-dominated model

For a matter-dominated model the pressure is zero and the continuity equation gives

$$\rho\,a^3 = \text{constant} \tag{1015}$$

Using this the Friedmann equation becomes

$$\dot a^2 = \frac{\kappa}{3}\,\rho\,a^2 - k = \frac{A}{a} - k \tag{1016}$$

where $A = \frac{\kappa}{3}\,\rho\,a^3$ is a constant.

The possible models are

$k = 0$

This is the flat model and choosing $a(0) = 0$ the solution of the Friedmann equation is

$$a(t) = \left(\frac{9A}{4}\right)^{1/3} t^{2/3} \tag{1017}$$

This is the Einstein-de Sitter model. As $t \to \infty$ we have $\dot a \to 0$ and we obtain a static model. Also $\rho_o = \rho_c$ and $q_o = \tfrac{1}{2}$.
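The Einstein-de Sitter solution can be checked against the Friedmann equation by finite differences. In this sketch the value $A = 2$, the sample time and the helper names are arbitrary illustrative choices.

```python
def a_eds(t, A):
    """Einstein-de Sitter scale factor a(t) = (9A/4)^(1/3) t^(2/3)."""
    return (9.0 * A / 4.0) ** (1.0 / 3.0) * t ** (2.0 / 3.0)


def derivatives(f, t, h=1e-3):
    """Central-difference estimates of the first and second derivatives of f at t."""
    first = (f(t + h) - f(t - h)) / (2.0 * h)
    second = (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)
    return first, second


A = 2.0   # constant A = (kappa / 3) rho a^3, arbitrary value
t = 1.5   # sample time, arbitrary value

a = a_eds(t, A)
a_dot, a_ddot = derivatives(lambda s: a_eds(s, A), t)

print(a_dot ** 2, A / a)            # Friedmann equation with k = 0
print(-a * a_ddot / a_dot ** 2)     # deceleration parameter, expected close to 1/2
```

The first line compares the two sides of $\dot a^2 = A/a$, and the second estimates $q$, which should be close to $\tfrac12$ at any time.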


$k = 1$

This is the closed model and choosing $a(0) = 0$ the solution of the Friedmann equation is

$$a(t) = \frac{A}{2}\,(1 - \cos\psi) \qquad t = \frac{A}{2}\,(\psi - \sin\psi) \tag{1018}$$

which is the parametric equation for a cycloid. Also $\rho_o > \rho_c$ and $q_o > \tfrac{1}{2}$. The large density produces a gravitational attraction which eventually overcomes and reverses the expansion.

$k = -1$

This is the open model and choosing $a(0) = 0$ the solution of the Friedmann equation is

$$a(t) = \frac{A}{2}\,(\cosh\psi - 1) \qquad t = \frac{A}{2}\,(\sinh\psi - \psi) \tag{1019}$$

Here $\rho_o < \rho_c$ and $q_o < \tfrac{1}{2}$. The low density produces a gravitational attraction which fails to overcome the expansion. Eventually there is a constant expansion, with $t \to \infty$ giving $\dot a \to 1$.

$t_o$ is the age of the Universe (i.e. the time since $a = 0$). It can be shown that $t_o < 1/H_o$ in all models, so that there is an upper limit of about 15 Gyr on the age of the Universe.

• When $k = 0$ (flat model) we have $a = \alpha\,t^{2/3}$ where $\alpha$ is a constant. Then

$$\frac{1}{H} = \frac{a}{\dot a} = \frac{3}{2}\,t \tag{1020}$$

so that $t_o = \frac{2}{3}\,\frac{1}{H_o}$, or 10 Gyr, which is the estimated age of the Galaxy. This inconsistency could be resolved if the Hubble constant were smaller.

• When $k = 1$ (closed model) we have $t_o < \frac{2}{3}\,\frac{1}{H_o}$ and the time-scale difficulty is increased.

• When $k = -1$ (open model) we have $t_o \to \frac{1}{H_o}$ as the present density $\rho_o$ is lowered. The effects of gravity are reduced and $a \propto t$ is valid from an earlier starting time. This is the favoured model.

The observed density of material is much less than the critical density. The flat model predicts that $\rho_c = \rho_o$ and this has led to speculation that there is a substantial amount of unobserved material in the Universe. The closed model predicts a greater density than $\rho_c$ and this worsens the missing matter problem. Once again the open model appears to be favoured at present.

Overall the cosmological models provide reasonable descriptions of the Universe provided we accept the interpretation of the observations in terms of an expanding ‘big-bang’ theory. There are large uncertainties in the observations and the area of research remains open and very active on both the experimental and theoretical fronts.
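The parametric solutions and the age estimates can be explored numerically. The sketch below evaluates $\dot a = (da/d\psi)/(dt/d\psi)$ for the closed and open models, checks the Friedmann equation $\dot a^2 = A/a - k$, and forms the dimensionless product $H t$. The function names are illustrative, and the Hubble constant of 65 km/s/Mpc used in the last line is an assumed value chosen only because it reproduces the figure of about 15 Gyr quoted above; it is not taken from the text.

```python
import math


def closed_model(psi, A=1.0):
    """Scale factor, time and expansion rate of the k = +1 matter model."""
    a = 0.5 * A * (1.0 - math.cos(psi))
    t = 0.5 * A * (psi - math.sin(psi))
    a_dot = math.sin(psi) / (1.0 - math.cos(psi))   # (da/dpsi) / (dt/dpsi)
    return a, t, a_dot


def open_model(psi, A=1.0):
    """Scale factor, time and expansion rate of the k = -1 matter model."""
    a = 0.5 * A * (math.cosh(psi) - 1.0)
    t = 0.5 * A * (math.sinh(psi) - psi)
    a_dot = math.sinh(psi) / (math.cosh(psi) - 1.0)
    return a, t, a_dot


def friedmann_residual(a, a_dot, k, A=1.0):
    """a_dot^2 - (A/a - k); zero when the Friedmann equation holds."""
    return a_dot ** 2 - (A / a - k)


def hubble_time_gyr(H0_km_s_mpc):
    """1/H0 in Gyr for H0 given in km/s/Mpc."""
    km_per_mpc = 3.0857e19
    seconds_per_gyr = 3.15576e16
    return km_per_mpc / H0_km_s_mpc / seconds_per_gyr


for name, model, k in (("closed", closed_model, 1), ("open", open_model, -1)):
    a, t, a_dot = model(1.0)
    print(name, friedmann_residual(a, a_dot, k), (a_dot / a) * t)   # residual, H t

print(hubble_time_gyr(65.0), 2.0 / 3.0 * hubble_time_gyr(65.0))
```

For the closed model the product $H t$ should lie below $2/3$, and for the open model between $2/3$ and 1, in line with the three bullet points above.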

May 11, 2004 5:16pm

A

3-DIMENSIONAL VECTORS

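A

3-dimensional vectors

A.1

Cartesian vectors

Cartesian basis: $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is orthonormal and right-handed.

position vector in Cartesians: $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$

vector in Cartesians: $\mathbf{a} = a_x\,\mathbf{i} + a_y\,\mathbf{j} + a_z\,\mathbf{k}$

length of a vector: $a = \sqrt{a_x^2 + a_y^2 + a_z^2}$

unit vector: $a = 1$

A.2

Vector products

dot product

$$\mathbf{a}\cdot\mathbf{b} = a_x b_x + a_y b_y + a_z b_z$$

$$a = \sqrt{\mathbf{a}\cdot\mathbf{a}}$$

$$\mathbf{a}\cdot\mathbf{b} = a\,b\cos\theta \quad\text{where } \theta \in [0, \pi]$$

$\mathbf{a}\cdot\mathbf{b} = 0$ if the non-zero vectors are orthogonal.

$$\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{a}$$

$$(\lambda\mathbf{a})\cdot(\mu\mathbf{b}) = \lambda\mu\,(\mathbf{a}\cdot\mathbf{b})$$

$$\mathbf{a}\cdot(\mathbf{b} + \mathbf{c}) = \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\cdot\mathbf{c}$$

$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1 \qquad \mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{i} = 0$$

cross product

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}$$

$\mathbf{a}\times\mathbf{b} = a\,b\sin\theta\;\mathbf{n}$ where $\mathbf{n}$ is a unit vector and $\{\mathbf{a}, \mathbf{b}, \mathbf{n}\}$ is right-handed.

$\mathbf{a}\times\mathbf{b} = \mathbf{0}$ if the non-zero vectors are parallel/anti-parallel.

$$\mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a}$$

$$(\lambda\mathbf{a})\times(\mu\mathbf{b}) = \lambda\mu\,(\mathbf{a}\times\mathbf{b})$$

$$\mathbf{a}\times(\mathbf{b} + \mathbf{c}) = \mathbf{a}\times\mathbf{b} + \mathbf{a}\times\mathbf{c}$$

$$\mathbf{i}\times\mathbf{i} = \mathbf{j}\times\mathbf{j} = \mathbf{k}\times\mathbf{k} = \mathbf{0}$$

$$\mathbf{i}\times\mathbf{j} = \mathbf{k}, \qquad \mathbf{j}\times\mathbf{k} = \mathbf{i}, \qquad \mathbf{k}\times\mathbf{i} = \mathbf{j}$$

scalar triple product

$$\mathbf{a}\cdot\mathbf{b}\times\mathbf{c} = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}$$

$$\mathbf{a}\cdot\mathbf{b}\times\mathbf{c} = \mathbf{b}\cdot\mathbf{c}\times\mathbf{a} = \mathbf{c}\cdot\mathbf{a}\times\mathbf{b}$$

vector triple product

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\,\mathbf{c}$$

scalar 4-product

$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) = (\mathbf{a}\cdot\mathbf{c})\,(\mathbf{b}\cdot\mathbf{d}) - (\mathbf{a}\cdot\mathbf{d})\,(\mathbf{b}\cdot\mathbf{c})$$

vector 4-product

$$(\mathbf{a}\times\mathbf{b})\times(\mathbf{c}\times\mathbf{d}) = (\mathbf{d}\cdot\mathbf{a}\times\mathbf{b})\,\mathbf{c} - (\mathbf{c}\cdot\mathbf{a}\times\mathbf{b})\,\mathbf{d} = (\mathbf{a}\cdot\mathbf{c}\times\mathbf{d})\,\mathbf{b} - (\mathbf{b}\cdot\mathbf{c}\times\mathbf{d})\,\mathbf{a}$$

A.3

Gradient operator

gradient operator

$$\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}$$

gradient of a scalar function

$$\nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$$

Laplacian operator

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$

$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}$$

$$\nabla^2 \mathbf{a} = (\nabla^2 a_x)\,\mathbf{i} + (\nabla^2 a_y)\,\mathbf{j} + (\nabla^2 a_z)\,\mathbf{k}$$

divergence of a vector field

$$\nabla\cdot\mathbf{a} = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}$$

curl of a vector field

$$\nabla\times\mathbf{a} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ a_x & a_y & a_z \end{vmatrix}$$

an operator

$$\mathbf{a}\cdot\nabla = a_x\,\frac{\partial}{\partial x} + a_y\,\frac{\partial}{\partial y} + a_z\,\frac{\partial}{\partial z}$$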


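A.4

Vector identities involving the gradient

$$\nabla(fg) = f\,\nabla g + g\,\nabla f$$

$$\nabla\times\nabla f = \mathbf{0}$$

$$\nabla\cdot(\nabla\times\mathbf{a}) = 0$$

$$\nabla\times(\nabla\times\mathbf{a}) = \nabla(\nabla\cdot\mathbf{a}) - \nabla^2\mathbf{a}$$

$$\nabla\cdot(f\mathbf{a}) = f\,\nabla\cdot\mathbf{a} + \mathbf{a}\cdot\nabla f$$

$$\nabla\times(f\mathbf{a}) = f\,\nabla\times\mathbf{a} + (\nabla f)\times\mathbf{a}$$

$$\nabla(\mathbf{a}\cdot\mathbf{b}) = (\mathbf{a}\cdot\nabla)\,\mathbf{b} + (\mathbf{b}\cdot\nabla)\,\mathbf{a} + \mathbf{a}\times(\nabla\times\mathbf{b}) + \mathbf{b}\times(\nabla\times\mathbf{a})$$

$$\nabla\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\nabla\times\mathbf{a}) - \mathbf{a}\cdot(\nabla\times\mathbf{b})$$

$$\nabla\times(\mathbf{a}\times\mathbf{b}) = \mathbf{a}\,(\nabla\cdot\mathbf{b}) - \mathbf{b}\,(\nabla\cdot\mathbf{a}) + (\mathbf{b}\cdot\nabla)\,\mathbf{a} - (\mathbf{a}\cdot\nabla)\,\mathbf{b}$$

$$\nabla\cdot\mathbf{r} = 3$$

$$\nabla\times\mathbf{r} = \mathbf{0}$$

$$\nabla\cdot\left(\frac{1}{r}\,\mathbf{r}\right) = \frac{2}{r}$$

$$\nabla\times\left(\frac{1}{r}\,\mathbf{r}\right) = \mathbf{0}$$

$$\nabla^2\left(\frac{1}{r}\right) = -4\pi\,\delta(\mathbf{r})$$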
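Two of the identities involving the position vector can be spot-checked with central differences. This is only a sketch: the helper names and the sample point $(1, 2, 2)$, for which $r = 3$, are illustrative assumptions.

```python
import math


def divergence(field, point, h=1e-5):
    """Central-difference estimate of div(field) at a point in Cartesian coordinates."""
    total = 0.0
    for i in range(3):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        total += (field(plus)[i] - field(minus)[i]) / (2.0 * h)
    return total


def position(p):
    """The field r = (x, y, z)."""
    return tuple(p)


def unit_radial(p):
    """The field r / |r|."""
    r = math.sqrt(sum(c * c for c in p))
    return tuple(c / r for c in p)


point = (1.0, 2.0, 2.0)   # |r| = 3
print(divergence(position, point))      # expected close to 3
print(divergence(unit_radial, point))   # expected close to 2 / 3
```

The second value illustrates $\nabla\cdot(\mathbf{r}/r) = 2/r$ away from the origin; the delta-function identity for $\nabla^2(1/r)$ cannot be checked this way because it concerns the singular point $r = 0$.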


A.5

Vector theorems

Irrotational fields

If the curl of a vector field $\mathbf{a}$ is zero ($\nabla\times\mathbf{a} = \mathbf{0}$) then there must exist a scalar field $f$ such that

$$\mathbf{a} = \nabla f$$

$f$ is unique to within a constant. $\mathbf{a}$ is said to be irrotational or conservative. $f$ is the corresponding scalar potential. The line integral between two points, $P_1$ and $P_2$, is independent of the path $C$ joining them

$$\int_C \mathbf{a}\cdot d\mathbf{r} = \int_{P_1}^{P_2} \nabla f\cdot d\mathbf{r} = f(P_2) - f(P_1)$$

If $C$ is a closed curve, then

$$\oint_C \mathbf{a}\cdot d\mathbf{r} = 0$$

Solenoidal fields

If the divergence of a vector field $\mathbf{a}$ is zero ($\nabla\cdot\mathbf{a} = 0$) then there must exist a vector field $\mathbf{b}$ such that

$$\mathbf{a} = \nabla\times\mathbf{b}$$

$\mathbf{b}$ is unique to within the gradient of a scalar function. $\mathbf{a}$ is said to be solenoidal. $\mathbf{b}$ is the corresponding vector potential.

The divergence theorem

$V$ is a volume, with volume element $dv$. $S$ is the closed surface, with area element $d\sigma$, enclosing $V$. The unit normal $\mathbf{n}$ to $S$ points outwards. Then

$$\int_V \nabla\cdot\mathbf{a}\;dv = \int_S \mathbf{a}\cdot\mathbf{n}\;d\sigma$$

Also

$$\int_V \nabla f\;dv = \int_S f\,\mathbf{n}\;d\sigma$$

$$\int_V \nabla\times\mathbf{a}\;dv = \int_S \mathbf{n}\times\mathbf{a}\;d\sigma$$

Green’s identities

First identity:

$$\int_V (f\,\nabla^2 g + \nabla f\cdot\nabla g)\;dv = \int_S f\,\nabla g\cdot\mathbf{n}\;d\sigma$$

Second identity:

$$\int_V (f\,\nabla^2 g - g\,\nabla^2 f)\;dv = \int_S (f\,\nabla g - g\,\nabla f)\cdot\mathbf{n}\;d\sigma$$

Stokes’ theorem

$S$ is an open surface, with area element $d\sigma$, spanning the closed curve $C$, with line element $d\mathbf{r}$. The unit normal $\mathbf{n}$ to $S$ is defined by the right-hand rule in relation to the sense of the line integral around $C$. Then

$$\int_S (\nabla\times\mathbf{a})\cdot\mathbf{n}\;d\sigma = \oint_C \mathbf{a}\cdot d\mathbf{r}$$

Also

$$\int_S \mathbf{n}\times\nabla f\;d\sigma = \oint_C f\;d\mathbf{r}$$

A.6

Curvilinear coordinates

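Consider the coordinate transformation from Cartesian coordinates $\{x_1, x_2, x_3\}$ to curvilinear coordinates $\{u_1, u_2, u_3\}$, i.e.

$$x_1 = x_1(u_1, u_2, u_3) \qquad x_2 = x_2(u_1, u_2, u_3) \qquad x_3 = x_3(u_1, u_2, u_3)$$

The Jacobian is

$$J = \begin{vmatrix} \dfrac{\partial x_1}{\partial u_1} & \dfrac{\partial x_1}{\partial u_2} & \dfrac{\partial x_1}{\partial u_3} \\[2ex] \dfrac{\partial x_2}{\partial u_1} & \dfrac{\partial x_2}{\partial u_2} & \dfrac{\partial x_2}{\partial u_3} \\[2ex] \dfrac{\partial x_3}{\partial u_1} & \dfrac{\partial x_3}{\partial u_2} & \dfrac{\partial x_3}{\partial u_3} \end{vmatrix}$$

If $J$ does not vanish throughout some region $R$ of space then in $R$ the transformation equations can be solved uniquely for $u_i$ in terms of $x_i$. Then a point $P$ with Cartesian coordinates $(x_1^P, x_2^P, x_3^P)$ corresponds to the point with the corresponding coordinates $(u_1^P, u_2^P, u_3^P)$. The point $P$ is the intersection of the 3 one-parameter curves $\mathbf{r}(u_1, u_2^P, u_3^P)$, $\mathbf{r}(u_1^P, u_2, u_3^P)$ and $\mathbf{r}(u_1^P, u_2^P, u_3)$. The unit tangents at the point $P$ are

$$\mathbf{e}_i = \frac{1}{h_i}\,\frac{\partial\mathbf{r}}{\partial u_i} \qquad (i = 1, 2, 3)$$

where the scale factors are

$$h_i = \left|\frac{\partial\mathbf{r}}{\partial u_i}\right| = \sqrt{\sum_{j=1}^{3}\left(\frac{\partial x_j}{\partial u_i}\right)^2}$$

Also

$$d\mathbf{r} = \sum_{i=1}^{3}\frac{\partial\mathbf{r}}{\partial u_i}\,du_i = \sum_{i=1}^{3} h_i\,\mathbf{e}_i\,du_i$$

and

$$\frac{\partial\mathbf{r}}{\partial u_i} = \sum_{j=1}^{3}\frac{\partial x_j}{\partial u_i}\,\frac{\partial\mathbf{r}}{\partial x_j} \qquad\text{and}\qquad du_i = \sum_{j=1}^{3}\frac{\partial u_i}{\partial x_j}\,dx_j$$

A.7

Orthogonal curvilinear coordinates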

Orthogonal curvilinear coordinates are those for which the unit tangent vectors are orthogonal. We choose the sense of the unit vectors so that they form a right-handed set. The element of arc length $ds$ is given by

$$\begin{aligned} (ds)^2 &= d\mathbf{r}\cdot d\mathbf{r} \\ &= (dx_1)^2 + (dx_2)^2 + (dx_3)^2 \\ &= h_1^2\,(du_1)^2 + h_2^2\,(du_2)^2 + h_3^2\,(du_3)^2 \end{aligned}$$

The volume element $dv$ is

$$dv = h_1 h_2 h_3\;du_1\,du_2\,du_3$$

The surface element $d\sigma_i$ on the surface where $u_i$ is constant is

$$d\sigma_i = h_j h_k\;du_j\,du_k$$

where $i$, $j$, $k$ are all different.

A.8

Cylindrical polar coordinates

The coordinates $(\rho, \phi, z)$ are defined as:

$$x = \rho\cos\phi \qquad y = \rho\sin\phi \qquad z = z$$

with $0 \le \phi < 2\pi$. See Figure 21.

[Figure 21: Cylindrical polar coordinates. The point $(\rho, \phi, z)$ is shown relative to the x-axis, the y-axis (into plane) and the z-axis; $0 \le \phi < 2\pi$.]

The scale factors are:

$$h_1 = 1 \qquad h_2 = \rho \qquad h_3 = 1$$

The unit vectors are:

$$\mathbf{e}_1 = \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j} \qquad \mathbf{e}_2 = -\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j} \qquad \mathbf{e}_3 = \mathbf{k}$$

Write a vector as:

$$\mathbf{a} = a_1\,\mathbf{e}_1 + a_2\,\mathbf{e}_2 + a_3\,\mathbf{e}_3$$

$$\nabla f = \frac{\partial f}{\partial\rho}\,\mathbf{e}_1 + \frac{1}{\rho}\,\frac{\partial f}{\partial\phi}\,\mathbf{e}_2 + \frac{\partial f}{\partial z}\,\mathbf{e}_3$$

$$\nabla^2 f = \frac{1}{\rho}\,\frac{\partial}{\partial\rho}\left(\rho\,\frac{\partial f}{\partial\rho}\right) + \frac{1}{\rho^2}\,\frac{\partial^2 f}{\partial\phi^2} + \frac{\partial^2 f}{\partial z^2}$$

$$\nabla\cdot\mathbf{a} = \frac{1}{\rho}\,\frac{\partial}{\partial\rho}(\rho\,a_1) + \frac{1}{\rho}\,\frac{\partial a_2}{\partial\phi} + \frac{\partial a_3}{\partial z}$$

$$\nabla\times\mathbf{a} = \left(\frac{1}{\rho}\,\frac{\partial a_3}{\partial\phi} - \frac{\partial a_2}{\partial z}\right)\mathbf{e}_1 + \left(\frac{\partial a_1}{\partial z} - \frac{\partial a_3}{\partial\rho}\right)\mathbf{e}_2 + \frac{1}{\rho}\left(\frac{\partial}{\partial\rho}(\rho\,a_2) - \frac{\partial a_1}{\partial\phi}\right)\mathbf{e}_3$$

A.9

Spherical polar coordinates

The coordinates $(r, \theta, \phi)$ are defined as:

$$x = r\sin\theta\cos\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\theta$$

with $0 \le \theta \le \pi$ and $0 \le \phi < 2\pi$. See Figure 22.

[Figure 22: Spherical polar coordinates. The point $(r, \theta, \phi)$ is shown relative to the x-axis, the y-axis (into plane) and the z-axis; $0 \le \theta \le \pi$, $0 \le \phi < 2\pi$.]

The scale factors are:

$$h_1 = 1 \qquad h_2 = r \qquad h_3 = r\sin\theta$$

The unit vectors are:

$$\mathbf{e}_1 = \sin\theta\cos\phi\,\mathbf{i} + \sin\theta\sin\phi\,\mathbf{j} + \cos\theta\,\mathbf{k}$$

$$\mathbf{e}_2 = \cos\theta\cos\phi\,\mathbf{i} + \cos\theta\sin\phi\,\mathbf{j} - \sin\theta\,\mathbf{k}$$

$$\mathbf{e}_3 = -\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}$$

Write a vector as:

$$\mathbf{a} = a_1\,\mathbf{e}_1 + a_2\,\mathbf{e}_2 + a_3\,\mathbf{e}_3$$

$$\nabla f = \frac{\partial f}{\partial r}\,\mathbf{e}_1 + \frac{1}{r}\,\frac{\partial f}{\partial\theta}\,\mathbf{e}_2 + \frac{1}{r\sin\theta}\,\frac{\partial f}{\partial\phi}\,\mathbf{e}_3$$

$$\nabla^2 f = \frac{1}{r^2}\,\frac{\partial}{\partial r}\left(r^2\,\frac{\partial f}{\partial r}\right) + \frac{1}{r^2\sin\theta}\,\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial f}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\,\frac{\partial^2 f}{\partial\phi^2}$$

$$\nabla\cdot\mathbf{a} = \frac{1}{r^2}\,\frac{\partial}{\partial r}\left(r^2 a_1\right) + \frac{1}{r\sin\theta}\,\frac{\partial}{\partial\theta}(\sin\theta\,a_2) + \frac{1}{r\sin\theta}\,\frac{\partial a_3}{\partial\phi}$$

$$\begin{aligned} \nabla\times\mathbf{a} ={}& \frac{1}{r\sin\theta}\left(\frac{\partial}{\partial\theta}(\sin\theta\,a_3) - \frac{\partial a_2}{\partial\phi}\right)\mathbf{e}_1 \\ &+ \left(\frac{1}{r\sin\theta}\,\frac{\partial a_1}{\partial\phi} - \frac{1}{r}\,\frac{\partial}{\partial r}(r\,a_3)\right)\mathbf{e}_2 \\ &+ \frac{1}{r}\left(\frac{\partial}{\partial r}(r\,a_2) - \frac{\partial a_1}{\partial\theta}\right)\mathbf{e}_3 \end{aligned}$$
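The scale factors quoted for cylindrical and spherical polar coordinates can be recovered numerically from the general definition $h_i = |\partial\mathbf{r}/\partial u_i|$. The sketch below uses central differences; the function names and the sample points are illustrative assumptions.

```python
import math


def cylindrical(u):
    """Cartesian (x, y, z) from cylindrical polars (rho, phi, z)."""
    rho, phi, z = u
    return (rho * math.cos(phi), rho * math.sin(phi), z)


def spherical(u):
    """Cartesian (x, y, z) from spherical polars (r, theta, phi)."""
    r, theta, phi = u
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def scale_factors(transform, u, h=1e-6):
    """h_i = |d r / d u_i|, with the derivative estimated by central differences."""
    factors = []
    for i in range(3):
        plus = list(u)
        minus = list(u)
        plus[i] += h
        minus[i] -= h
        x_plus = transform(plus)
        x_minus = transform(minus)
        factors.append(math.sqrt(sum(((p - m) / (2.0 * h)) ** 2
                                     for p, m in zip(x_plus, x_minus))))
    return factors


print(scale_factors(cylindrical, (2.0, 1.1, 0.3)))   # expected close to 1, rho, 1
print(scale_factors(spherical, (2.0, 0.7, 1.1)))     # expected close to 1, r, r sin(theta)
```

The same `scale_factors` helper applies to any coordinate transformation supplied as a function, which makes it a convenient check when deriving $h_i$ for a new orthogonal system.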

May 11, 2004 5:16pm

B

B

ODDS AND ENDS

Odds and ends

Note 1 René Descartes (1596-1650) was a French philosopher and mathematician. With independent means he was able to travel and work on philosophy and science. He spent some time as a soldier. His great contribution to mathematics was the creation of analytical or coordinate geometry. This allowed geometry problems to be solved by algebra and vice versa. It was one of the most important steps ever taken in the progress of the exact sciences.

Note 2 Georg Friedrich Bernhard Riemann (1826-66) was a student of Karl Gauss and Wilhelm Weber. He succeeded Dirichlet as a professor of mathematics at Göttingen in 1859. He died of tuberculosis. His important contributions to mathematics include Riemann surfaces in complex analysis, the Riemann zeta function and the Riemann integral. His doctoral thesis, delivered in 1854, was on geometry and generalised the work of Gauss (1827) on 2-dimensional surfaces to N-dimensions. Riemann foresaw its importance for physics and laid the mathematical foundations that led to Einstein’s general theory of relativity (1915).

Note 3 Lord Kelvin (1824–1907) was born William Thomson in Belfast of Scottish ancestry. His father, James Thomson, was professor of mathematics at the Belfast Academical Institution from 1815 till 1832. Then he was appointed professor of mathematics at Glasgow University. William moved with his family to Glasgow when he was 8. At the age of 10 he matriculated at the University. He was a child prodigy and had a brilliant career in pure and applied mathematics at both Glasgow and Cambridge. While at Cambridge he was influenced by George Green and Robert Murphy on the theoretical side and by Michael Faraday on the practical aspects of electricity. His interests lay in the theories of electricity and gravitation, and potential theory. In 1846 he returned to Glasgow as Professor of Natural Philosophy. Among a wide range of interests were heat, fluid mechanics and the age of the earth.

Here is a rather damning quote on vectors:

Quaternions came from Hamilton after his really good work had been done; and though beautifully ingenious, have been an unmixed evil to those who have touched them in any way · · · Vector is a useless survival, or offshoot from quaternions, and has never been of the slightest use to any creature.

taken from ‘Mathematical Thought from Ancient to Modern Times’ by Morris Kline, OUP 1972.

Kelvin was President of the Royal Society from 1890 till 1895. The following quotes, taken from ‘Return of Heroic Failures’ by Stephen Pile 1988, date from that time:

Radio has no future.

Heavier than air flying machines are impossible.

X-rays will prove to be a hoax.


which only goes to prove that science is a young person’s game.

Note 4 Elie Joseph Cartan (1869-1951) was a French mathematician who made important contributions to algebra and geometry. He added greatly to the theory of continuous groups started by Sophus Lie (pronounced Lee). His work included differential geometry, topology and relativity.

Note 5 Leopold Kronecker (1823-1891) was a wealthy businessman who at the same time was an active and highly respected mathematician. In 1883 he became professor at the University of Berlin. Kronecker accepted only mathematical proofs that could be constructed in a finite number of steps. He believed that the natural numbers 1, 2, 3, . . . were the basis of all mathematics and he opposed even negative numbers. A bitter dispute arose with Georg Cantor (1845-1918) whose work on set theory and trans-finite arithmetic was anathema to Kronecker. This dispute contributed at least partly to Cantor’s mental breakdowns.

Note 6 This name was given by Sylvester in honour of Jacobi’s work on algebra and elimination theory. Jacobi greatly developed the theory of determinants and established its usefulness. Carl Gustav Jacob Jacobi (1804-1851) was the son of a Berlin banker. He studied in Berlin and taught as a young professor at Königsberg from 1826 to 1843. He spent some time in Italy due to ill-health and died while a professor at the University of Berlin. He was a witty and liberal thinker who made important contributions to the theory of elliptic functions and in dynamics (Hamilton-Jacobi equation).

Note 7 James Joseph Sylvester (1814-1897) worked with Cayley and together they began the theory of algebraic invariants. Sylvester was a wit, poet and creator of many new terms in mathematics. Besides Jacobian he coined the terms invariant, covariant, contravariant, cogredient and syzygy. He taught at Woolwich Military Academy and visited America twice. Two important contributions which he made to algebra were the theory of elementary divisors and the law of inertia of quadratic forms.

Note 8 Elwin Bruno Christoffel (1829-1900) was a German mathematician. He was a professor at Zurich and later at Strasbourg. He developed Riemannian geometry following the publication of Riemann’s 1854 essay in 1868. A key idea due to Christoffel was the notion of covariant differentiation.

Note 9 The idea of parallel displacement is due to the Italian mathematician Tullio Levi-Civita (1873-1941). He is best known for his work on the absolute differential calculus with its application to the theory of relativity. In 1900 he published, jointly with Ricci, the theory of tensors in a form that was used by Einstein in his theory of relativity 15 years later.

Note 10 Gregorio Ricci-Curbastro (1853-1925) was professor of mathematical physics at Padua. Contact with German mathematicians and especially a paper by Christoffel influenced his research and together with his student Levi-Civita he developed tensor calculus, as we know and love it.


Note 11 Luigi Bianchi (1856-1928) was professor of analytic geometry at Pisa. He was the first to use the term differential geometry to describe the study of the properties of curves and surfaces which vary from point to point. He influenced the work of Ricci.

Note 12 The operator ∇ was originally introduced by Hamilton. It is known as del or grad but Hamilton called it nabla because of its resemblance to an ancient Hebrew musical instrument of that name.

Note 13 Newton, Sir Isaac. Born Dec. 25, 1642 [Jan. 4, 1643, New Style], Woolsthorpe, Lincolnshire, England; died March 20 [March 31], 1727, London. English physicist and mathematician, who was the culminating figure of the scientific revolution of the 17th century. In optics, his discovery of the composition of white light integrated the phenomena of colours into the science of light and laid the foundation for modern physical optics. In mechanics, his three laws of motion, the basic principles of modern physics, resulted in the formulation of the law of universal gravitation. In mathematics, he was the original discoverer of the infinitesimal calculus. Newton’s Philosophiae Naturalis Principia Mathematica (Mathematical Principles of Natural Philosophy), 1687, was one of the most important single works in the history of modern science.

www.britannica.com, April 2001

Note 14 Lorentz, Hendrik Antoon. Born July 18, 1853, Arnhem, Netherlands; died Feb. 4, 1928, Haarlem. Dutch physicist and joint winner (with Pieter Zeeman) of the Nobel Prize for Physics in 1902 for his theory of electromagnetic radiation, which, confirmed by findings of Zeeman, gave rise to Albert Einstein’s special theory of relativity. In his doctoral thesis at the University of Leiden (1875), Lorentz refined the electromagnetic theory of James C. Maxwell of England so that it more satisfactorily explained the reflection and refraction of light. He was appointed professor of mathematical physics at Leiden in 1878.

His work in physics was wide in scope, but his central aim was to construct a single theory to explain the relationship of electricity, magnetism, and light. Although, according to Maxwell’s theory, electromagnetic radiation is produced by the oscillation of electric charges, the charges that produce light were unknown. Since it was generally believed that an electric current was made up of charged particles, Lorentz later theorised that the atoms of matter might also consist of charged particles and suggested that the oscillations of these charged particles (electrons) inside the atom were the source of light. If this were true, then a strong magnetic field ought to have an effect on the oscillations and therefore on the wavelength of the light thus produced. In 1896 Zeeman, a pupil of Lorentz, demonstrated this phenomenon, known as the Zeeman effect, and in 1902 they were awarded the Nobel Prize. Lorentz’ electron theory was not, however, successful in explaining the negative results of the Michelson-Morley experiment, an effort to measure the velocity of the Earth


through the hypothetical luminiferous ether by comparing the velocities of light from different directions. In an attempt to overcome this difficulty he introduced in 1895 the idea of local time (different time rates in different locations). Lorentz arrived at the notion that moving bodies approaching the velocity of light contract in the direction of motion. The Irish physicist George Francis Fitzgerald had already arrived at this notion independently (the Lorentz-Fitzgerald contraction), and in 1904 Lorentz extended his work and developed the Lorentz transformations. These mathematical formulas describe the increase of mass, shortening of length, and dilation of time that are characteristic of a moving body and form the basis of Einstein’s special theory of relativity. In 1912 Lorentz became director of research at the Teyler Institute, Haarlem, though he remained honorary professor at Leiden, where he gave weekly lectures.

www.britannica.com, April 2001

Note 15 Minkowski, Hermann. Born June 22, 1864, Aleksotas, Russian Empire [now in Kaunas, Lithuania]; died Jan. 12, 1909, Göttingen, Germany. German mathematician who developed the geometrical theory of numbers and who used geometrical methods to solve difficult problems in number theory, mathematical physics, and the theory of relativity. His idea of a four-dimensional space (since known as ‘Minkowski space’), combining the three dimensions of physical space with that of time, laid the mathematical foundation of Albert Einstein’s general theory of relativity. The son of German parents living in Russia (his brother was Oskar Minkowski who did important work on diabetes), he returned to Germany with his parents in 1872 and spent his youth in the royal Prussian city of Königsberg, from whose university he received a doctorate in 1885. He taught mathematics at Bonn (1885-94), Königsberg (1894-96), Zurich (1896-1902), and the University of Göttingen (1902-09). His major work was Raum und Zeit (1907; ‘Space and Time’).

www.britannica.com, April 2001

Note 16 Broglie, Louis-Victor, 7e duc de. Born Aug. 15, 1892, Dieppe, France; died March 19, 1987, Paris. In full Louis-Victor-Pierre-Raymond, 7e duc de Broglie. French physicist best known for his research on quantum theory and for his discovery of the wave nature of electrons. He was awarded the 1929 Nobel Prize for Physics.

www.britannica.com, April 2001

Note 17 George Francis Fitzgerald (1851-1901) was professor of Natural Philosophy at Trinity College Dublin. He was the son of William Fitzgerald, Bishop of Cork. His idea to explain the Michelson-Morley experiment was that the motion of bodies relative to the ether produced a shortening in length of the particles of which matter is composed. Lorentz later examined the idea mathematically.


Note 18 Ernst Mach (1838-1916) was an Austrian theoretical physicist. He fundamentally reappraised the philosophy of science and is known as the ‘father of logical positivism’. Mach was professor of mathematics at Graz (1864), followed by a chair in experimental physics in Prague (1867). In 1895 he moved to a chair in philosophy at Vienna. He retired in 1901 after a slight stroke produced partial paralysis. He believed that science contained abstract and untestable models and concepts, and that science should discard anything that was not observable. Mach disliked Einstein’s theory of relativity even though it was based on his own ideas of the origin of inertia.

I can accept the theory of relativity as little as I can accept the existence of atoms and other dogmas.

from ‘The Book of Heroic Failures’ by Stephen Pile, 1979.

Mach did some experimental work and published photographs of projectiles in flight together with their accompanying shock waves. The Mach number is the ratio of projectile speed to the speed of sound in the same medium. At Mach 1, speed is sonic. Subsonic is less than Mach 1 while supersonic is above Mach 1.

Note 19 Baron Roland von Eötvös (1848-1919) was a Hungarian physicist who introduced the concept of molecular surface tension. (Eötvös is pronounced something like ootvoosh with oo as in foot.) His use of a torsion balance to test the Equivalence Principle to such high accuracy was a great achievement given the technology available.

Note 20 Robert Dicke (1916-1997) was an American physicist noted for his work in general relativity. Most of his career was spent at Princeton. In the 1940s he contributed to research in microwave physics including radar and spectroscopy. Later his interest in gravitation grew and he carried out a series of experiments on the Equivalence Principle using the Eötvös torsion balance with improved accuracy. With Carl Brans, he developed a theory of gravity in which the gravitational constant varies with time following an idea due to Dirac. In 1964 Dicke and others suggested the existence of a universal background microwave radiation, a residual effect from the initial big-bang. He was unaware that Gamow, Alpher and Herman had predicted this 16 years earlier. Just before Dicke searched for this radiation, it was discovered fortuitously by Penzias and Wilson. This reminds me of the famous paper by Alpher, Bethe and Gamow, an excellent physics joke.

Note 21 Karl Schwarzschild (1873-1916) published the solution in January 1916, just about two months after Einstein had published the basic equations of General Relativity. Einstein’s comment was: ‘I have read your paper with the greatest interest. I had not expected that one could formulate the exact solution of the problem so simply. The analytical treatment of the problem appears to me splendid.’ During the spring and summer of 1915, Schwarzschild was serving in the German army at the eastern front. While at the eastern front with a small technical staff, Schwarzschild contracted pemphigus, an unpleasant and then fatal skin disease, and he died on 11 May

1916. It was during this period of illness that Schwarzschild wrote his two papers on relativity, and a fundamental one on the Bohr-Sommerfeld theory.

Note 22 Einstein had predicted as early as 1911 that light from a star passing close to the sun would be deflected by its mass. The General Relativity theory of 1915 predicted the size of the deflection. In order to detect this it was necessary to view the stars near the sun at a solar eclipse. Due to the First World War the first opportunity was in 1919, and two expeditions were organised by Arthur Eddington, one to Brazil and the other to Principe Island off West Central Africa. The results were unambiguous in confirming Einstein’s predictions. On November 7, 1919, the London Times announced ‘Revolution in Science / New Theory of the Universe / Newtonian Ideas Overthrown’. Overnight Einstein was a public celebrity. Eddington’s book on relativity, ‘The Mathematical Theory of Relativity’, was praised by Einstein as the best exposition of his theory.

Note 23 Compton, Arthur Holly; born Sept. 10, 1892, Wooster, Ohio, U.S.; died March 15, 1962, Berkeley, Calif. American physicist and joint winner, with C.T.R. Wilson of England, of the Nobel Prize for Physics in 1927 for his discovery and explanation of the change in the wavelength of X rays when they collide with electrons in metals. This so-called Compton effect is caused by the transfer of energy from a photon to an electron. Its discovery in 1922 confirmed the dual nature of electromagnetic radiation as both a wave and a particle. Compton, a younger brother of the physicist Karl T. Compton, received his doctorate from Princeton University in 1916 and became head of the department of physics at Washington University, St. Louis, in 1920. Compton’s Nobel Prize winning research focused on the strange phenomena that occur when beams of short-wavelength X rays are aimed at elements of low atomic weight. He discovered that some of the X rays scattered by the elements are of longer wavelength than they were before being scattered. This result is contrary to the laws of classical physics, which could not explain why the scattering of a wave should increase its wavelength. Compton initially theorised that the size and shape of electrons in the target atoms could account for the change in the X rays’ wavelength. In 1922, however, he concluded that Einstein’s quantum theory, which argued that light consists of particles rather than waves, offered a better explanation of the effect. In his new model, Compton interpreted X rays as consisting of particles, or photons, as he called them. He argued that an X-ray photon can collide with an electron of a carbon atom; when this happens, the photon transfers some of its energy to the electron and then continues on with diminished energy and a longer wavelength than it had before. Compton’s interpretation provided the first widely accepted experimental evidence that electromagnetic radiation can exhibit both particle and wave behaviour, and thus helped to establish the legitimacy of the still-radical quantum theory. From 1923 to 1945 Compton was a professor of physics at the University of Chicago. In 1941 he was chairman of the committee of the National Academy of Sciences that
studied the military potential of atomic energy. In this capacity he was instrumental, with the physicist Ernest O. Lawrence, in initiating the Manhattan Project which created the first atomic bomb. From 1942 to 1945 he was director of the Metallurgical Laboratory at the University of Chicago, which developed the first self-sustaining atomic chain reaction and paved the way for controlled release of nuclear energy. He became chancellor of Washington University in 1945 and was professor of natural history there from 1953 until 1961. www.britannica.com, April 2001

C Biography of Maxwell

James Clerk Maxwell is considered one of the greatest physicists. In a millennium poll conducted amongst its members by the Institute of Physics, Maxwell was rated third (67 votes), beaten by Newton (96 votes) and Einstein (119 votes). While one shouldn’t attach too much significance to such a beauty contest, it does indicate the high regard for his work. Richard Feynman, who liked a good joke, described Maxwell’s equations for electromagnetism as ‘the most significant event of the 19th century’. This seems exaggerated but he was taking a very long-term point of view, probably stardate 25673.5.

Maxwell’s main scientific achievements were:

• Unification of electricity and magnetism through the Maxwell equations. He showed that light is electromagnetic, in fact the visible part of a broad spectrum ranging from radio waves and microwaves through to X-rays and γ-rays.

• Dynamical theory of gases using a statistical approach. Boltzmann also contributed significantly to this area.

• Foundation of the Cavendish Laboratory at Cambridge. This initiated a change in the way that science was done. His successors as Cavendish professor were Rayleigh, Thomson and Rutherford, an impressive list.

Maxwell wrote two important texts: ‘Theory of Heat’ (1871) and ‘Treatise on Electricity and Magnetism’ (1873). He made important contributions to vector analysis as this was essential to his treatment of electromagnetism. Maxwell had a strong sense of humour and conservative religious beliefs. During his lifetime Maxwell was recognised as an exceptional scientist but did not achieve his present fame. Full recognition came gradually after his death, associated partly with the technology that arose from his work on electromagnetism.

He is buried at Parton, overlooking Loch Ken. Parton is on the A713, which runs north from Castle Douglas towards New Galloway. Rather than rush along the A75 from Stranraer to Carlisle, why not take an hour’s diversion to visit his grave?

Maxwell, James Clerk b. June 13, 1831, Edinburgh, Scotland d. November 5, 1879, Cambridge, Cambridgeshire, England

Scottish physicist best known for his formulation of electromagnetic theory. He is regarded by most modern physicists as the scientist of the 19th century who had the greatest influence on 20th-century physics, and he is ranked with Sir Isaac Newton and Albert Einstein for the fundamental nature of his contributions. In 1931, on the 100th anniversary of Maxwell’s birth, Einstein described the change in the conception of reality in physics that resulted from Maxwell’s work as “the most profound and the most fruitful that physics has experienced since the time of Newton.”

The concept of electromagnetic radiation originated with Maxwell, and his field equations, based on Michael Faraday’s observations of the electric and magnetic lines of force, paved the way for Einstein’s special theory of relativity, which established the equivalence of mass and energy. Maxwell’s ideas also ushered in the other major innovation of 20th-century physics, the quantum theory. His description of electromagnetic radiation led to the development (according to classical theory) of the ultimately unsatisfactory law of heat radiation, which prompted Max Planck’s formulation of the quantum hypothesis – i.e., the theory that radiant-heat energy is emitted only in finite amounts, or quanta. The interaction between electromagnetic radiation and matter, integral to Planck’s hypothesis, in turn has played a central role in the development of the theory of the structure of atoms and molecules.

Early life

Maxwell came from a comfortable middle-class background. The original family name was Clerk, the additional surname being added by his father, who was a lawyer, after he had inherited the Middlebie estate from Maxwell ancestors. James was an only child. His parents had married late in life, and his mother was 40 years old at his birth. Shortly afterward the family moved from Edinburgh to Glenlair, the country house on the Middlebie estate. His mother died in 1839 from abdominal cancer, the very disease to which Maxwell was to succumb at exactly the same age. A dull and uninspired tutor was engaged who claimed that James was slow at learning, though in fact he displayed a lively curiosity at an early age and had a phenomenal memory. Fortunately he was rescued by his aunt Jane Cay and from 1841 was sent to school at the Edinburgh Academy. Among the other pupils were his biographer Lewis Campbell and his friend Peter Guthrie Tait. Maxwell’s interests ranged far beyond the school syllabus, and he did not pay particular attention to examination performance. His first scientific paper, published when he was only 14 years old, described a generalised series of oval curves that could be traced with pins and thread by analogy with an ellipse. This fascination with geometry and with mechanical models continued throughout his career and was of great help in his subsequent research.

At the age of 16 he entered the University of Edinburgh, where he read voraciously on all subjects and published two more scientific papers. In 1850 he went to the University of Cambridge, where his exceptional powers began to be recognised. His mathematics teacher, William Hopkins, was a well-known “wrangler maker” (a wrangler is one who takes first class honours in the mathematics examinations at Cambridge) whose students included Tait, George Gabriel (later Sir George) Stokes, William Thomson (later Lord Kelvin), Arthur Cayley, and Edward John Routh. Of Maxwell, Hopkins is reported to have said that he was the most extraordinary man he had met with in the whole course of his experience, that it seemed impossible for him to think wrongly on any physical subject, but that in analysis he was far more deficient. (Other contemporaries also testified to Maxwell’s preference
for geometrical over analytical methods.) This shrewd assessment was later borne out by several important formulas advanced by Maxwell that obtained correct results from faulty mathematical arguments.

In 1854 Maxwell was second wrangler and first Smith’s prizeman (the Smith’s prize is a prestigious competitive award for an essay that incorporates original research). He was elected to a fellowship at Trinity, but, because his father’s health was deteriorating, he wished to return to Scotland. In 1856 he was appointed to the professorship of natural philosophy at Marischal College, Aberdeen, but before the appointment was announced his father died. This was a great personal loss, for Maxwell had had a close relationship with his father. In June 1858 Maxwell married Katherine Mary Dewar, daughter of the principal of Marischal College. The union was childless and was described by his biographer as a “married life . . . of unexampled devotion.”

In 1860 the University of Aberdeen was formed by a merger between King’s College and Marischal College, and Maxwell was declared redundant. He applied for a vacancy at the University of Edinburgh, but he was turned down in favour of his school friend Tait. He then was appointed to the professorship of natural philosophy at King’s College, London. The next five years were undoubtedly the most fruitful of his career. During this period his two classic papers on the electromagnetic field were published, and his demonstration of colour photography took place. He was elected to the Royal Society in 1861. His theoretical and experimental work on the viscosity of gases also was undertaken during these years and culminated in a lecture to the Royal Society in 1866. He supervised the experimental determination of electrical units for the British Association for the Advancement of Science, and this work in measurement and standardisation led to the establishment of the National Physical Laboratory. He also measured the ratio of electromagnetic and electrostatic units of electricity and confirmed that it was in satisfactory agreement with the velocity of light as predicted by his theory.

Later life

In 1865 he resigned his professorship at King’s College and retired to the family estate in Glenlair. He continued to visit London every spring and served as external examiner for the Mathematical Tripos (exams) at Cambridge. In the spring and early summer of 1867 he toured Italy. But most of his energy during this period was devoted to writing his famous treatise on electricity and magnetism.

It was Maxwell’s research on electromagnetism that established him among the great scientists of history. In the preface to his Treatise on Electricity and Magnetism (1873), the best exposition of his theory, Maxwell stated that his major task was to convert Faraday’s physical ideas into mathematical form. In attempting to illustrate Faraday’s law of induction (that a changing magnetic field gives rise to an induced electromagnetic field), Maxwell constructed a mechanical model. He found that the model gave rise to a corresponding “displacement current” in the dielectric medium, which could then be the seat of transverse waves. On calculating the velocity of these waves, he found that it was very close to the velocity of light. Maxwell concluded that he could “scarcely avoid the inference that light consists in the transverse undulations of the same medium which is the cause of electric and magnetic phenomena.”

Maxwell’s theory suggested that electromagnetic waves could be generated in a laboratory, a possibility first demonstrated by Heinrich Hertz in 1887, eight years after Maxwell’s death. The resulting radio industry with its many applications thus has its origin in Maxwell’s publications.

In addition to his electromagnetic theory, Maxwell made major contributions to other areas of physics. While still in his 20s, Maxwell demonstrated his mastery of classical physics by writing a prizewinning essay on Saturn’s rings, in which he concluded that the rings must consist of masses of matter not mutually coherent – a conclusion that was corroborated more than 100 years later by the first Voyager space probe to reach Saturn. The Maxwell relations of equality between different partial derivatives of thermodynamic functions are included in every standard textbook on thermodynamics (see thermodynamics). Though Maxwell did not originate the modern kinetic theory of gases, he was the first to apply the methods of probability and statistics in describing the properties of an assembly of molecules. Thus he was able to demonstrate that the velocities of molecules in a gas, previously assumed to be equal, must follow a statistical distribution (known subsequently as the Maxwell-Boltzmann distribution law). In later papers Maxwell investigated the transport properties of gases – i.e., the effect of changes in temperature and pressure on viscosity, thermal conductivity, and diffusion.

Maxwell was far from being an abstruse theoretician. He was skillful in the design of experimental apparatus, as was shown early in his career during his investigations of colour vision. He devised a colour top with adjustable sectors of tinted paper to test the three-colour hypothesis of Thomas Young and later invented a colour box that made it possible to conduct experiments with spectral colours rather than pigments. His investigations of the colour theory led him to conclude that a colour photograph could be produced by photographing through filters of the three primary colours and then recombining the images. He demonstrated his supposition in a lecture to the Royal Institution of Great Britain in 1861 by projecting through filters a colour photograph of a tartan ribbon that had been taken by this method.

In addition to these well-known contributions, a number of ideas that Maxwell put forward quite casually have since led to developments of great significance. The hypothetical intelligent being known as Maxwell’s demon was a factor in the development of information theory. Maxwell’s analytic treatment of speed governors is generally regarded as the founding paper on cybernetics, and his “equal areas” construction provided an essential constituent of the theory of fluids developed by Johannes Diederik van der Waals. His work in geometrical optics led to the discovery of the fish-eye lens. From the start of his career to its finish his papers are filled with novelty and interest. He also was a contributor to the ninth edition of Encyclopædia Britannica.

In 1871 Maxwell was elected to the new Cavendish professorship at Cambridge. He set about designing the Cavendish Laboratory and supervised its construction. Maxwell had few students, but they were of the highest calibre and included William D. Niven, Ambrose (later Sir Ambrose) Fleming, Richard Tetley Glazebrook, John Henry Poynting, and Arthur Schuster. During the Easter term of 1879 Maxwell took ill on several occasions; he returned to Glenlair
in June but his condition did not improve. He died on November 5, after a short illness. Maxwell received no public honours and was buried quietly in a small churchyard in the village of Parton, in Scotland.

© Copyright 1994-2001 Encyclopædia Britannica, Inc.

D Biography of Einstein

Einstein, Albert; born March 14, 1879, Ulm, Württemberg, Germany; died April 18, 1955, Princeton, N.J., U.S.

German-American physicist who developed the special and general theories of relativity and won the Nobel Prize for Physics in 1921 for his explanation of the photoelectric effect. Recognised in his own time as one of the most creative intellects in human history, in the first 15 years of the 20th century Einstein advanced a series of theories that proposed entirely new ways of thinking about space, time and gravitation. His theories of relativity and gravitation were a profound advance over the old Newtonian physics and revolutionised scientific and philosophic inquiry.

Herein lay the unique drama of Einstein’s life. He was a self-confessed lone traveller; his mind and heart soared with the cosmos, yet he could not armour himself against the intrusion of the often horrendous events of the human community. Almost reluctantly he admitted that he had a ‘passionate sense of social justice and social responsibility’. His celebrity gave him an influential voice that he used to champion such causes as pacifism, liberalism, and Zionism. The irony for this idealistic man was that his famous postulation of an energy-mass equation, which states that a particle of matter can be converted into an enormous quantity of energy, had its spectacular proof in the creation of the atomic and hydrogen bombs, the most destructive weapons ever known.

Early life and career

In 1880, the year after Einstein’s birth, his family moved from Ulm to Munich, where Hermann Einstein, his father, and Jakob Einstein, his uncle, set up a small electrical plant and engineering works. In Munich Einstein attended rigidly disciplined schools. Under the harsh and pedantic regimentation of 19th-century German education, which he found intimidating and boring, he showed little scholastic ability. At the behest of his mother, Einstein also studied music; though throughout life he played exclusively for relaxation, he became an accomplished violinist. It was then only Uncle Jakob who stimulated in Einstein a fascination for mathematics and Uncle Cäsar Koch who stimulated a consuming curiosity about science.

By the age of 12 Einstein had decided to devote himself to solving the riddle of the ‘huge world’. Three years later, with poor grades in history, geography, and languages, he left school with no diploma and went to Milan to rejoin his family, who had recently moved there from Germany because of his father’s business setbacks. Albert Einstein resumed his education in Switzerland, culminating in four years of physics and mathematics at the renowned Federal Polytechnic Academy in Zurich. After his graduation in the spring of 1900, he became a Swiss citizen, worked for two months as a mathematics teacher, and then was employed as examiner at the Swiss patent office in Bern. With his newfound security, Einstein married his university sweetheart, Mileva Marić, in 1903.

Early in 1905 Einstein published in the prestigious German physics monthly Annalen der Physik a thesis, ‘A New Determination of Molecular Dimensions’, that won him a Ph.D.
from the University of Zurich. Four more important papers appeared in Annalen that year and forever changed man’s view of the universe. The first of these, ‘On the Motion – Required by the Molecular Kinetic Theory of Heat – of Small Particles Suspended in a Stationary Liquid’, provided a theoretical explanation of Brownian motion. In ‘On a Heuristic Viewpoint Concerning the Production and Transformation of Light’, Einstein postulated that light is composed of individual quanta (later called photons) that, in addition to wavelike behaviour, demonstrate certain properties unique to particles. In a single stroke he thus revolutionised the theory of light and provided an explanation for, among other phenomena, the emission of electrons from some solids when struck by light, called the photoelectric effect.

Einstein’s special theory of relativity, first printed in ‘On the Electrodynamics of Moving Bodies’, had its beginnings in an essay Einstein wrote at age 16. The precise influence of work by other physicists on Einstein’s special theory is still controversial. The theory held that if, for all frames of reference, the speed of light is constant and if all natural laws are the same, then both time and motion are found to be relative to the observer.

In the mathematical progression of the theory, Einstein published his fourth paper, ‘Does the Inertia of a Body Depend Upon Its Energy Content?’. This mathematical footnote to the special theory of relativity established the equivalence of mass and energy, according to which the energy E of a quantity of matter, with mass m, is equal to the product of the mass and the square of the velocity of light, c. This relationship is commonly expressed in the form $E = mc^2$.

Public understanding of this new theory and acclaim for its creator were still many years off, but Einstein had won a place among Europe’s most eminent physicists, who increasingly sought his counsel, as he did theirs. While Einstein continued to develop his theory, attempting now to encompass with it the phenomenon of gravitation, he left the patent office and returned to teaching: first in Switzerland, briefly at the German University in Prague, where he was awarded a full professorship, and then, in the winter of 1912, back at the Polytechnic in Zurich. He was later remembered from this time as a very happy man, content in his marriage and delighted with his two young sons, Hans Albert and Edward.

In April 1914 the family moved to Berlin, where Einstein had accepted a position with the Prussian Academy of Sciences, an arrangement that permitted him to continue his researches with only the occasional diversion of lecturing at the University of Berlin. His wife and two sons vacationed in Switzerland that summer and, with the eruption of World War I, were unable to return to Berlin. A few years later this enforced separation was to lead to divorce. Einstein abhorred the war and was an outspoken critic of German militarism among the generally acquiescent academic community in Berlin, but he was primarily engrossed in perfecting his general theory of relativity, which he published in Annalen der Physik as ‘The Foundation of the General Theory of Relativity’ in 1916. The heart of this postulate was that gravitation is not a force, as Newton had said, but a curved field in the space-time continuum, created by the presence of mass. This notion could be proved or disproved, he suggested, by measuring the deflection of starlight as it travelled close by the Sun, the starlight being visible only during a total eclipse. Einstein predicted twice the light deflection that would be accountable under Newton’s laws.
His new equations also explained for the first time the puzzling irregularity – that is, the slight advance – in the planet Mercury’s perihelion, and they demonstrated why stars in a strong gravitational field emitted light closer to the red end of the spectrum than those in a weaker field.

While Einstein awaited the end of the war and the opportunity for his theory to be tested under eclipse conditions, he became more and more committed to pacifism, even to the extent of distributing pacifist literature to sympathisers in Berlin. His attitudes were greatly influenced by the French pacifist and author Romain Rolland, whom he met on a wartime visit to Switzerland. Rolland’s diary later provided the best glimpse of Einstein’s physical appearance as he reached his middle 30s:

Einstein is still a young man, not very tall, with a wide and long face, and a great mane of crispy, frizzled and very black hair, sprinkled with gray and rising high from a lofty brow. His nose is fleshy and prominent, his mouth small, his lips full, his cheeks plump, his chin rounded. He wears a small cropped moustache. (By permission of Madame Marie Romain Rolland.)

Einstein’s view of humanity during the war period appears in a letter to his friend, the Austrian-born Dutch physicist Paul Ehrenfest:

The ancient Jehovah is still abroad. Alas, he slays the innocent along with the guilty, whom he strikes so fearsomely blind that they can feel no sense of guilt ... We are dealing with an epidemic delusion which, having caused infinite suffering, will one day vanish and become a monstrous and incomprehensible source of wonderment to later generations. (From Otto Nathan and Heinz Norden [eds.], Einstein on Peace; Simon and Schuster, 1960.)

It would be said often of Einstein that he was naive about human affairs; for example, with the proclamation of the German Republic and the armistice in 1918, he was convinced that militarism had been thoroughly abolished in Germany.

International acclaim

International fame came to Einstein in November 1919, when the Royal Society of London announced that its scientific expedition to the island of Principe, in the Gulf of Guinea, had photographed the solar eclipse on May 29 of that year and completed calculations that verified the predictions made in Einstein’s general theory of relativity. Few could understand relativity, but the basic postulates were so revolutionary and the scientific community was so obviously bedazzled that the physicist was acclaimed the greatest genius on Earth. Einstein himself was amazed at the reaction and apparently displeased, for he resented the consequent interruptions of his work. After his divorce he had, in the summer of 1919, married Elsa, the widowed daughter of his late father’s cousin. He lived quietly with Elsa and her two daughters in Berlin, but, inevitably, his views as a foremost savant were sought on a variety of issues.

Despite the now deteriorating political situation in Germany, Einstein attacked nationalism and promoted pacifist ideals. With the rising tide of anti-Semitism in Berlin, Einstein was castigated for his ‘Bolshevism in physics’, and the fury against him in right-wing circles grew when he began publicly to support the Zionist movement. Judaism had played little part in his life, but he insisted that, ‘as a snail can shed his shell and still be a snail, so a Jew can shed his faith and still be a Jew’.

Although Einstein was regarded warily in Berlin, such was the demand for him in other European cities that he travelled widely to lecture on relativity, usually arriving at each place by third-class rail carriage, with a violin tucked under his arm. So successful were his lectures that one enthusiastic impresario guaranteed him a three-week booking at the London Palladium. He ignored the offer but, at the request of the Zionist leader Chaim Weizmann, toured the United States in the spring of 1921 to raise money for the Palestine Foundation Fund. Frequently treated like a circus freak and feted from morning to night, Einstein nevertheless was gratified by the standards of scientific research and the ‘idealistic attitudes’ that he found prevailing in the United States.

During the next three years Einstein was constantly on the move, journeying not only to European capitals but also to Asia, to the Middle East, and to South America. According to his diary notes, he found nobility among the Hindus of Ceylon (now Sri Lanka), a pureness of soul among the Japanese, and a magnificent intellectual and moral calibre among the Jewish settlers in Palestine. His wife later wrote that, on steaming into one new harbour, Einstein had said to her, ‘Let us take it all in before we wake up’. In Shanghai a cable reached him announcing that he had been awarded the 1921 Nobel Prize for Physics ‘for your photoelectric law and your work in the field of theoretical physics’. Relativity, still the centre of controversy, was not mentioned.

Though the 1920s were tumultuous times of wide acclaim and some notoriety, Einstein did not waver from his new search: to find the mathematical relationship between electromagnetism and gravitation. This would be a first step, he felt, in discovering the common laws governing the behaviour of everything in the universe, from the electron to the planets. He sought to relate the universal properties of matter and energy in a single equation or formula, in what came to be called a unified field theory. This turned out to be a fruitless quest that occupied the rest of his life. Einstein’s peers generally agreed quite early that his search was destined to fail because the rapidly developing quantum theory uncovered an uncertainty principle in all measurements of the motion of particles: the movement of a single particle simply could not be predicted because of a fundamental uncertainty in measuring simultaneously both its speed and its position, which means, in effect, that the future of any physical system at the subatomic level cannot be predicted. While fully recognising the brilliance of quantum mechanics, Einstein rejected the idea that these theories were absolute and persevered with his theory of general relativity as the more satisfactory foundation to future discovery. He was widely quoted on his belief in an exactly engineered universe: ‘God is subtle but he is not malicious’. On this point, he parted company with most theoretical physicists. The distinguished German quantum theorist Max Born, a close friend of Einstein, said at the time: ‘Many of us regard this as a tragedy, both for him, as he gropes his way in loneliness, and for us, who miss our leader and standard-bearer’. This

appraisal, and others pronouncing his work in later life as largely wasted effort, will have to await the judgement of later generations. The year of Einstein’s 50th birthday, 1929, marked the beginning of the ebb flow of his life’s work in a number of aspects. Early in the year the Prussian Academy published the first version of his unified field theory, but, despite the sensation it caused, its very preliminary nature soon became apparent. The reception of the theory left him undaunted, but Einstein was dismayed by the preludes to certain disaster in the field of human affairs: Arabs launched savage attacks on Jewish colonists in Palestine; the Nazis gained strength in Germany; the League of Nations proved so impotent that Einstein resigned abruptly from its Committee on Intellectual Cooperation as a protest to its timidity; and the stock market crash in New York City heralded worldwide economic crisis. Crushing Einstein’s natural gaiety more than any of these events was the mental breakdown of his younger son, Edward. Edward had worshipped his father from a distance but now blamed him for deserting him and for ruining his life. Einstein’s sorrow was eased only slightly by the amicable relationship he enjoyed with his older son, Hans Albert. As visiting professor at the University of Oxford in 1931, Einstein spent as much time espousing pacifism as he did discussing science. He went so far as to authorise the establishment of the Einstein War Resisters’ International Fund in order to bring massive public pressure to bear on the World Disarmament Conference, scheduled to meet in Geneva in February 1932. When these talks foundered, Einstein felt that his years of supporting world peace and human understanding had accomplished nothing. Bitterly disappointed, he visited Geneva to focus world attention on the ‘farce’ of the disarmament conference. In a rare moment of fury, Einstein stated to a journalist, They [the politicians and statesmen] have cheated us. 
They have fooled us. Hundreds of millions of people in Europe and in America, billions of men and women yet to be born, have been and are being cheated, traded and tricked out of their lives and health and well-being. Shortly after this, in a famous exchange of letters with the Austrian psychiatrist Sigmund Freud, Einstein suggested that people must have an innate lust for hatred and destruction. Freud agreed, adding that war was biologically sound because of the love-hate instincts of man and that pacifism was an idiosyncrasy directly related to Einstein’s high degree of cultural development. This exchange was only one of Einstein’s many philosophic dialogues with renowned men of his age. With Rabindranath Tagore, Hindu poet and mystic, he discussed the nature of truth. While Tagore held that truth was realised through man, Einstein maintained that scientific truth must be conceived as a valid truth that is independent of humanity. ‘I cannot prove that I am right in this, but that is my religion’, said Einstein. Firmly denying atheism, Einstein expressed a belief in ‘Spinoza’s God who reveals himself in the harmony of what exists’. The physicist’s breadth of spirit and depth of enthusiasm were always most evident among truly intellectual men. He loved being with the physicists Paul Ehrenfest and Hendrik A. Lorentz at The Netherlands’ Leiden University, and several times he visited the California Institute of Technology in Pasadena to attend seminars at


the Mt. Wilson Observatory, which had become world renowned as a centre for astrophysical research. At Mt. Wilson he heard the Belgian scientist Abbé Georges Lemaître detail his theory that the universe had been created by the explosion of a ‘primeval atom’ and was still expanding. Gleefully, Einstein jumped to his feet, applauding. ‘This is the most beautiful and satisfactory explanation of creation to which I have ever listened’, he said. In 1933, soon after Adolf Hitler became chancellor of Germany, Einstein renounced his German citizenship and left the country. He later accepted a full-time position as a foundation member of the school of mathematics at the new Institute for Advanced Study in Princeton, New Jersey. In reprisal, Nazi storm troopers ransacked his beloved summer house at Caputh, near Berlin, and confiscated his sailboat. Einstein was so convinced that Nazi Germany was preparing for war that, to the horror of Romain Rolland and his other pacifist friends, he violated his pacifist ideals and urged free Europe to arm and recruit for defence. Although his warnings about war were largely ignored, there were fears for Einstein’s life. He was taken by private yacht from Belgium to England. By the time he arrived in Princeton in October 1933, he had noticeably aged. A friend wrote, ‘It was as if something had deadened in him. He sat in a chair at our place, twisting his white hair in his fingers and talking dreamily about everything under the sun. He was not laughing any more.’

Later years in the United States

In Princeton Einstein set a pattern that was to vary little for more than 20 years. He lived with his wife in a simple, two-storey frame house and most mornings walked a mile or so to the Institute, where he worked on his unified field theory and talked with colleagues. For relaxation he played his violin and sailed on a local lake. Only rarely did he travel, even to New York.
In a letter to Queen Elizabeth of Belgium, he described his new refuge as a ‘wonderful little spot, ... a quaint and ceremonious village of puny demigods on stilts’. Eventually he acquired American citizenship, but he always continued to think of himself as a European. Pursuing his own line of theoretical research outside the mainstream of physics, he took on an air of fixed serenity. ‘Among my European friends, I am now called Der grosse Schweiger (The Great Stone Face), a title I well deserve’, he said. Even his wife’s death late in 1936 did not disturb his outward calm. ‘It seemed that the difference between life and death for Einstein consisted only in the difference between being able and not being able to do physics’, wrote Leopold Infeld, the Polish physicist who arrived in Princeton at this time. Niels Bohr, the great Danish atomic physicist, brought news to Einstein in 1939 that the German refugee physicist Lise Meitner had split the uranium atom, with a slight loss of total mass that had been converted into energy. Meitner’s experiments, performed in Copenhagen, had been inspired by similar, though less precise, experiments done months earlier in Berlin by two German chemists, Otto Hahn and Fritz Strassmann. Bohr speculated that, if a controlled chain-reaction splitting of uranium atoms could be accomplished, a mammoth explosion would result. Einstein was skeptical, but laboratory experiments in the United States showed the feasibility of the idea. With a European war regarded as imminent and


fears that Nazi scientists might build such a ‘bomb’ first, Einstein was persuaded by colleagues to write a letter to President Franklin D. Roosevelt urging ‘watchfulness and, if necessary, quick action’ on the part of the United States in atomic-bomb research. This recommendation marked the beginning of the Manhattan Project. Although he took no part in the work at Los Alamos, New Mexico, and did not learn that a nuclear-fission bomb had been made until Hiroshima was razed in 1945, Einstein’s name was emphatically associated with the advent of the atomic age. He readily joined those scientists seeking ways to prevent any future use of the bomb, his particular and urgent plea being the establishment of a world government under a constitution drafted by the United States, Britain, and Russia. With the spur of the atomic fear that haunted the world, he said ‘we must not be merely willing, but actively eager to submit ourselves to the binding authority necessary for world security’. Once more, Einstein’s name surged through the newspapers. Letters and statements tumbled out of his Princeton study, and in the public eye Einstein the physicist dissolved into Einstein the world citizen, a kind ‘grand old man’ devoting his last years to bringing harmony to the world. The rejection of his ideals by statesmen and politicians did not break him, because his prime obsession still remained with physics. ‘I cannot tear myself away from my work’, he wrote at the time. ‘It has me inexorably in its clutches’. In proof of this came his new version of the unified field in 1950, a most meticulous mathematical essay that was immediately but politely criticised by most physicists as untenable. Compared with his renown of a generation earlier, Einstein was virtually neglected and said himself that he felt almost like a stranger in the world. His health deteriorated to the extent that he could no longer play the violin or sail his boat. 
Many years earlier, chronic abdominal pains had forced him to give up smoking his pipe and to watch his diet carefully. Einstein died in his sleep at Princeton Hospital. On his desk lay his last incomplete statement, written to honour Israeli Independence Day. It read in part: ‘What I seek to accomplish is simply to serve with my feeble capacity truth and justice at the risk of pleasing no one’. His contribution to man’s understanding of the universe was matchless, and he is established for all time as a giant of science. Broadly speaking, his crusades in human affairs seem to have had no lasting impact. Einstein perhaps anticipated such an assessment of his life when he said, ‘Politics are for the moment. An equation is for eternity’.

www.britannica.com, April 2001

E Simultaneity

Quotation from Einstein’s 1905 paper on special relativity. The theory to be developed is based — like all electrodynamics — on the kinematics of the rigid body, since the assertions of any such theory have to do with the relationship between rigid bodies (systems of coordinates), clocks, and electromagnetic processes. Insufficient consideration of this circumstance lies at the root of the difficulties which the electrodynamics of moving bodies at present encounters. § 1. Definition of simultaneity Let us take a system of coordinates in which the equations of Newtonian mechanics hold good (footnote: i.e. to the first approximation). In order to render our presentation more precise and to distinguish this system of coordinates verbally from others which will be introduced hereafter, we call it the ‘stationary system’. If a material point is at rest relatively to this system of coordinates, its position can be defined relatively thereto by the employment of rigid standards of measurement and the methods of Euclidean geometry, and can be expressed in Cartesian coordinates. If we wish to describe the motion of a material point, we give the values of its coordinates as functions of the time. Now we must bear carefully in mind that a mathematical description of this kind has no physical meaning unless we are quite clear as to what we understand by ‘time’. We have to take into account that all our judgements in which time plays a part are always judgements of simultaneous events. If, for instance, I say ‘That train arrives here at 7 o’clock’, I mean something like this: ‘The pointing of the small hand of my watch to 7 and the arrival of the train are simultaneous events’. 
(footnote: We shall not here discuss the inexactitude which lurks in the concept of simultaneity of two events at approximately the same place, which can only be removed by an abstraction.) It might appear possible to overcome all the difficulties attending the definition of ‘time’ by substituting ‘the position of the small hand of my watch’ for ‘time’. And in fact such a definition is satisfactory when we are concerned with defining a time exclusively for the place where the watch is located; but it is no longer satisfactory when we have to connect in time series events occurring at different places, or, what comes to the same thing, to evaluate the times of events occurring at places remote from the watch. We might, of course, content ourselves with time values determined by an observer stationed together with the watch at the origin of the coordinates, and coordinating the corresponding positions of the hands with light signals, given out by every event to be timed, and reaching him through empty space. But this coordination has the disadvantage that it is not independent of the standpoint


of the observer with the watch or clock, as we know from experience. We arrive at a much more practical determination along the following line of thought. If at the point A of space there is a clock, an observer at A can determine the time values of events in the immediate proximity of A by finding the positions of the hands which are simultaneous with these events. If there is at the point B of space another clock in all respects resembling the one at A, it is possible for an observer at B to determine the time of events in the immediate neighbourhood of B. But it is not possible without further assumption to compare, in respect of time, an event at A with an event at B. We have so far defined only an ‘A time’ and a ‘B time’. We have not defined a common ‘time’ for A and B, for the latter cannot be defined at all unless we establish by definition that the ‘time’ required by light to travel from A to B equals the ‘time’ it requires to travel from B to A. Let a ray of light start at the ‘A time’ $t_A$ from A towards B, let it at the ‘B time’ $t_B$ be reflected at B in the direction of A, and arrive again at A at the ‘A time’ $t'_A$. In accordance with definition the two clocks synchronise if

\[ t_B - t_A = t'_A - t_B . \]

We assume that this definition of synchronism is free from contradictions, and possible for any number of points; and that the following relations are universally valid:

1. If the clock at B synchronises with the clock at A, the clock at A synchronises with the clock at B.
2. If the clock at A synchronises with the clock at B and also with the clock at C, the clocks at B and C also synchronise with each other.

Thus with the help of certain imaginary physical experiments we have settled what is to be understood by synchronous clocks located at different places, and have evidently obtained a definition of ‘simultaneous’, or ‘synchronous’, and of ‘time’. The ‘time’ of an event is that which is given simultaneously with the event by a stationary clock located at the place of the event, this clock being synchronous, and indeed synchronous for all time determinations, with a specified stationary clock. In agreement with experience we further assume the quantity

\[ \frac{2\,\mathrm{AB}}{t'_A - t_A} = c \]

to be a universal constant, the velocity of light in empty space. It is essential to have time defined by means of stationary clocks in the stationary system, and the time now defined being appropriate to the stationary system we call it ‘the stationary time of the stationary system’.

§ 2. On the relativity of lengths and times


The following reflections are based on the principle of relativity and on the principle of the constancy of the velocity of light. These two principles we define as follows:

1. The laws by which the states of physical systems undergo change are not affected, whether these changes of state be referred to the one or the other of two systems of coordinates in uniform translatory motion.
2. Any ray of light moves in the ‘stationary’ system of coordinates with the determined velocity c, whether the ray be emitted by a stationary or by a moving body. Hence

\[ \text{velocity} = \frac{\text{light path}}{\text{time interval}} \]

where time interval is to be taken in the sense of the definition in § 1.

Let there be given a stationary rigid rod; and let its length be l as measured by a measuring-rod which is also stationary. We now imagine the axis of the rod lying along the axis of x of the stationary system of coordinates, and that uniform motion of parallel translation with velocity v along the axis of x in the direction of increasing x is then imparted to the rod. We now inquire as to the length of the moving rod, and imagine its length to be ascertained by the following two operations:

(a) The observer moves together with the given measuring-rod and the rod to be measured, and measures the length of the rod directly by superimposing the measuring-rod, in just the same way as if all three were at rest.
(b) By means of stationary clocks set up in the stationary system and synchronising in accordance with § 1, the observer ascertains at what points of the stationary system the two ends of the rod to be measured are located at a definite time. The distance between these two points, measured by the measuring-rod already employed, which in this case is at rest, is also a length which may be designated ‘the length of the rod’.

In accordance with the principle of relativity the length to be discovered by (a), which we will call ‘the length of the rod in the moving system’, must be equal to the length l of the stationary rod. The length to be discovered by the operation (b) we will call ‘the length of the (moving) rod in the stationary system’. This we shall determine on the basis of our two principles, and we shall find that it differs from l. Current kinematics tacitly assumes that the lengths determined by these two operations are precisely equal, or in other words, that a moving rigid body at the epoch t may in geometrical respects be perfectly represented by the same body at rest in a definite position.
We imagine further that at the two ends A and B of the rod, clocks are placed which synchronise with the clocks of the stationary system, that is to say that


their indications correspond at any instant to the ‘time of the stationary system’ at the places where they happen to be. These clocks are therefore ‘synchronous in the stationary system’. We imagine further that with each clock there is a moving observer, and that these observers apply to both clocks the criterion established in § 1 for the synchronisation of two clocks. Let a ray of light depart from A at the time $t_A$ (footnote: ‘Time’ here denotes ‘time of the stationary system’ and also ‘position of hands of the moving clock situated at the place under discussion’.), let it be reflected at B at time $t_B$, and reach A again at the time $t'_A$. Taking into consideration the principle of the constancy of the velocity of light we find that

\[ t_B - t_A = \frac{r_{AB}}{c - v} \qquad \text{and} \qquad t'_A - t_B = \frac{r_{AB}}{c + v} \]

where $r_{AB}$ denotes the length of the moving rod, measured in the stationary system. Observers moving with the moving rod would thus find that the two clocks were not synchronous, while observers in the stationary system would declare the clocks to be synchronous. So we see that we cannot attach any absolute signification to the concept of simultaneity, but that two events which, viewed from a system of coordinates, are simultaneous, can no longer be looked upon as simultaneous events when envisaged from a system which is in motion relatively to that system.
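The two light-travel times for the moving rod can be checked numerically. The following is only an illustrative sketch and is not part of Einstein's text: the rod speed `v` and the rod length `r_AB` are arbitrary assumed values, and the variable names are hypothetical.

```python
# Illustrative check of the synchronisation criterion of section 1 for a moving rod.
# The numerical values of v and r_AB are assumptions chosen only for illustration.
c = 299_792_458.0      # speed of light, m/s
v = 0.6 * c            # assumed speed of the rod along the x axis
r_AB = 300.0           # assumed length of the moving rod in the stationary system, m

# Light leaves A at t_A, is reflected at B at t_B and is back at A at t_A_prime,
# all times being those of the stationary system.
t_A = 0.0
t_B = t_A + r_AB / (c - v)        # outward trip: the light chases the receding end B
t_A_prime = t_B + r_AB / (c + v)  # return trip: the end A moves towards the light

outward = t_B - t_A
back = t_A_prime - t_B

# The criterion t_B - t_A = t'_A - t_B holds only if the two durations agree.
print(f"t_B - t_A  = {outward:.4e} s")
print(f"t'_A - t_B = {back:.4e} s")
print(f"difference = {outward - back:.4e} s")
```

For any non-zero `v` the outward and return durations differ by $2 r_{AB} v / (c^2 - v^2)$, so observers who apply the criterion of § 1 while moving with the rod do not find the two clocks synchronous.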

F Electromagnetism using SI units

These are the microscopic equations i.e. in a vacuum (with no media), using SI units.

Speed of light:

Two constants occur in electromagnetism: $\mu_o$ and $\epsilon_o$. The constant $\mu_o$ is the permeability of free space ($\mu_o = 4\pi \times 10^{-7}$ H m$^{-1}$, exact). H stands for henry, the SI unit of inductance. In terms of other SI units, H = Wb A$^{-1}$ = V s A$^{-1}$, where Wb is the weber (unit of magnetic flux), A is the ampere (unit of electrical current) and V is the volt (unit of electric potential). The constant $\epsilon_o$ is the permittivity of free space ($\epsilon_o = 8.854187817\ldots \times 10^{-12}$ F m$^{-1}$, defined). F stands for farad, the SI unit of capacitance. In terms of other SI units, F = C V$^{-1}$ = A s V$^{-1}$, where C is the coulomb (unit of electric charge). These constants satisfy

\[ \mu_o \epsilon_o = \frac{1}{c^2} \]

where $c$ has the dimensions of velocity and is the speed of light ($c = 2.99792458 \times 10^{8}$ m s$^{-1}$, exact).
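As a quick numerical check of this relation, the sketch below recomputes $c$ from the two constants quoted above. The value of $\epsilon_o$ is the truncated figure given in the text, so agreement is expected only to about ten significant figures; the variable names are illustrative.

```python
import math

mu_0 = 4 * math.pi * 1e-7      # permeability of free space, H/m (value quoted above)
epsilon_0 = 8.854187817e-12    # permittivity of free space, F/m (truncated value quoted above)
c = 2.99792458e8               # speed of light, m/s

# mu_0 * epsilon_0 = 1 / c^2, so c should be recovered from the two constants.
c_from_constants = 1.0 / math.sqrt(mu_0 * epsilon_0)
relative_error = abs(c_from_constants - c) / c

print(f"c from constants = {c_from_constants:.6f} m/s")
print(f"relative error   = {relative_error:.2e}")
```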

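Maxwell equations:

\begin{align*}
\nabla \cdot \mathbf{E} &= \frac{1}{\epsilon_o}\,\rho && \text{(Coulomb law)} \\
\nabla \cdot \mathbf{B} &= 0 \\
\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} &= 0 && \text{(Faraday law)} \\
\nabla \times \mathbf{B} - \mu_o \epsilon_o \frac{\partial \mathbf{E}}{\partial t} &= \mu_o\,\mathbf{j} && \text{(Ampère law)}
\end{align*}

where $\rho$ is the charge density, $\mathbf{j}$ the current density, $\mathbf{E}$ the electric field and $\mathbf{B}$ the magnetic induction.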

Continuity equation:

\[ \frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{j} = 0 \]
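The continuity equation is not independent of the Maxwell equations. A short derivation, given here as a supplementary sketch rather than as part of the original summary, takes the divergence of the Ampère law and uses the Coulomb law together with the identity $\nabla \cdot (\nabla \times \mathbf{B}) = 0$:

```latex
\begin{align*}
\nabla \cdot \left( \nabla \times \mathbf{B} \right)
  - \mu_o \epsilon_o \frac{\partial}{\partial t} \left( \nabla \cdot \mathbf{E} \right)
  &= \mu_o \, \nabla \cdot \mathbf{j} \\
0 - \mu_o \epsilon_o \frac{\partial}{\partial t} \left( \frac{\rho}{\epsilon_o} \right)
  &= \mu_o \, \nabla \cdot \mathbf{j} \\
\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{j} &= 0
\end{align*}
```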


Lorentz force:

\[ \mathbf{F} = q\,[\mathbf{E} + \mathbf{v} \times \mathbf{B}] \]

for a point charge $q$ moving with velocity $\mathbf{v}$.

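Electromagnetic potentials:

\[ \mathbf{B} = \nabla \times \mathbf{A} \qquad \mathbf{E} = -\nabla \Phi - \frac{\partial \mathbf{A}}{\partial t} \]

Gauge transformation:

\begin{align*}
\Phi &\longrightarrow \Phi - \frac{\partial \chi}{\partial t} \\
\mathbf{A} &\longrightarrow \mathbf{A} + \nabla \chi
\end{align*}

where $\chi$ is arbitrary.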

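Lorentz gauge:

The potentials satisfy the Lorentz gauge condition

\[ \frac{1}{c^2} \frac{\partial \Phi}{\partial t} + \nabla \cdot \mathbf{A} = 0 \]

and inhomogeneous wave-equations

\[ \left( \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 \right) \Phi = \frac{1}{\epsilon_o}\,\rho \qquad \left( \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 \right) \mathbf{A} = \mu_o\,\mathbf{j} \]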

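4-current:

\[ j^\mu = \rho_o v^\mu = (c\rho,\ \mathbf{j}) \]

where $\rho = \rho_o \gamma_v$ is the charge density and $\mathbf{j} = \rho \mathbf{v}$ is the current density. $\rho_o$ is the proper charge density, i.e. as measured in the rest frame.

4-potential:

\[ A^\mu = \left( \frac{1}{c}\Phi,\ \mathbf{A} \right) \]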


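Differential operators:

\begin{align*}
\partial_\mu &= \frac{\partial}{\partial x^\mu} = \left( \frac{1}{c} \frac{\partial}{\partial t},\ \nabla \right) \\
\partial^\mu &= \left( -\frac{1}{c} \frac{\partial}{\partial t},\ \nabla \right) \\
\partial^\alpha \partial_\alpha &= -\frac{1}{c^2} \frac{\partial^2}{\partial t^2} + \nabla^2
\end{align*}

\begin{align*}
\text{continuity equation} &\qquad \partial_\mu j^\mu = 0 \\
\text{Lorentz gauge condition} &\qquad \partial_\mu A^\mu = 0 \\
\text{wave-equations} &\qquad \partial^\alpha \partial_\alpha A^\mu = -\mu_o\, j^\mu
\end{align*}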

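Electromagnetic field tensor:

\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = -F_{\nu\mu} \]

\[
F_{\mu\nu} =
\begin{pmatrix}
0 & -\tfrac{1}{c}E_x & -\tfrac{1}{c}E_y & -\tfrac{1}{c}E_z \\
\tfrac{1}{c}E_x & 0 & B_z & -B_y \\
\tfrac{1}{c}E_y & -B_z & 0 & B_x \\
\tfrac{1}{c}E_z & B_y & -B_x & 0
\end{pmatrix}
\qquad
F^{\mu\nu} =
\begin{pmatrix}
0 & \tfrac{1}{c}E_x & \tfrac{1}{c}E_y & \tfrac{1}{c}E_z \\
-\tfrac{1}{c}E_x & 0 & B_z & -B_y \\
-\tfrac{1}{c}E_y & -B_z & 0 & B_x \\
-\tfrac{1}{c}E_z & B_y & -B_x & 0
\end{pmatrix}
\]

(the entries below the diagonal follow from antisymmetry).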

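Maxwell equations in 4-tensor form:

\[ \partial_\mu F_{\nu\zeta} + \partial_\nu F_{\zeta\mu} + \partial_\zeta F_{\mu\nu} = 0 \qquad \partial_\nu F^{\mu\nu} = \mu_o\, j^\mu \]

Lorentz force:

The 4-force $F^\mu$ which acts on a charged particle (charge $q$, 4-velocity $v^\mu$) moving in an electromagnetic field ($F_{\mu\nu}$) is

\[ F^\mu = q\, F^{\mu\nu} v_\nu \]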
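The 4-force can be compared numerically with the 3-vector Lorentz force. The sketch below is illustrative only: the field values, velocity and charge are arbitrary assumptions, and the names `field_tensor_upper`, `v_upper` and `v_lower` are hypothetical. It assumes the metric $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$ implied by the differential operators above and the 4-velocity $v^\mu = \gamma(c, \mathbf{v})$.

```python
import numpy as np

c = 299_792_458.0

# Assumed sample values, chosen only for illustration.
E = np.array([1.0e3, -2.0e3, 0.5e3])   # electric field, V/m
B = np.array([0.2, 0.1, -0.3])         # magnetic induction, T
v = np.array([0.3, -0.2, 0.4]) * c     # particle velocity, m/s
q = 1.602e-19                          # charge, C


def field_tensor_upper(E, B):
    """Contravariant field tensor F^{mu nu} in the layout tabulated above."""
    Ex, Ey, Ez = E / c
    Bx, By, Bz = B
    return np.array([
        [0.0,  Ex,  Ey,  Ez],
        [-Ex, 0.0,  Bz, -By],
        [-Ey, -Bz, 0.0,  Bx],
        [-Ez,  By, -Bx, 0.0],
    ])


eta = np.diag([-1.0, 1.0, 1.0, 1.0])       # metric, signature (-,+,+,+)
gamma = 1.0 / np.sqrt(1.0 - v @ v / c**2)
v_upper = gamma * np.array([c, *v])        # 4-velocity v^mu
v_lower = eta @ v_upper                    # v_mu = eta_{mu nu} v^nu

four_force = q * field_tensor_upper(E, B) @ v_lower   # F^mu = q F^{mu nu} v_nu
three_force = q * (E + np.cross(v, B))                # q [E + v x B]

print("spatial part of F^mu :", four_force[1:])
print("gamma * q[E + v x B] :", gamma * three_force)
```

Under these assumptions the spatial part of $F^\mu$ equals $\gamma q[\mathbf{E} + \mathbf{v} \times \mathbf{B}]$ and the time component equals $\gamma q\,\mathbf{E} \cdot \mathbf{v}/c$.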


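Dual electromagnetic tensor density:

\[ G^{\mu\nu} = \tfrac{1}{2}\, e^{\mu\nu\alpha\beta} F_{\alpha\beta} = -G^{\nu\mu} \]

\[
G^{\mu\nu} =
\begin{pmatrix}
0 & B_x & B_y & B_z \\
-B_x & 0 & -\tfrac{1}{c}E_z & \tfrac{1}{c}E_y \\
-B_y & \tfrac{1}{c}E_z & 0 & -\tfrac{1}{c}E_x \\
-B_z & -\tfrac{1}{c}E_y & \tfrac{1}{c}E_x & 0
\end{pmatrix}
\qquad \text{and} \qquad
G_{\mu\nu} =
\begin{pmatrix}
0 & -B_x & -B_y & -B_z \\
B_x & 0 & -\tfrac{1}{c}E_z & \tfrac{1}{c}E_y \\
B_y & \tfrac{1}{c}E_z & 0 & -\tfrac{1}{c}E_x \\
B_z & -\tfrac{1}{c}E_y & \tfrac{1}{c}E_x & 0
\end{pmatrix}
\]

(the entries below the diagonal follow from antisymmetry).

\[ \partial_\mu F_{\nu\zeta} + \partial_\nu F_{\zeta\mu} + \partial_\zeta F_{\mu\nu} = 0 \qquad \equiv \qquad \partial_\nu G^{\mu\nu} = 0 \]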

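Invariants:

\[ F^{\mu\nu} F_{\mu\nu} = -G^{\mu\nu} G_{\mu\nu} = 2 \left( B^2 - \frac{1}{c^2} E^2 \right) \quad \longrightarrow \quad \left( B^2 - \frac{1}{c^2} E^2 \right) \text{ is a scalar} \]

\[ F^{\mu\nu} G_{\mu\nu} = -\frac{4}{c}\, \mathbf{E} \cdot \mathbf{B} \quad \longrightarrow \quad \mathbf{E} \cdot \mathbf{B} \text{ is a scalar density} \]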
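These contractions can be verified numerically. The following sketch builds the dual tensor from the permutation symbol and evaluates the invariants for arbitrary assumed field values. It assumes the sign convention $e^{0123} = +1$ and the metric $\mathrm{diag}(-1, 1, 1, 1)$; the variable names are illustrative.

```python
import itertools

import numpy as np

c = 299_792_458.0
E = np.array([1.0e8, -2.0e8, 0.5e8])   # assumed sample electric field, V/m
B = np.array([0.2, 0.1, -0.3])         # assumed sample magnetic induction, T

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

Ex, Ey, Ez = E / c
Bx, By, Bz = B
F_upper = np.array([
    [0.0,  Ex,  Ey,  Ez],
    [-Ex, 0.0,  Bz, -By],
    [-Ey, -Bz, 0.0,  Bx],
    [-Ez,  By, -Bx, 0.0],
])
F_lower = eta @ F_upper @ eta          # F_{mu nu} = eta_{mu a} eta_{nu b} F^{ab}

# Permutation symbol e^{mu nu alpha beta}, assuming e^{0123} = +1.
eps = np.zeros((4, 4, 4, 4))
for perm in itertools.permutations(range(4)):
    inversions = sum(perm[i] > perm[j] for i in range(4) for j in range(i + 1, 4))
    eps[perm] = -1.0 if inversions % 2 else 1.0

G_upper = 0.5 * np.einsum("mnab,ab->mn", eps, F_lower)   # G^{mu nu}
G_lower = eta @ G_upper @ eta                            # G_{mu nu}

inv_FF = np.sum(F_upper * F_lower)     # F^{mu nu} F_{mu nu}
inv_GG = np.sum(G_upper * G_lower)     # G^{mu nu} G_{mu nu}
inv_FG = np.sum(F_upper * G_lower)     # F^{mu nu} G_{mu nu}

print(inv_FF, 2 * (B @ B - E @ E / c**2))
print(inv_FG, -4 * (E @ B) / c)
```

With this convention the first row of `G_upper` reproduces $(0, B_x, B_y, B_z)$, in agreement with the matrix given for $G^{\mu\nu}$.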

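Transformation of fields:

For a boost $u\,\mathbf{i}$ (along the $x$-axis):

\begin{align*}
\bar{E}_x &= E_x & \bar{B}_x &= B_x \\
\bar{E}_y &= \gamma (E_y - u B_z) & \bar{B}_y &= \gamma \left( B_y + \frac{u}{c^2} E_z \right) \\
\bar{E}_z &= \gamma (E_z + u B_y) & \bar{B}_z &= \gamma \left( B_z - \frac{u}{c^2} E_y \right)
\end{align*}

For a boost $\mathbf{u}$:

\begin{align*}
\bar{\mathbf{E}} &= \gamma \mathbf{E} - \frac{\gamma - 1}{u^2} (\mathbf{u} \cdot \mathbf{E})\, \mathbf{u} + \gamma\, (\mathbf{u} \times \mathbf{B}) \\
\bar{\mathbf{B}} &= \gamma \mathbf{B} - \frac{\gamma - 1}{u^2} (\mathbf{u} \cdot \mathbf{B})\, \mathbf{u} - \frac{\gamma}{c^2} (\mathbf{u} \times \mathbf{E})
\end{align*}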
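As a consistency check, the general vector formulae should reduce to the component formulae when the boost is along the $x$-axis, and both invariants should be unchanged. The sketch below tests this for arbitrary assumed field values; the function name `boost_fields` and the sample numbers are hypothetical.

```python
import numpy as np

c = 299_792_458.0


def boost_fields(E, B, u):
    """Transformed fields for a boost with velocity u (general vector formulae)."""
    u2 = u @ u
    gamma = 1.0 / np.sqrt(1.0 - u2 / c**2)
    E_bar = gamma * E - (gamma - 1.0) / u2 * (u @ E) * u + gamma * np.cross(u, B)
    B_bar = gamma * B - (gamma - 1.0) / u2 * (u @ B) * u - gamma / c**2 * np.cross(u, E)
    return E_bar, B_bar


# Assumed sample values, chosen only for illustration.
E = np.array([1.0e8, -2.0e8, 0.5e8])   # V/m
B = np.array([0.2, 0.1, -0.3])         # T
speed = 0.5 * c
u = np.array([speed, 0.0, 0.0])        # boost along the x axis
gamma = 1.0 / np.sqrt(1.0 - speed**2 / c**2)

E_bar, B_bar = boost_fields(E, B, u)

# Component formulae for a boost along x.
E_x_boost = np.array([E[0],
                      gamma * (E[1] - speed * B[2]),
                      gamma * (E[2] + speed * B[1])])
B_x_boost = np.array([B[0],
                      gamma * (B[1] + speed * E[2] / c**2),
                      gamma * (B[2] - speed * E[1] / c**2)])

print(E_bar, E_x_boost)
print(B_bar, B_x_boost)
```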


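Electromagnetic energy tensor:

\[ T^{\mu\nu} = \frac{1}{\mu_o} \left( \eta_{\alpha\beta} F^{\mu\alpha} F^{\nu\beta} - \tfrac{1}{4}\, \eta^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta} \right) \]

\[
T^{\mu\nu} = T^{\nu\mu} \qquad T^{\mu}{}_{\mu} = 0 \qquad
(T^{\mu\nu}) =
\begin{pmatrix}
U & \tfrac{1}{c} \mathbf{N}^{T} \\
\tfrac{1}{c} \mathbf{N} & P_{ij}
\end{pmatrix}
\]

Energy density:

\[ U = \frac{1}{2\mu_o} \left( \frac{1}{c^2} E^2 + B^2 \right) \]

Poynting vector:

\[ \mathbf{N} = \frac{1}{\mu_o} (\mathbf{E} \times \mathbf{B}) \]

Maxwell stress tensor:

\[ P_{ij} = \frac{1}{\mu_o} \left[ \tfrac{1}{2}\, \delta_{ij} \left( \frac{1}{c^2} E^2 + B^2 \right) - \left( \frac{1}{c^2} E_i E_j + B_i B_j \right) \right] \qquad \text{where } i, j = 1, 2, 3 \]

\[ \partial_\nu T^{\mu\nu} = -F^{\mu\nu} j_\nu \]

Poynting’s theorem:

\[ \mu = 0 \quad \longrightarrow \quad \frac{\partial U}{\partial t} + \nabla \cdot \mathbf{N} = -\mathbf{j} \cdot \mathbf{E} \]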
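The block structure of $T^{\mu\nu}$ can be confirmed numerically from the field tensor. The sketch below is illustrative only: the field values are arbitrary assumptions, the metric is taken as $\mathrm{diag}(-1, 1, 1, 1)$, and the variable names are hypothetical.

```python
import numpy as np

c = 299_792_458.0
mu_0 = 4e-7 * np.pi

# Assumed sample values, chosen only for illustration.
E = np.array([1.0e8, -2.0e8, 0.5e8])   # V/m
B = np.array([0.2, 0.1, -0.3])         # T

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # eta_{mu nu}; numerically equal to eta^{mu nu}

Ex, Ey, Ez = E / c
Bx, By, Bz = B
F_upper = np.array([
    [0.0,  Ex,  Ey,  Ez],
    [-Ex, 0.0,  Bz, -By],
    [-Ey, -Bz, 0.0,  Bx],
    [-Ez,  By, -Bx, 0.0],
])
F_lower = eta @ F_upper @ eta

# T^{mu nu} = (1/mu_0) [ eta_{ab} F^{mu a} F^{nu b} - (1/4) eta^{mu nu} F_{ab} F^{ab} ]
T = (F_upper @ eta @ F_upper.T - 0.25 * eta * np.sum(F_lower * F_upper)) / mu_0

U = (E @ E / c**2 + B @ B) / (2 * mu_0)              # energy density
N = np.cross(E, B) / mu_0                            # Poynting vector
P = (0.5 * np.eye(3) * (E @ E / c**2 + B @ B)
     - np.outer(E, E) / c**2 - np.outer(B, B)) / mu_0   # Maxwell stress tensor

trace = np.sum(eta * T)                              # T^mu_mu = eta_{mu nu} T^{mu nu}

print("T^00 and U      :", T[0, 0], U)
print("T^0i and N_i / c:", T[0, 1:], N / c)
print("trace           :", trace)
```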

G Global Positioning System

G.1 Einstein’s Relativity and Everyday Life

by Clifford M. Will

What good is fundamental physics to the person on the street?

This is the perennial question posed to physicists by their non-science friends, by students in the humanities and social sciences, and by politicians looking to justify spending tax dollars on basic science. One of the problems is that it is hard to predict definitely what the payback of basic physics will be, though few dispute that physics is somehow ”good.” Physicists have become adept at finding good examples of the long-term benefit of basic physics: the quantum theory of solids leading to semiconductors and computer chips, nuclear magnetic resonance leading to MRI imaging, particle accelerators leading to beams for cancer treatment. But what about Einstein’s theories of special and general relativity? One could hardly imagine a branch of fundamental physics less likely to have practical consequences. But strangely enough, relativity plays a key role in a multi-billion dollar growth industry centred around the Global Positioning System (GPS). When Einstein finalised his theory of gravity and curved space-time in November 1915, ending a quest which he began with his 1905 special relativity, he had little concern for practical or observable consequences. He was unimpressed when measurements of the bending of starlight in 1919 confirmed his theory. Even today, general relativity plays its main role in the astronomical domain, with its black holes, gravity waves and cosmic big bangs, or in the domain of the ultra-small, where theorists look to unify general relativity with the other interactions, using exotic concepts such as strings and branes. But GPS is an exception. Built at a cost of over $10 billion mainly for military navigation, GPS has rapidly transformed itself into a thriving commercial industry. The system is based on an array of 24 satellites orbiting the earth, each carrying a precise atomic clock. 
Using a hand-held GPS receiver which detects radio emissions from any of the satellites which happen to be overhead, users of even moderately priced devices can determine latitude, longitude and altitude to an accuracy which can currently reach 15 meters, and local time to 50 billionths of a second. Apart from the obvious military uses, GPS is finding applications in airplane navigation, oil exploration, wilderness recreation, bridge construction, sailing, and interstate trucking, to name just a few. Even Hollywood has met GPS, recently pitting James Bond in ”Tomorrow Never Dies” against an evil genius who was inserting deliberate errors into the GPS system and sending British ships into harm’s way. But in a relativistic world, things are not simple. The satellite clocks are moving at 14,000 km/hr in orbits that circle the Earth twice per day, much faster than clocks on the surface of the Earth, and Einstein’s theory of special relativity says that rapidly moving clocks tick more slowly, by about seven microseconds (millionths of a second) per day. Also, the orbiting clocks are 20,000 km above the Earth, and experience gravity that is four times weaker than that on the ground. Einstein’s general relativity theory says that gravity curves space and time, resulting in a tendency for the orbiting clocks to tick slightly faster,


by about 45 microseconds per day. The net result is that time on a GPS satellite clock advances faster than a clock on the ground by about 38 microseconds per day. To determine its location, the GPS receiver uses the time at which each signal from a satellite was emitted, as determined by the on-board atomic clock and encoded into the signal, together with the speed of light, to calculate the distance between itself and the satellites it communicated with. The orbit of each satellite is known accurately. Given enough satellites, it is a simple problem in Euclidean geometry to compute the receiver’s precise location, both in space and time. To achieve a navigation accuracy of 15 meters, time throughout the GPS system must be known to an accuracy of 50 nanoseconds, which simply corresponds to the time required for light to travel 15 meters. But at 38 microseconds per day, the relativistic offset in the rates of the satellite clocks is so large that, if left uncompensated, it would cause navigational errors that accumulate faster than 10 km per day! GPS accounts for relativity by electronically adjusting the rates of the satellite clocks, and by building mathematical corrections into the computer chips which solve for the user’s location. Without the proper application of relativity, GPS would fail in its navigational functions within about 2 minutes. So the next time your plane approaches an airport in bad weather, and you just happen to be wondering “what good is basic physics?”, think about Einstein and the GPS tracker in the cockpit, helping the pilots guide you to a safe landing.
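The figures quoted in the article can be reproduced to within rounding by a rough estimate. The sketch below is not from the article: it assumes the standard weak-field expressions $v^2/2c^2$ for the velocity effect and $(GM/c^2)(1/R - 1/r)$ for the gravitational effect, uses assumed values for the Earth's gravitational parameter and radius, and ignores the motion of the ground clock.

```python
c = 299_792_458.0            # speed of light, m/s
GM = 3.986004418e14          # Earth's gravitational parameter, m^3/s^2 (assumed value)
R_earth = 6.371e6            # mean radius of the Earth, m (assumed value)
seconds_per_day = 86_400.0

v_sat = 14_000e3 / 3600.0    # 14,000 km/hr from the article, converted to m/s
r_sat = R_earth + 20_000e3   # orbit radius for the 20,000 km altitude in the article

# Special relativity: the moving clock runs slow by the fraction v^2 / (2 c^2).
sr_us_per_day = -(v_sat**2) / (2 * c**2) * seconds_per_day * 1e6

# General relativity (weak field): the higher clock runs fast by (GM/c^2)(1/R - 1/r).
gr_us_per_day = GM / c**2 * (1 / R_earth - 1 / r_sat) * seconds_per_day * 1e6

net_us_per_day = sr_us_per_day + gr_us_per_day

# Range error accumulated per day if the net offset were left uncorrected.
range_error_km_per_day = net_us_per_day * 1e-6 * c / 1e3

# Light travel time over 15 m, to compare with the 50 ns timing requirement.
light_time_15m_ns = 15.0 / c * 1e9

print(f"special relativity: {sr_us_per_day:+.1f} microseconds/day")
print(f"general relativity: {gr_us_per_day:+.1f} microseconds/day")
print(f"net offset        : {net_us_per_day:+.1f} microseconds/day")
print(f"range error       : {range_error_km_per_day:.1f} km/day")
print(f"light time for 15 m: {light_time_15m_ns:.1f} ns")
```

Under these assumptions the estimate gives roughly 7 and 45 microseconds per day for the two effects, a net offset of about 38 microseconds per day, and a range error of more than 10 km per day.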

Clifford M. Will is Professor and Chair of Physics at Washington University in St. Louis, and is the author of Was Einstein Right?. In 1986 he chaired a study for the Air Force to find out if they were handling relativity properly in GPS. They were.

Will’s homepage is at: http://wugrav.wustl.edu/People/CLIFF/

The article was taken from: http://www.physicscentral.com/writers/

G.2 General relativity in the global positioning system

by Neil Ashby

The Global Positioning System (GPS) consists of 24 earth-orbiting satellites, each carrying accurate, stable atomic clocks. Four satellites are in each of six different orbital planes, of inclination 55 degrees with respect to earth’s equator. Orbital periods are 12 hours (sidereal), so that the apparent position of a satellite against the background of stars repeats in 12 hours. Clock-driven transmitters send out synchronous time signals, tagged with the position and time of the transmission event, so that a receiver near the earth can determine its position and time by decoding navigation messages from four satellites to find the transmission event coordinates, and then solving four simultaneous one-way signal propagation equations. Conversely, gamma-ray detectors on the satellites could determine the space-time coordinates of a nuclear event by measuring signal arrival times and solving four one-way propagation delay equations.
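The four one-way propagation equations, $|\mathbf{r} - \mathbf{r}_i| = c\,(t - t_i)$ for $i = 1, \ldots, 4$, can be solved for the reception event $(\mathbf{r}, t)$ by Newton iteration. The sketch below is a toy illustration under stated assumptions and not the algorithm of any actual receiver: the satellite positions are invented, all relativistic and atmospheric corrections are ignored, and the function name `solve_position` is hypothetical.

```python
import numpy as np

c = 299_792_458.0  # speed of light, m/s

# Assumed toy geometry (not real ephemeris data): four satellites on a sphere of
# radius 26,560 km and a receiver on the earth's surface, in an inertial frame.
r_orbit = 26.56e6
directions = np.array([
    [1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0],
    [0.8, 0.0, 0.6],
    [0.6, -0.48, -0.64],
])
sat_pos = r_orbit * directions
true_pos = np.array([6.371e6, 0.0, 0.0])
true_time = 0.1                                  # reception time, s

# Transmission times consistent with one-way propagation at speed c.
sat_time = true_time - np.linalg.norm(sat_pos - true_pos, axis=1) / c


def solve_position(sat_pos, sat_time, iterations=15):
    """Newton iteration on |r - r_i| = c (t - t_i), i = 1..4, for (r, t)."""
    x = np.zeros(4)                              # unknowns (x, y, z, c*t); start at the earth's centre
    for _ in range(iterations):
        diff = x[:3] - sat_pos                   # shape (4, 3)
        dist = np.linalg.norm(diff, axis=1)
        residual = dist - (x[3] - c * sat_time)  # zero when all four equations hold
        jacobian = np.hstack([diff / dist[:, None], -np.ones((4, 1))])
        x = x - np.linalg.solve(jacobian, residual)
    return x[:3], x[3] / c


est_pos, est_time = solve_position(sat_pos, sat_time)
print("position error (m):", np.linalg.norm(est_pos - true_pos))
print("time error (s)    :", abs(est_time - true_time))
```

The time unknown is carried as $ct$, in metres, so that the linear system solved at each step is well scaled.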


Apart possibly from high-energy accelerators, there are no other engineering systems in existence today in which both special and general relativity have so many applications. The system is based on the principle of the constancy of c in a local inertial frame: the Earth-Centred Inertial or ECI frame. Time dilation of moving clocks is significant for clocks in the satellites as well as clocks at rest on earth. The weak principle of equivalence finds expression in the presence of several sources of large gravitational frequency shifts. Also, because the earth and its satellites are in free fall, gravitational frequency shifts arising from the tidal potentials of the moon and sun are only a few parts in $10^{16}$ and can be neglected. The Sagnac effect has an important influence on the system. Since most GPS users are at rest or nearly so on earth’s surface, it would be highly desirable to synchronise clocks in a rotating frame fixed to the earth (an Earth-Fixed, Earth-Centred Frame or ECEF Frame). However because the earth rotates, this is prevented by the Sagnac effect, which is large enough in the GPS to be significant. Inconsistencies occurring in synchronisation processes conducted on the Earth’s surface by using light signals, or with slowly moving portable clocks, are path-dependent and can be many dozens of nanoseconds, too large to tolerate in the GPS. Thus the Sagnac effect forces a different choice for synchronisation convention. Also, the path of a signal in the ECEF is not “straight.” In the GPS, synchronisation is performed in the ECI frame; this solves the problem of path-dependent inconsistencies. Several sources of relativistic effects enter in determining the unit of time, the SI second as realized by the U.S. Naval Observatory (USNO).
For a clock fixed on earth, time dilation arising from earth's spinning motion can be viewed alternatively as a contribution, in the ECEF frame, to the total effective gravitational potential which also includes contributions arising from earth's non-sphericity. Earth-fixed clocks placed on the same equipotential surface of this effective field all beat at the same rate. Over the span of geological time, the earth's figure has distorted so that it nearly matches one of these gravitational equipotentials: the earth's geoid at mean sea level. The SI second is defined by the rate of atomic clocks on the geoid. This rate is determined to sufficient accuracy, relative to clocks at infinity, by three effects: time dilation due to earth's spin, and frequency shifts due to the monopole and quadrupole potentials of earth.

In General Relativity (GR), coordinate time, such as is expressed approximately by a slow-motion, weak-field metric, covers the solar system. The proper time elapsed on a moving clock depends on the clock's position and velocity in the fields of nearby masses, and can be computed in terms of the elapsed coordinate time if the velocities, positions, and masses are known. Conversely, the elapsed coordinate time can be computed by integrating corrections to the proper time.

The concept of coordinate time in a local inertial frame is established for the GPS as follows. In the local ECI frame, imagine a network of atomic clocks at rest and synchronised using constancy of $c$. To each real, moving clock apply corrections to yield a paper clock which then agrees with one of these hypothetical clocks in the underlying inertial frame, with which the moving clock instantaneously coincides. The time resulting from such corrections is then a coordinate time, free from inconsistencies, whose rate is determined by clocks at rest on the earth's rotating geoid.

Relativistic effects on satellite clocks can be combined in such a way that only two corrections need be considered. First, the average frequency shift of clocks in orbit is corrected downward in frequency by 446.47 parts in $10^{12}$. This is a combination of five different sources of relativistic effects: gravitational frequency shifts of ground clocks due to earth's monopole and quadrupole moments, gravitational frequency shifts of the satellite clock, and second-order Doppler shifts from motion of satellite and earth-fixed clocks. Second, if the orbit is eccentric, an additional correction arises from a combination of varying gravitational and motional frequency shifts as the satellite's distance from earth varies. This correction is periodic and is proportional to the orbit eccentricity. For an eccentricity of 0.01, the amplitude of this term is 23 ns. Due to a shortage of computer resources on satellites in the early days of GPS, it was decided that this latter correction was to be the responsibility of software in GPS receivers. It is a correction which must be applied to the broadcast time of signal transmission, to obtain the coordinate time epoch of the transmission event in the ECI frame.

At the time of launch of the first NTS-2 satellite (June 1977), which contained the first Cesium clock to be placed in orbit, there were some who doubted that relativistic effects were real. A frequency synthesiser was built into the satellite clock system so that after launch, if in fact the rate of the clock in its final orbit was that predicted by GR, then the synthesiser could be turned on bringing the clock to the coordinate rate necessary for operation. The atomic clock was first operated for about 20 days to measure its clock rate before turning on the synthesiser. The frequency measured during that interval was 442.5 parts in $10^{12}$ faster than clocks on the ground; if left uncorrected this would have resulted in timing errors of about 38,000 nanoseconds per day.
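As a rough cross-check of the figures quoted above, a fractional frequency offset can be converted into a timing error accumulated over one day. The Python sketch below is illustrative only: the function name `daily_drift_ns` and the variable names are assumptions introduced here and are not part of the article.

```python
SECONDS_PER_DAY = 86_400  # number of seconds in one day

def daily_drift_ns(fractional_offset: float) -> float:
    """Timing error in nanoseconds accumulated in one day by a clock whose
    rate differs from the reference by the given fractional amount."""
    return fractional_offset * SECONDS_PER_DAY * 1e9

measured_offset = 442.5e-12    # measured shift quoted in the article
predicted_offset = 446.47e-12  # predicted shift quoted in the article

measured_drift = daily_drift_ns(measured_offset)
predicted_drift = daily_drift_ns(predicted_offset)

print(f"measured:  {measured_drift:.0f} ns per day")
print(f"predicted: {predicted_drift:.0f} ns per day")
```

Both values come out a little above 38,000 ns per day, consistent with the figure quoted in the text.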
The difference between predicted and measured values of the frequency shift was only 3.97 parts in $10^{12}$, well within the accuracy capabilities of the orbiting clock. This then gave about a 1% validation of the combined motional and gravitational shifts for a clock at 4.2 earth radii.

At present one cannot easily perform tests of relativity with the system because the SV clocks are actively steered to be within 1 microsecond of Universal Coordinated Time (USNO). Several relativistic effects are too small to affect the system at current accuracy levels, but may become important as the system is improved; these include gravitational time delays, frequency shifts of clocks in satellites due to earth's quadrupole potential, and space curvature.

This system was intended primarily for navigation by military users having access to encrypted satellite transmissions which are not available to civilian users. Uncertainty of position determination in real time by using the Precise Positioning code is now about 2.4 meters. Averaging over time and over many satellites reduces this uncertainty to the point where some users are currently interested in modelling many effects down to the millimetre level. Even without this impetus, the GPS provides a rich source of examples for the applications of the concepts of relativity.

New and surprising applications of position determination and time transfer based on GPS are continually being invented. Civilian applications include, for example, tracking elephants in Africa, studies of crustal plate movements, surveying, mapping, exploration, salvage in the open ocean, vehicle fleet tracking, search and rescue, power line fault location, and synchronisation of telecommunications nodes. About 60 manufacturers now produce over 350 different commercial GPS products. Millions of receivers are being made each year; prices of receivers at local hardware stores start in the neighbourhood of $200.

Neil Ashby is at the University of Colorado.

The article was taken from: the newsletter of the Topical Group in Gravitation of the American Physical Society, Number 9, Spring 1997. See: http://www.phys.lsu.edu/mog/ Ashby’s homepage is at: http://physics.colorado.edu/directory/faculty/ashby n.html


G.3

Some Notes

The operation of GPS relies on synchronisation between the clocks on a satellite and at the surface of the Earth. The two dominant effects from relativity are firstly due to the speed of the satellite and secondly its position in a weaker gravitational field.

1. The first effect is due to time dilation. If $S$ and $S'$ are inertial frames such that $S'$ moves relative to $S$ with speed $u$ in the $x$-direction then time- and space-intervals transform according to the Lorentz transformation

\[
\Delta t' = \gamma\left(\Delta t - \frac{u}{c^2}\,\Delta x\right), \qquad
\Delta x' = \gamma\,(\Delta x - u\,\Delta t),
\tag{1021}
\]

where $\gamma = 1/\sqrt{1 - u^2/c^2}$. The inverse is

\[
\Delta t = \gamma\left(\Delta t' + \frac{u}{c^2}\,\Delta x'\right), \qquad
\Delta x = \gamma\,(\Delta x' + u\,\Delta t').
\tag{1022}
\]

We now suppose that we are measuring time intervals on a clock that is at rest in $S'$. Then $\Delta x' = 0$ and the time interval $\Delta t'$ is known as the proper time, denoted $\Delta\tau$. Therefore we obtain from the above equations

\[
\Delta t = \gamma\,\Delta\tau
\qquad\text{and}\qquad
\Delta x = u\,\gamma\,\Delta\tau.
\tag{1023}
\]

From the first equation we have

\[
\Delta\tau = \sqrt{1 - \frac{u^2}{c^2}}\;\Delta t \le \Delta t.
\tag{1024}
\]

A clock moving uniformly with speed $u$ through an inertial frame goes slow by a factor $\sqrt{1 - u^2/c^2}$ relative to a stationary clock in the frame.
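The relations above can be checked numerically. The following Python sketch applies the inverse Lorentz transformation to a clock at rest in the moving frame and confirms the time dilation factor. The function names and the speed $u = 0.6c$ are illustrative assumptions chosen here, not values from these notes.

```python
import math

C = 3.0e8  # speed of light in m/s, rounded as in these notes

def gamma(u: float, c: float = C) -> float:
    """Lorentz factor 1 / sqrt(1 - u^2 / c^2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def lorentz(dt: float, dx: float, u: float, c: float = C):
    """Intervals (dt, dx) in S mapped to (dt', dx') in S', as in (1021)."""
    g = gamma(u, c)
    return g * (dt - u * dx / c**2), g * (dx - u * dt)

def lorentz_inverse(dt_p: float, dx_p: float, u: float, c: float = C):
    """Intervals (dt', dx') in S' mapped back to (dt, dx) in S, as in (1022)."""
    g = gamma(u, c)
    return g * (dt_p + u * dx_p / c**2), g * (dx_p + u * dt_p)

u = 0.6 * C   # illustrative speed (assumption), giving gamma = 1.25
dtau = 1.0    # proper time interval in seconds on a clock at rest in S'

# Clock at rest in S': dx' = 0 and dt' = dtau.
dt, dx = lorentz_inverse(dtau, 0.0, u)

# Transforming back should recover the proper time and dx' = 0.
dt_back, dx_back = lorentz(dt, dx, u)

print(f"gamma = {gamma(u):.4f}")
print(f"dt    = {dt:.4f} s, dx = {dx:.4e} m")
print(f"dtau / dt = {dtau / dt:.4f}")
```

The ratio `dtau / dt` equals $\sqrt{1 - u^2/c^2}$, which is 0.8 for the speed assumed in this sketch.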

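Applying this to the satellite and Earth clocks we have

\[
T_{\rm sat} = \sqrt{1 - \frac{u^2}{c^2}}\; T_{\rm earth}
\approx \left(1 - \frac{u^2}{2c^2}\right) T_{\rm earth}
\tag{1025}
\]

or

\[
T_{\rm sat} - T_{\rm earth} = -\frac{u^2}{2c^2}\, T_{\rm earth}.
\tag{1026}
\]

Taking $u = 3900$ m/s, $c = 3 \times 10^{8}$ m/s and $T_{\rm earth} = 1$ day $= 8.64 \times 10^{10}$ microseconds gives

\[
T_{\rm sat} - T_{\rm earth} = -7.3 \text{ microseconds}.
\tag{1027}
\]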
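The size of the special-relativistic term, and the quality of the first-order approximation used to obtain it, can be checked with a short Python sketch. The function names are assumptions introduced here; the numerical inputs are the ones quoted in the notes.

```python
import math

C = 3.0e8            # speed of light in m/s, as rounded in the notes
U_SAT = 3900.0       # satellite speed in m/s
T_EARTH_US = 8.64e10 # one day in microseconds

def sr_offset_first_order(u: float, t_earth: float, c: float = C) -> float:
    """First-order estimate -(u^2 / 2c^2) * T_earth of T_sat - T_earth."""
    return -(u**2) / (2.0 * c**2) * t_earth

def sr_offset_exact(u: float, t_earth: float, c: float = C) -> float:
    """(sqrt(1 - u^2/c^2) - 1) * T_earth without the binomial expansion.

    sqrt(1 - x) - 1 is rewritten as -x / (sqrt(1 - x) + 1) so that the
    tiny difference is not lost to floating-point cancellation.
    """
    x = (u / c) ** 2
    return -x / (math.sqrt(1.0 - x) + 1.0) * t_earth

approx = sr_offset_first_order(U_SAT, T_EARTH_US)
exact = sr_offset_exact(U_SAT, T_EARTH_US)

print(f"first order: {approx:.4f} microseconds per day")
print(f"exact:       {exact:.4f} microseconds per day")
```

At orbital speeds the two results agree to far better than a nanosecond per day, so the expansion costs nothing in accuracy here.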


2. The second effect arises from general relativity. For clocks at rest in a gravitational field

\[
d\tau = \sqrt{1 + \frac{2\varphi}{c^2}}\; dt
\tag{1028}
\]

where $\varphi$ is the gravitational potential. When $\varphi = 0$ we have $d\tau = dt$ so that the coordinate time $t$ is identified with a clock in a gravity-free region. The potential due to a mass $M$ is

\[
\varphi = -\frac{GM}{r}
\tag{1029}
\]

where $G$ is the gravitational constant, so that

\[
d\tau = \sqrt{1 - \frac{2GM}{c^2 r}}\; dt.
\tag{1030}
\]

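If $A$ and $B$ are two points in the field with $r_A < r_B$ then equating the coordinate time interval gives

\[
\begin{aligned}
d\tau_B &= \frac{\sqrt{1 - \dfrac{2GM}{c^2 r_B}}}{\sqrt{1 - \dfrac{2GM}{c^2 r_A}}}\; d\tau_A \\
&\approx \left(1 - \frac{GM}{c^2 r_B}\right)\left(1 + \frac{GM}{c^2 r_A}\right) d\tau_A \\
&\approx \left[1 + \frac{GM}{c^2}\left(\frac{1}{r_A} - \frac{1}{r_B}\right)\right] d\tau_A
\end{aligned}
\tag{1031}
\]

giving

\[
d\tau_B - d\tau_A \approx \frac{GM}{c^2}\,\frac{r_B - r_A}{r_A r_B}\; d\tau_A.
\tag{1032}
\]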

Therefore $d\tau_B > d\tau_A$, which implies that clocks go faster in a weaker gravitational field. Applying this to the clocks on the satellite and the Earth gives

\[
T_{\rm sat} - T_{\rm earth} \approx \frac{GM}{c^2}\,\frac{r_B - r_A}{r_A r_B}\; T_{\rm earth}.
\tag{1033}
\]

Using $G = 6.673 \times 10^{-11}$ N m$^2$ kg$^{-2}$, $c = 3 \times 10^{8}$ m/s and $M = 6 \times 10^{24}$ kg gives $GM/c^2 = 0.0044$ m. With $r_A = 6.37 \times 10^{6}$ m, $r_B = 26.37 \times 10^{6}$ m and $T_{\rm earth} = 1$ day $= 8.64 \times 10^{10}$ microseconds this gives

\[
T_{\rm sat} - T_{\rm earth} = 45.3 \text{ microseconds}.
\tag{1034}
\]

Combining the two effects gives

\[
T_{\rm sat} - T_{\rm earth} = 38 \text{ microseconds}
\tag{1035}
\]

so that each day the satellite clock is faster by 38 microseconds.
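The two contributions and their sum can be reproduced with a short Python sketch. The function and variable names are assumptions introduced here; the inputs are those quoted in the notes. The gravitational term is evaluated twice: once with $GM/c^2$ rounded to 0.0044 m, as in the notes, and once with the unrounded product of the quoted constants, to show how sensitive the result is to that rounding.

```python
G = 6.673e-11        # gravitational constant in N m^2 kg^-2
M_EARTH = 6.0e24     # mass of the Earth in kg, as rounded in the notes
C = 3.0e8            # speed of light in m/s, as rounded in the notes
R_A = 6.37e6         # radius of the ground clock in m
R_B = 26.37e6        # orbital radius of the satellite clock in m
U_SAT = 3900.0       # satellite speed in m/s
T_EARTH_US = 8.64e10 # one day in microseconds

def gr_offset(gm_over_c2: float, r_a: float, r_b: float, t_earth: float) -> float:
    """Gravitational term (GM/c^2) * (r_B - r_A) / (r_A r_B) * T_earth."""
    return gm_over_c2 * (r_b - r_a) / (r_a * r_b) * t_earth

def sr_offset(u: float, t_earth: float, c: float = C) -> float:
    """Velocity term -(u^2 / 2c^2) * T_earth."""
    return -(u**2) / (2.0 * c**2) * t_earth

gm_over_c2 = G * M_EARTH / C**2  # about 0.00445 m before rounding

sr = sr_offset(U_SAT, T_EARTH_US)
gr_rounded = gr_offset(0.0044, R_A, R_B, T_EARTH_US)   # rounded GM/c^2 from the notes
gr_full = gr_offset(gm_over_c2, R_A, R_B, T_EARTH_US)  # unrounded GM/c^2

print(f"velocity term:           {sr:.2f} microseconds per day")
print(f"gravity term (rounded):  {gr_rounded:.2f} microseconds per day")
print(f"gravity term (full):     {gr_full:.2f} microseconds per day")
print(f"net (rounded GM/c^2):    {gr_rounded + sr:.2f} microseconds per day")
print(f"net (full GM/c^2):       {gr_full + sr:.2f} microseconds per day")
```

With the rounded value the sketch gives about 45.3 and 38.0 microseconds per day, matching the figures in the notes. Carrying the unrounded $GM/c^2$ through raises both by roughly half a microsecond, which is within the precision of the rounded inputs used here.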


Tensor relations from bivector field equation.