In my last post on this topic, I explained that there are two distinct ways of expressing the relationship between mass and energy in special relativity. One is known as concomitance, where mass and energy are really the same thing, and different observers moving at a constant speed relative to each other will disagree on the magnitudes of both quantities. The other is the interconvertibility approach: mass and energy can be converted into each other, mass being Lorentz-invariant (all inertial observers agree on its magnitude) while energy is not. In the former view, the equation E = mc² expresses the identity between the two concepts and can be applied to any macroscopic body, while in the latter it is a conversion formula used to calculate how much energy a given mass can be converted to, and vice versa.

I learnt about this from Stuart Leadstone, a friend from the History of Physics Group, with whom I entered into a correspondence some time ago on a different topic, which somehow turned into a discussion of special relativity. Stuart and I disagree on the “correct” approach to energy and mass (I learnt my physics from the interconvertibility school, while he is an unreconstructed “concomitanceist”), but we both agree on the absurdity of texts which appear to hold both of these incompatible viewpoints simultaneously; it was he who sent me the textbook example where both are used on the same page.

I had actually come across one aspect of this issue some time previously, when I picked up a popular science book called The Black Hole War at a friend’s house and, reading from it at random, stumbled across the author’s explanation of E = mc² in terms of a very small increase in weight (which is far too small, however, for our current best weighing devices to detect) when an object is heated. This is a logical consequence of concomitance – as long as one is prepared to assume that relativistic mass is affected by gravity in the same way as rest mass – but it seemed odd to me at that time because I had no idea that the concomitance view was still so popular. And the author of this book ought to know what he is talking about, since he is Leonard Susskind, a professor of theoretical physics.
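To put a rough number on "far too small", here is a minimal sketch of the arithmetic on the concomitance reading, where the added heat is simply divided by c². The choice of one kilogram of water, the 100 K temperature rise and the approximate specific heat are my own illustrative assumptions, not figures taken from the book.

```python
# Mass equivalent of added heat, taking m = E / c^2 at face value
# (the concomitance reading described above).

C = 299_792_458.0  # speed of light in m/s (exact by definition)


def mass_equivalent(energy_joules: float) -> float:
    """Return the mass in kg corresponding to an energy in joules via m = E / c^2."""
    return energy_joules / C**2


# Illustrative assumptions (not from the book):
SPECIFIC_HEAT_WATER = 4186.0  # J/(kg K), approximate value for liquid water
mass_kg = 1.0                 # amount of water being heated
delta_t_kelvin = 100.0        # temperature rise

heat_added = mass_kg * SPECIFIC_HEAT_WATER * delta_t_kelvin  # joules
delta_m = mass_equivalent(heat_added)

print(f"Heat added:      {heat_added:.3e} J")
print(f"Mass equivalent: {delta_m:.3e} kg")
```

Under these assumptions the result is a few parts in a trillion of the original kilogram, which gives a sense of why no balance is expected to register it.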

For some reason, this made me want to check back with the textbook I had learnt my relativity from, and I was horrified to see the same “mass of heat” example in there too (I had clearly not read it properly first time round). In fact this version was a whole lot worse, because it had been deduced using the interconvertibility approach, in which mass and energy are not regarded as simply “the same thing”.

It was done by a sort of mathematical sleight of hand. Recall that the interconvertibility version of Einstein’s equation is

E² = p²c² + m²c⁴

and this equation had indeed been used, but then the book argued that since the body being heated was at rest in the laboratory frame, p = 0 and hence the equation simplifies to E = mc² and, hey presto, the increase in energy produced by heating the object has led to a corresponding increase in mass.
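For a single structureless particle, the algebraic step the book relies on can be checked numerically. The sketch below evaluates E² = (pc)² + (mc²)² and confirms that it collapses to E = mc² when p = 0. The function name, the use of an electron and the choice p = mc are hypothetical, picked only for illustration; nothing here says anything about a heated macroscopic body.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact by definition)


def total_energy(rest_mass_kg: float, momentum: float) -> float:
    """Energy from E^2 = (p c)^2 + (m c^2)^2 for one structureless particle."""
    return math.hypot(momentum * C, rest_mass_kg * C**2)


# Illustrative assumption: an electron, with an approximate rest mass.
ELECTRON_MASS = 9.109e-31  # kg

e_rest = total_energy(ELECTRON_MASS, 0.0)                  # the p = 0 case
e_moving = total_energy(ELECTRON_MASS, ELECTRON_MASS * C)  # p = m c, chosen arbitrarily

print(f"p = 0:   E = {e_rest:.4e} J, m c^2 = {ELECTRON_MASS * C**2:.4e} J")
print(f"p = m c: E / (m c^2) = {e_moving / e_rest:.4f}")
```

With p = mc the ratio comes out as the square root of two, so the momentum term is far from negligible once the particle is moving.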

The fallacy here is not too difficult to spot. Of course the body is not at rest in the laboratory frame! It is composed of billions upon billions of atoms or molecules which are all jiggling around, and this jiggling, which of course is what we call heat, corresponds to a non-zero p at the scale of individual molecules. It’s not clear how one could apply Einstein’s equation to such a complicated ensemble of particles.

And this leads us to a crucial realisation – the bodies on which Einstein based his derivation of SR are far too simple to represent actual bodies in anything but a most rudimentary way; in particular, they have no internal structure, and so are simply incapable of heating up. We cannot apply equations derived from such a model to a real-life body which has properties absent in the model.

This is not something that is peculiar to SR. I can remember being very confused, way back when I was doing applied maths at school, that in some collisions, called elastic collisions, both linear momentum and kinetic energy were conserved, while there were other collisions, called inelastic collisions, where one of these conservation laws does not apply, and to be honest I spent an awful lot of time trying to remember which one it was. It was kinetic energy of course – linear momentum is always conserved – but under certain circumstances you could have a collision, say between two lumps of putty, where the two lumps coalesce and kinetic energy is not conserved. Of course there is permanent deformation of one or both bodies, and this produces heat, so the overall energy is conserved. But the important thing to realise is that you cannot predict this behaviour from anything intrinsic to the model – you have to have some sort of additional information, such as that one body is made of putty. And the models we used for these sorts of problem were idealised billiard balls, which do not really correspond very closely to anything real; rather like Einstein’s “rigid bodies”, in fact.
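The school-level distinction can be made concrete with a one-dimensional sketch. It uses the standard textbook formulas for a perfectly elastic collision and for a perfectly inelastic one (the putty case, where the bodies coalesce). The masses and velocities are arbitrary illustrative values, and notice that which function to call has to be decided from outside the model, which is exactly the point made above.

```python
def elastic_collision(m1, u1, m2, u2):
    """Final velocities for a 1D perfectly elastic collision."""
    v1 = ((m1 - m2) * u1 + 2 * m2 * u2) / (m1 + m2)
    v2 = ((m2 - m1) * u2 + 2 * m1 * u1) / (m1 + m2)
    return v1, v2


def perfectly_inelastic_collision(m1, u1, m2, u2):
    """Final velocities when the two bodies coalesce and move off together."""
    v = (m1 * u1 + m2 * u2) / (m1 + m2)
    return v, v


def momentum(m1, v1, m2, v2):
    return m1 * v1 + m2 * v2


def kinetic_energy(m1, v1, m2, v2):
    return 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2


# Arbitrary illustrative values: a 2 kg body at 3 m/s hits a 1 kg body at rest.
m1, u1, m2, u2 = 2.0, 3.0, 1.0, 0.0

for label, collide in [("elastic", elastic_collision),
                       ("inelastic", perfectly_inelastic_collision)]:
    v1, v2 = collide(m1, u1, m2, u2)
    print(f"{label:9s} momentum {momentum(m1, u1, m2, u2):.1f} -> "
          f"{momentum(m1, v1, m2, v2):.1f} kg m/s, "
          f"kinetic energy {kinetic_energy(m1, u1, m2, u2):.1f} -> "
          f"{kinetic_energy(m1, v1, m2, v2):.1f} J")
```

Momentum is unchanged in both cases, while the kinetic energy survives only in the elastic one; the shortfall in the putty case is the energy that goes into deformation and heat.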

This problem with SR can be stated in a somewhat different way by describing it as a principle theory as opposed to a constructive theory. Principle theories start from a small number of general principles, from which everything else is logically derived. (Do not be confused by Einstein’s use of the term “thought experiment”; these were simple logical deductions, and no actual experiments were done.) Constructive theories predict the behaviour of macroscopic objects in terms of their microscopic constituents (atoms). The archetypal example of this is the physics of gases. Thermodynamics is a principle theory, based on certain laws, and describing the bulk properties of macroscopic volumes of gas. Kinetic theory, on the other hand, seeks to explain thermodynamics in terms of the motions of gas molecules; it is a constructive theory, which underpins the seemingly arbitrary assumptions of thermodynamics, and explains it.

But there is no constructive theory to underpin special relativity. Einstein was aware of this; as early as 1908, he wrote that “a physical theory can only be satisfactory when it builds up its structures from elementary foundations”. Of the idealised measuring-rods and clocks from which special relativity was derived, he later said that strictly speaking they should be “represented as solutions of the basic equations (objects consisting of moving atomic configurations), not, as it were, as theoretically self-sufficient entities.” But to my knowledge, nobody has yet done this.

Philosophers of science are aware of this problem. Harvey Brown has written widely on the topic, notably in a 2005 paper entitled Einstein’s Misgivings about his 1905 Formulation of Special Relativity. But Brown’s main concern here seems to be with considerations of aesthetics and completeness; he does not seem to regard a constructive theory as an essential component so much as a merely desirable one. And Dennis Dieks has written about “bottom-up” (constructive) and “top-down” (principle) approaches to SR in the context of pluralism – he says that it is a matter of pragmatics, and “there is no uniquely best way of explaining the relativistic effects”.

How, though, would we go about building a constructive theory? I take it that we would need to regard the Lorentz transformations – the equations that relate measurements by different observers to one another – as true only of elementary particles (atoms? electrons?) and then build a slightly more complicated model consisting of two particles bound together (a molecule) and work up from there. (Even this would be verging dangerously on over-simplification, since an atom can absorb a mechanical impact by a process of excitation, which, for completeness, should also be built into the model.)

But hold on a minute. Can we even assume the Lorentz transformations hold for atoms? The classic derivation involves one observer sending a light signal to another observer, moving at a constant speed relative to the first observer, and the second observer reflecting the signal back to the first observer who then records the time at which the reflected signal arrives. But you can’t just “reflect” light from individual atoms or electrons – the incident photon will transfer momentum to the “observer”, which is then no longer moving in the same inertial frame, and it is not clear that the photon will simply “bounce back”. In fact you probably need to involve quantum mechanics. So this would be a decidedly non-trivial exercise.
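To get a feel for the size of the momentum transfer, here is a rough estimate. It pretends, purely for the sake of the arithmetic, that the photon does bounce straight back with its wavelength unchanged, so that the atom picks up twice the photon momentum h/λ, and it treats the atom non-relativistically. Both are simplifying assumptions of mine, as are the choice of a hydrogen atom and a 500 nm photon.

```python
PLANCK_H = 6.62607015e-34  # Planck constant in J s (exact by definition)
HYDROGEN_MASS = 1.6735e-27  # kg, approximate mass of a hydrogen atom


def photon_momentum(wavelength_m: float) -> float:
    """Photon momentum p = h / wavelength."""
    return PLANCK_H / wavelength_m


def recoil_speed_on_reflection(wavelength_m: float, target_mass_kg: float) -> float:
    """Speed gained by a target initially at rest if the photon reverses direction.

    Idealisation: the wavelength is taken as unchanged, so the momentum
    transferred is 2h / wavelength, and the target is treated non-relativistically.
    """
    return 2.0 * photon_momentum(wavelength_m) / target_mass_kg


# Illustrative assumption: visible light of wavelength 500 nm.
wavelength = 500e-9  # metres
recoil = recoil_speed_on_reflection(wavelength, HYDROGEN_MASS)
print(f"Recoil speed of the atom: {recoil:.2f} m/s")
```

On these assumptions a single visible photon changes the atom's speed by an amount of the order of a metre per second, so the atomic "observer" does not stay in its original inertial frame in the way a laboratory mirror effectively does.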

And yet one thing we can be fairly sure about is that SR does work at the sub-atomic scale. The effect of time dilation (a consequence of the theory) on the lifetimes of muons has been well studied, and found to be consistent with SR – in fact it was among the earliest experimental verifications of the theory. But I for one would want a lot more convincing that we can justify the applicability of the full theory at this level.
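For reference, the time-dilation arithmetic behind the muon result is short enough to sketch. The speed of 0.98c is an assumed, illustrative figure, and the proper lifetime is an approximate value for the muon's mean lifetime at rest.

```python
import math

MUON_PROPER_LIFETIME = 2.197e-6  # seconds, approximate mean lifetime at rest


def lorentz_factor(beta: float) -> float:
    """Lorentz factor gamma = 1 / sqrt(1 - beta^2), where beta = v / c."""
    return 1.0 / math.sqrt(1.0 - beta**2)


# Illustrative assumption: a muon moving at 98% of the speed of light.
beta = 0.98
gamma = lorentz_factor(beta)
dilated_lifetime = gamma * MUON_PROPER_LIFETIME

print(f"gamma = {gamma:.3f}")
print(f"Mean lifetime in the laboratory frame: {dilated_lifetime:.3e} s")
```

At this speed the laboratory-frame lifetime is about five times the rest-frame value, which is the kind of factor the muon measurements are compared against.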