A topic in another thread made me think of how easy it is to miss (without knowing it) so much that is vital for those interested in metaphysical or philosophical perspectives concerning the nature of reality (e.g., materialism, physicalism, dualism, etc.), the nature and philosophy of the sciences and of physics, and related topics. Specifically, the role mathematics plays in “fundamental” physical theories, from the well-established to the highly theoretical, is incredibly important to understanding reality, and it is something that few physicists and far fewer mathematicians ever deal with. Most physicists don’t know much about quantum mechanics, let alone the far more difficult quantum field theory (the heart of the standard model of particle physics); they aren’t theoretical physicists or cosmologists trying to create GUTs like string theory or M-theory, and they are wont to view the mathematics used in “fundamental” physics as playing basically the same role mathematics did and does in Newtonian mechanics or classical electromagnetism. Of course, scientists in other fields and non-scientists are even more likely to be unaware of the vital role mathematics plays in how we understand the most fundamental “layer(s)” of reality.
For example, in classical physics, as in e.g., QM & QFT, we have notions like velocity, momentum, position, mass, etc. In other words, we study systems that have measurable/observable properties we call “observables”. Thus, in classical physics, when we say that a system has a mass of 1 kilogram or a speed of 1,000 meters per second, it’s clear how the values “1” or “1,000” correspond to the system. True, we could change the values to various equivalents, e.g., by expressing the speed in feet per second or miles per hour, but whatever the value, there is a 1-to-1 (direct) relationship between that value and the observable (property) we wish it to represent.
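To make the classical picture concrete, here is a minimal sketch (the class name, fields, and conversion factor are my own illustrative choices, not anything drawn from a physics library): a classical state is just a collection of directly readable numbers, and changing units merely relabels them.

```python
from dataclasses import dataclass

@dataclass
class ClassicalParticle:
    """A classical state: every observable is a directly readable number."""
    mass_kg: float      # mass in kilograms
    speed_mps: float    # speed in meters per second

    def speed_mph(self) -> float:
        """Same observable in different units: a 1-to-1 relabeling, not a new quantity."""
        return self.speed_mps * 2.23694  # 1 m/s is about 2.23694 mph

p = ClassicalParticle(mass_kg=1.0, speed_mps=1000.0)
print(p.speed_mps, p.speed_mph())  # both numbers denote the very same property
```

Whatever units we pick, the number is read straight off the state; nothing mediates between the value and the property it represents.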
This stops being true long before untested GUTs like M-theory. The breakdown of this direct relationship is, in a very real sense, at the heart of quantum mechanics, and in a certain way it IS QFT/particle physics. It is well-known that classical mechanics breaks down at very small (and very large) scales, and also well-known that we can “reduce” classical mechanics to quantum mechanics but the reverse is not true. Unfortunately, when the creators of quantum theory developed quantum mechanics and QED (the earliest quantum field theory) in the early-to-mid 20th century, they retained the language of classical physics but changed the meanings of the terms used. I will use a description of an idealized, abstract, and simplified experiment in quantum mechanics to illustrate.
Even those who know very little about mathematics or physics are often aware that things like wave functions and the Schrödinger equation play a large role in QM. They may even know that the wave function is one way of expressing the state of a system in quantum mechanics. But they usually don’t know how physicists derive wave functions in any given experiment. To understand this requires a bare-bones description of such experiments, and the best I know of is from section II of Stapp’s paper The Copenhagen Interpretation (American Journal of Physics Vol. 40), which can also be found in section 3.2 of his Mind, Matter, and Quantum Mechanics (3rd Ed.): “Quantum theory is a procedure by which scientists predict probabilities that measurements of specified kinds will yield results of specified kinds in situations of specified kinds.” Concise? Yes. Informative? Hardly. But luckily there’s more:
“First, [the experimental physicist] transforms his information about the preparation of the system into an initial wave function. Then he applies to it some linear transformation, calculated perhaps from the Schrödinger equation, or perhaps from the S matrix, which converts the initial wave function into a final wave function. This final wave function, which is built on the degrees of freedom of the measured system, is then folded into the wave function corresponding to a possible result…
The essential points are that attention is focused on some system that is prepared in a specified manner and later examined in a specified manner. Quantum theory is a procedure for calculating the predicted probability that the specified type of examination will yield some specified result. This predicted probability is the predicted limit of the relative frequency of occurrence of the specified result, as the number of systems prepared and examined in accordance with the specifications goes to infinity.”
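Before unpacking this, it may help to see the bare skeleton of the procedure Stapp describes rendered in code. The following is a minimal numerical sketch, not anyone’s actual experiment: the two-level system, the particular unitary, and all variable names are my own illustrative choices.

```python
import numpy as np

# Step 1: encode the preparation as an initial wave function
# (here, a two-level system for brevity).
psi_initial = np.array([1.0, 0.0], dtype=complex)

# Step 2: apply a linear transformation, a stand-in for evolution calculated
# from the Schrödinger equation or an S matrix (here, a Hadamard unitary).
U = np.array([[1, 1],
              [1, -1]], dtype=complex) / np.sqrt(2)
psi_final = U @ psi_initial

# Step 3: "fold" the final wave function into the wave function corresponding
# to a possible result (an inner product), then take the squared magnitude.
phi_result = np.array([0.0, 1.0], dtype=complex)
probability = abs(np.vdot(phi_result, psi_final)) ** 2

print(probability)  # 0.5: the predicted probability of this specified result
```

Notice that none of these three steps describes the properties of a physical system; every object is a vector or a matrix, and the only output is a probability.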
Now to unpack the above a bit. First, notice that this “procedure” describing how quantum theory is used involves two wave functions (or their mathematical equivalents). It may seem like the “initial” one is similar to some mathematical description you might find in classical physics: you set up your system and use math to describe it (e.g., the mathematical description of the system’s state). But why do we need another wave function in order to describe the “measured system”? Why isn’t the “measured system” the same as the system we prepared, and why doesn’t either wave function give us the state of any system? Second, what is “relative frequency”, and what does “goes to infinity” mean? The key to the first question is that, in QM, when we “prepare” a system we fundamentally disturb it, repeatedly. In fact, we keep disturbing it over and over again to obtain a kind of “average” value of measurements that we collectively call “preparation”, and we then transcribe this into a wave function that doesn’t correspond in any known way to whatever is left after we fundamentally disturbed the “system” by “preparing” it. Since we no longer know what our wave function describes in terms of anything physical, we can’t predict anything unless we can relate the wave function to a very specific kind of later “measurement”. So specific, in fact, that we need to construct another “system” which doesn’t exist either. The procedure never tells us what any physical system is like; it tells us that, given that we disturbed a particular system X in a specified manner Y, we can use measurement specifications Z to determine the probability that X, Y, and Z will yield a particular result. This is the “relative frequency” part: imagine flipping a fair coin. The probability of getting heads is not 50% because in any n tosses we will always get heads 50% of the time, but because the proportion of heads would approach 50% if we let n go to infinity. Likewise, repeating a Y-specified preparation of X with Z-specified measurements n times, we figure out what percentage of the time each outcome would occur if we let n go to infinity (see the sketch below).
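Here is a minimal simulation of that frequency claim (a sketch; the sample sizes and seed are arbitrary choices of mine): at any finite n the observed proportion of heads wanders around 0.5, and only the limit as n grows is pinned down.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed so the run is reproducible

# Relative frequency of heads after n fair flips, for increasing n.
for n in [10, 100, 10_000, 1_000_000]:
    flips = rng.integers(0, 2, size=n)  # 0 = tails, 1 = heads
    print(n, flips.mean())              # proportion of heads in these n flips

# The proportion is rarely exactly 0.5 at any finite n, but it converges
# toward 0.5 as n goes to infinity (the law of large numbers).
```

Quantum theory’s predicted probabilities are claims of exactly this kind: limits of relative frequencies over indefinitely repeated, identically specified preparations and measurements.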
That, in essence, is quantum theory. The problem, though, is that at no time have we used a mathematical description of our physical system(s) that corresponds to anything physical. Rather, we have used wave functions mathematically derived by taking a “system”, disturbing it repeatedly, and describing the disturbances as “systems”. It is this that allows us to plug these mathematical systems into equations yielding the probabilities that make QM perhaps the most successful theory of all time.
But it gets worse. As in classical physics, we describe “measurements” in terms of “observables”, and we label most of these “observables” the same way (e.g., “momentum”, “position”, etc.). However, in QM our “systems” are mathematical entities. Clearly, we can’t measure/observe the position or momentum of a mathematical function. So “observables” in QM are actually kinds of mathematical functions called operators. Thus the “momentum operator” doesn’t tell us the momentum of anything. It tells us that given a particular system “prepared” in a particular way and “measured” in a particular way, applying the “observable” operator (function) will yield a particular value with a particular probability (i.e., we can’t even just “apply” the operator and get a value we then call momentum, because we don’t get a value).
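A small sketch of this point, using a spin observable because it fits in a few lines (the prepared state and all names here are my own illustrative choices): the “observable” is a matrix, its eigenvalues are the possible measured values, and the Born rule assigns each a probability. At no point does it hand us “the” value of anything.

```python
import numpy as np

# The "observable" is an operator, not a number (here, the Pauli-X spin observable).
sigma_x = np.array([[0, 1],
                    [1, 0]], dtype=complex)

# A "prepared" state: spin-up along z (an illustrative choice).
psi = np.array([1.0, 0.0], dtype=complex)

# The operator's eigenvalues are the possible measurement outcomes,
# and its eigenvectors correspond to the possible "results".
eigenvalues, eigenvectors = np.linalg.eigh(sigma_x)

# Born rule: the probability of each outcome is |<eigenvector|psi>|^2.
for value, vector in zip(eigenvalues, eigenvectors.T):
    prob = abs(np.vdot(vector, psi)) ** 2
    print(f"outcome {value:+.0f} with probability {prob:.2f}")

# Prints: outcome -1 with probability 0.50, outcome +1 with probability 0.50.
# "Applying" the observable yields a distribution over values, never a value.
```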
It gets even worse. We may not know what we are describing with QM, as our “systems” are mathematical and our “observable” properties are mathematical functions, but at least we derived the mathematical structures experimentally. It’s not as if we started out with the mathematical structure/formalism of QM and then applied it; we started out with classical physics and gradually found through experiments that reality itself appears fundamentally “unsharp” or otherwise “vague” in numerous ways, such that the only way to investigate reality at this level was by describing how forcing it to be “non-vague” obeys certain probabilistic tendencies.
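The canonical expression of this experimentally discovered “unsharpness” is the uncertainty relation, quoted here in its standard textbook form purely as an illustration:

```latex
\[
  \Delta x \, \Delta p \;\geq\; \frac{\hbar}{2}
\]
```

No preparation, however careful, yields a system “sharp” in both position and momentum at once; the relation is a bound on the statistics of repeated measurements, not a property of a single well-defined object.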
This is not true of QFT, and therefore of the whole standard model of particle physics. Quantum mechanics is not a relativistic theory, but special relativity has been confirmed experimentally (and Einstein formulated it in 1905 based on experimental findings). QM tells us that the energy of physical systems can fluctuate wildly. Consider a positron: it is “created”, e.g., when a single electron scatters off of something (like a nucleus or a proton). Why? Because such an electron is energetic enough that special relativity requires an “antielectron” in order to conserve the energy of the system, and no Schrödinger equation can describe a system in which “energy” produces (is converted to) “mass” like this. For that we need relativistic quantum mechanics, and QFT is the most complete relativistic quantum theory (hence its status as the foundation of the standard model). But how did we develop the mathematical structures that tell us under what circumstances such a positron is produced? We took the mathematics of QM, took the mathematics of special relativity, and messed around with the mathematical structure of QM until it could incorporate that of special relativity. In other words, it wasn’t that QM stopped working the way classical physics had and we needed something more that we could derive experimentally. After all, the so-called “Copenhagen interpretation” dates from 1927, over 20 years after Einstein published his paper on special relativity, and even before developing his famous wave equation Schrödinger tried his hand at a relativistic one. Everybody involved in the development of QM and its extensions knew this was a problem (and not the only one). The real issue is that we “fixed” the deficiencies of QM mathematically, not through experiments. Thus things like positrons weren’t first discovered empirically; they were things we needed to balance equations (the positron was predicted from Dirac’s relativistic equation before Anderson detected one in 1932). This is why we predicted the Higgs boson and searched for it for years, CERN announced a discovery in 2012, and research on what was found continues: we created the standard model mostly via mathematical extensions of a theory (QM) that deals with mathematical, not physical, systems.
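A schematic illustration of why relativity forces antiparticles into the formalism (standard textbook material, offered only to make the point above concrete): the relativistic energy-momentum relation admits two branches,

```latex
\[
  E^2 = (pc)^2 + (mc^2)^2
  \quad\Longrightarrow\quad
  E = \pm\sqrt{(pc)^2 + (mc^2)^2},
\]
```

and making quantum mechanics consistent with this relation (the route from the Schrödinger equation to the Klein-Gordon and Dirac equations) was exactly the kind of purely mathematical “fix” described above; the negative branch is what Dirac reinterpreted as the antielectron.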
So, what is the relationship between physical reality and mathematics? Well, with quantum mechanics we know that the things we call physical systems aren’t physical, but at least we are clear about the procedures we use, allowing a minimalist interpretation (the orthodox position, on which QM is deemed irreducibly statistical), no matter how unsatisfactory many find this to be. Most of modern “fundamental” physics consists of extensions to QM (in which almost none of the particles of the standard model can exist), and we construct these extensions mathematically, not empirically. Basically, most of modern physics that concerns the nature of reality treats reality as a set of mathematical constructs. So much so, in fact, that information theory has increasingly been adopted as the only suitable language, context, and “ontological” framework for modern physics, because information theory is statistical, non-physical, and perfectly suited to describing “reality” as depicted in modern physics: non-physical.
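To illustrate why information theory fits so naturally (a minimal sketch; the helper function is my own, not part of any physics library): the entire empirical content of a quantum state, as described above, is a probability distribution over specified outcomes, and that is precisely the kind of object information theory quantifies.

```python
import numpy as np

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    probs = np.asarray(probs, dtype=float)
    probs = probs[probs > 0]  # by convention, 0 * log(0) contributes nothing
    return float(-(probs * np.log2(probs)).sum())

# The outcome distribution from the spin sketch above: {+1: 0.5, -1: 0.5}.
print(shannon_entropy([0.5, 0.5]))  # 1.0 bit: maximal uncertainty per measurement
```

On this framing, what quantum theory delivers about “reality” is measured in bits of predictive information, not in kilograms or meters per second.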