Ontologies in physics—7: To understand QM, you have to first solve yet another problem with the EM

In the last post, I mentioned the difficulty introduced by (i) the higher-dimensional nature of \Psi, and (ii) the definition of the electrostatic V on the separation vectors rather than position vectors.

Turns out that while writing the next post to address the issue, I spotted yet another issue. Its maths is straightforward (X–XII standard). But its ontology is not at all so. So, let me mention it. (This upsets the entire planning I had in mind for QM. Anyway, read on.)


In QM, we do use a classical EM quantity, namely, the electrostatic potential energy field. It turns out that the understanding of EM which we have so painfully developed over some 4 very long posts may still not be adequate.

To repeat, the overall maths remains the same. But it’s the physics—rather, the detailed ontological description—which has to be changed. And with it, some small mathematical details would change as well.

I will mention the problem here in this post, but not the solution I have in mind; else this post will become huuuuuge. (Explaining the problem itself is going to take a bit.)


Consider a hydrogen atom in an otherwise “empty” infinite space as our quantum system.

The proton and the electron interact with each other. To spare me too much typing, let’s approximate the proton as being fixed in space, say at the origin, and let’s also assume a 1D atom.

The Coulomb force by the proton (particle 1) on the electron (particle 2) is given by \vec{f}_{12} = \dfrac{(e)(-e)}{r^2} \hat{r}_{12}. (I have rescaled the equations to make the constants disappear.) The potential energy of the electron due to the proton is given by: V_{12} = \dfrac{(e)(-e)}{r}.

The potential energy profile here is in the form of a well, vaguely looking like the letter “V”, that goes infinitely deep at the origin (at the proton’s position), and whose wings asymptotically approach zero at \pm \infty.
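The shape of this well is easy to tabulate. The following sketch uses the post's rescaled units (all constants set to 1, charges of +1 and -1); the function name is mine, purely for illustration.

```python
# Sketch of the 1D Coulomb potential-energy well, in the post's
# rescaled units: charges are +1 (proton) and -1 (electron), and
# all physical constants are set to 1, so V(r) = (+1)(-1)/|r|.

def V(r):
    """Potential energy of the electron at position r (proton at the origin)."""
    return (+1.0) * (-1.0) / abs(r)

# The well is negative everywhere, deepens without bound near r = 0,
# and its wings approach zero for large |r|:
samples = [V(r) for r in (0.5, 1.0, 2.0, 4.0)]
print(samples)  # values rise toward 0 as |r| grows
```

Note that the well is symmetric about the origin, exactly as the graph described above.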

If you draw a graph, the electron will occupy a point-position at one and only one point on the r-axis at any instant; it won’t go all over the space. Remember, the graph is for V, which is expressed using Coulomb’s classical law.


In QM, the measured position of the electron could be anywhere; it is given using Born’s rule on the wavefunction \Psi(r,t).

So, we have two notions of positions for the same existent: electron.

One is the classical point-position. We use this notion even in QM calculations, else we could not get to the V function. In the classical view, the electronic position can be variable; it can go over the entire infinite domain; but it must refer to one and only one point at any given instant.

The measured position of the electron refers to the \Psi, which is a function of all infinite space. The Schrodinger evolution occurs at all points of space at any instant. So, the electron’s measured position could be found at any location—anywhere in the infinite space. Once measured, the position comes down to a single point. But before measurement, \Psi sure is (nonuniformly) spread all over the infinite domain.

Schrodinger’s solution for the hydrogen atom uses \Psi as the unknown variable, and V as a known variable. Given the form of this equation, you have no choice but to take the entire graph of the potential energy (V) function into account.

Just a point-value of V at the instantaneous position of the classical electron simply won’t do—you couldn’t solve Schrodinger’s equation then.
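To see concretely why a point-value of V won't do, consider any standard numerical solution of the 1D time-independent Schrodinger equation: the Hamiltonian matrix consumes the entire V graph on its diagonal. The sketch below is my own illustration; the soft-core parameter (the +1.0 under the square root) is an assumption I add only to tame the singularity at the origin.

```python
import numpy as np

# Discretize (-1/2 d^2/dx^2 + V(x)) psi = E psi on a finite grid.
# The soft-core term "+ 1.0" is my own assumption, used only to
# avoid the 1/|x| singularity at the proton's position.

n, L = 400, 40.0
x = np.linspace(-L / 2, L / 2, n)
h = x[1] - x[0]

V = -1.0 / np.sqrt(x**2 + 1.0)       # the ENTIRE V graph is needed...

# Second-derivative operator by central differences.
D2 = (np.diag(np.full(n - 1, 1.0), -1)
      - 2.0 * np.eye(n)
      + np.diag(np.full(n - 1, 1.0), 1)) / h**2

H = -0.5 * D2 + np.diag(V)           # ...because it sits on the diagonal.

E = np.linalg.eigvalsh(H)
print(E[0])  # ground-state energy: negative, i.e., a bound state
```

Drop any single grid point of V and the matrix H is simply not defined; that is the sense in which the equation demands V as a field.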

If we have to bring the \Psi from its Platonic “heaven” to our 1D space, we have to treat the entire graph of V as a physically existing (infinitely spread) field. Only then could we possibly say that \Psi too is a 1D field. (Even if you don’t have this motivation, read on, anyway. You are bound to find something interesting.)

Now an issue arises.


The proton is fixed. So, the electron must be movable—else it is hard to think of a mechanism by which a point-particle could generate the whole V graph out of its local potential energies.

But if the electron is movable, there is a certain trouble regarding what kind of a kinematics we might ascribe to the electron so that it generates the whole V field required by the Schrodinger equation. Remember, V is the potential energy of the electron, not of the proton. By classical EM (used in the equation), V at any instant must be a point-property, not a field. But Schrodinger’s equation requires a field for V. So, the only imaginable solutions are weird: an infinitely fast electron running all over the domain but lawfully (i.e. following the laws at every definite point)—or something similarly weird.

So, the problem (how to explain the origin of the V field used in Schrodinger’s equation) still remains.


In the textbook treatment of EM (and I said EM, not QM), the proton does create its own force-field, which remains fixed in space (for a spatially fixed proton). The proton’s \vec{E} field is spread all over the infinite space, at any instant. So, why not exploit this fact?

The potential field of a proton (in volts) is denoted as V in EM texts. So, to avoid confusion with the potential energy function (in joules) of the electron, let’s denote the proton’s potential using the symbol P.

The potential field P does remain fixed and spread all over the space. But the trouble is this:

It is also positive everywhere. Its graph is not a well, it is a peak—infinitely tall peak at the proton’s position, asymptotically approaching zero at \pm \infty, and positive (above the zero-line) everywhere.

Therefore, you have to multiply this P field by the electron’s negative charge (-e) so that P turns into the required V field of the electron.
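Spelled out in the rescaled units used above, the objectionable multiplication is just this (the function names are mine, for illustration only):

```python
# The proton's own potential field P (rescaled units): positive everywhere.
def P(r):
    return (+1.0) / abs(r)

# The potential energy of a second particle of charge q is obtained only
# via a multiplication by q -- the step whose physical mechanism is at issue.
def potential_energy(q, r):
    return q * P(r)

print(potential_energy(-1.0, 2.0))  # electron: negative (a well)
print(potential_energy(+1.0, 2.0))  # another proton: positive (a peak)
```

The one and the same positive P field thus yields a well for the electron and a peak for a second proton, purely through the sign of q.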

But nature does no multiplications—not unless there is a definite physical mechanism to “convert” the quantities appropriately.

For multiplications with signed quantities, a mechanism like the mechanical lever could be handy. One small side goes down; the other big side goes up, but to a different extent; etc. Unfortunately, there is no place for a lever in the EM ontology—it’s all point charges and the “empty” space, which we now call the aether.

Now, if multiplication of constant magnitudes alone were to be a problem, we could have always taken care of it by suitably redefining P.

But the trouble caused by the differing sign still remains!

And that’s where the real trouble is. Let me show you how.


If a proton has to have its own P field, then its role has to stay the same regardless of the interactions that the proton enters into. Whether a given proton interacts with an electron (negatively charged), or with another proton (positively charged), the given proton’s own field still has to stay the same at all times, in any system—else it will not be its own field but one of interactions. It also has to remain positive by sign (even if P is rescaled to avoid multiplications).

But if V has to be negative when an electron interacts with it, and if V also has to be positive when another proton interacts with it, then a multiplication by sign, too, must occur. You just can’t avoid multiplications.

But there is no mechanism for the multiplications mandated by the sign conversions.

How do we resolve this issue?


Here is one way out that we might think of.

We say that a proton’s P field stays just the same (positive, fixed) at all times. However, when the second particle is positively charged, then it moves away from the proton; when the second particle is negatively charged, then it moves towards the proton.

So, the direction of the motion of the forced particle is not determined only by the field (which is always positive here), but also by the polarity of that particle itself. And, it’s a simple change, you might argue. There is some unknown physics to the charge, you could say, which propels it this way instead of that way, depending on its own sign.

Thus, charges of opposing polarities go in opposite directions while interacting with the same proton. That’s just how charges interact with fields. By definition. You could say that.

What could possibly be wrong with that view?

Well, the wrong thing is this:

If you imagine a classical point-particle of an electron as going towards the proton at a point, then a funny situation ensues while using it in QM.

The arrows depicting the force-field of the proton always point away from it—except for the one distinguished position, viz., that of the electron, where a single arrow would be found pointing towards the proton (following the above suggestion).

So, the action of the point-particle of the electron introduces an infinitely sharp discontinuity in the force-field of the proton, which then must also seep into its V field.

But a discontinuity like that is basically not compatible with Schrodinger’s equation. It will therefore lead to one of the following two consequences:

It might make the solution impossible or ill-defined. (I don’t know enough about maths to tell if this could be true). Or, as an alternative, if a solution is possible (including solutions that are asymptotic or approximate but good enough) then the presence of the discontinuity will sure have an impact on the solution. The calculated \Psi wouldn’t be the same as that for a V without the discontinuity.

Essentially, we have once again come back to a repercussion of the idea that the classical electron has a point position, but its potential energy field in the interaction with the proton is spread everywhere.

To fulfill our desire of having \Psi as a physical field (1D in our toy example; 3D in general), we have to have a certain kind of field for V. But V should not change its value at just one isolated place merely to allow a multiplication by -1, because doing so introduces a discontinuity. It should remain the same smooth V that we have always seen.


So, here is the problem statement, in general terms:

To find a physically realizable way such that: even if we use the classical EM properties of the electron while calculating V, and even if the electron is classically a point-particle, its V function (in joules) should still turn out to be negative everywhere—even if the proton has its own potential field (P, in volts) that is positive everywhere in the classical EM.

In short, we have to change the way we look at the physics of the EM fields, and then also make the required changes to any maths, as necessary. Without disturbing the routine calculations either in EM or in QM.

Can it be done? Well, I think “yes.”

While I had been having some vague sense of there being some “to be looked into” issue for quite some time (months, at least), it was only over the last week, especially over the last couple of days (since the publication of the last post), that this problem became really acute. I always used to skip over this ontology/physics issue and go directly to using the EM maths involved in the QM. I used to think that the ontology of such EM as is used in the QM would be pretty easy to explain—at least as compared to the ontology of QM. Looks like, despite spending thousands of words (some 4–5 posts with a total of maybe 15–20 K words), there still wasn’t enough clarity—about EM.

Fortunately, the problem did become clear. Clear enough that, I think, I have also found a satisfactory solution to it. Right today (on 2019.10.15 evening).

Would you like to give it a try? (I anyway need a break. So, take about a week’s time or so, if you wish.)


Bye for now, take care, and see you the next time.


A song I like:

(Hindi) “jaani o jaani”
Singer: Kishore Kumar
Music: Laxmikant-Pyarelal
Lyrics: Anand Bakshi


Ontologies in physics—6: A basic problem: How the mainstream QM views the variables in Schrodinger’s equation

1. Prologue:

From this post, at last, we begin tackling quantum mechanics! We will be covering those topics from the physics and maths of it which are absolutely necessary for developing our own ontological viewpoint.

We will first have a look at the most comprehensive version of the non-relativistic Schrodinger equation. (Our approach so far has addressed only the non-relativistic version of QM.)

We will then note a few points concerning the way the mainstream QM (MSQM) de facto approaches it—which is remarkably different from how engineers regard their partial differential equations.

In the process, we will come to isolate and pin down a basic issue concerning how the two variables \Psi and V from Schrodinger’s equation are to be seen.

We regard this issue as a problem to be resolved, and not as just an unfamiliar kind of maths that needs no further explanation or development.

OK. Let’s get going.


2. The N-particle Schrodinger’s equation:

Consider an isolated system having 3D infinite space in it. Introduce N charged particles (EC Objects in our ontological view) in it. (Anytime you take an arbitrary number of elementary charges, it’s helpful to think of them as being evenly spread between positive and negative polarities, because the net charge of the universe is zero.) All the particles are elementary charges. Thus, |q_i| = e for all the particles. We will not worry about any differences in their masses, for now.

Following the mainstream QM, we also imagine the existence of something in the system such that its effect is the availability of a potential energy V.

The multi-particle time-dependent Schrodinger equation now reads:

i\,\hbar \dfrac{\partial \Psi(\vec{R},t)}{\partial t} = - \dfrac{\hbar^2}{2m} \sum\limits_{i=1}^{N} \nabla_i^2 \Psi(\vec{R},t) + V(\vec{R},t)\Psi(\vec{R},t)

Here, \vec{R} denotes a set of particle positions, i.e., \vec{R} = \lbrace \vec{r}_1, \vec{r}_2, \vec{r}_3, \dots, \vec{r}_N \rbrace. The rest of the notation is standard.


3. The mainstream view of the wavefunction:

The mainstream QM (MSQM) says that the wavefunction \Psi(\vec{R},t) exists not in the physical 3-dimensional space, but in a much bigger, abstract, 3N-dimensional configuration space. What do they mean by this?

According to MSQM, a particle’s position is not definite until it is measured. Upon a measurement of the position, however, we do get a definite 3D point in the physical space for its position. This point could have been anywhere in the physical 3D space spanned by the system. However, the measurement process “selects” one and only one point for this particle, at random. … Repeat for all other particles. Notice, the measured positions are in the physical 3D space.

Suppose we measure the positions of all the particles in the system. (Actually, speaking in more general terms, the argument applies also to position variables before measurement concretizes them to certain values.)

Suppose we now associate the measured positions via the set \vec{R} = \lbrace \vec{r}_1, \vec{r}_2, \vec{r}_3, \dots, \vec{r}_N \rbrace, where each \vec{r}_i refers to a position in the physical 3D space.

We will not delve into the issue of what measurement means, right away. We will simply try to understand the form of the equation. There is a certain issue associated with its form, but it may not become immediately apparent, esp. if you come from an engineering background. So, let’s make sure to know what that issue is:

Following the mainstream QM, the meaning of the wavefunction \Psi is this: It is a complex-valued function defined over an abstract 3N-dimensional configuration space (which has 3 coordinates for each of the N number of particles).

The meaning of any function defined over an abstract 3ND configuration space is this:

If you take the set of all the particle positions \vec{R} and plug them into such a function, then it evaluates to some single number. In case of the wavefunction, this number happens to be a complex number, in general. (Remember, all real numbers anyway are complex numbers, but not vice-versa.) Using the C++ programming terms: if you take real-valued 3D positions, pack them in an STL vector of size N, and send the vector into the function as an argument, then it returns just one specific complex number.
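The same signature can be sketched in Python too; the Gaussian-times-phase form below is purely my own stand-in for a wavefunction. The point is only the calling convention: N position triples go in, one complex number comes out.

```python
import cmath

# A toy wavefunction over the 3N-dimensional configuration space:
# it takes a list of N positions (each a 3-tuple) and returns ONE
# complex number. The Gaussian-times-phase form is an arbitrary stand-in.
def psi(R):
    s = sum(x * x + y * y + z * z for (x, y, z) in R)
    phase = sum(x + y + z for (x, y, z) in R)
    return cmath.exp(-s) * cmath.exp(1j * phase)

R = [(0.1, 0.0, 0.0), (0.0, 0.2, 0.0)]   # N = 2 particles -> 6 coordinates
value = psi(R)
print(value)                  # a single complex number

# Varying any one Cartesian component changes the value:
R2 = [(0.1, 0.0, 0.0), (0.0, 0.2, 0.5)]
print(psi(R2) != value)       # a different complex number results
```

All 3N coordinates act at once; there is no way to evaluate this function from 3D data for one particle alone.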

All the input arguments (the N number of 3D positions) are necessary; all of them, taken at once, produce the value of the function—the single number. Vary any Cartesian component (x, y, or z) of any particle position, and \Psi will, in general, give you another complex number.

Since a 3D space can accommodate only 3 independent coordinates, but all 3N components are required to know a single \Psi value, \Psi can only be an abstract entity.

Got the argument?

Alright. What about the term V?


4. The mainstream view of V in the Schrodinger equation:

In the mainstream QM, the V term need not always have its origin in the electrostatic interactions of elementary point-charges.

It could be any arbitrary source that imparts a potential energy to the system. Thus, in the mainstream QM, the source of V could also be gravitational, magnetic, etc. Further, in the mainstream QM, V could be any arbitrary function; it doesn’t have to be singularly anchored into any kind of point-particles.

In the context of discussions of foundations of QM—of QM Ontology—we reject such an interpretation. We instead take the view that V arises only from the electrostatic interactions of charges. The following discussion is written from this viewpoint.

It turns out that, speaking in the most fundamental and general terms, and following the mainstream QM’s logic, the V function too must be seen as a function that “lives” in an abstract 3ND configuration space. Let’s try to understand a certain peculiarity of the electrostatic V function better.

Consider an electrostatic system of two point-charges. The potential energy of the system now depends on their separation: V = V(\vec{r}_2 - \vec{r}_1) \propto q_1q_2/|\vec{r}_2 - \vec{r}_1|. But a separation is not the same as a position.

For simplicity, assume unit positive charges in a 1D space, and the constant of proportionality also to be 1 in suitable units. Suppose now you keep \vec{r}_1 fixed, say at x = 0.0, and vary only \vec{r}_2, say to x = 1.0, 2.0, 3.0, \dots, then you will get a certain series of V values, 1.0, 0.5, 0.33\dots, \dots.

You might therefore be tempted to imagine a 1D function for V, because there is a clear-cut mapping here, being given by the ordered pairs of \vec{r}_2 \Rightarrow V values like: (1.0, 1.0), (2.0, 0.5), (3.0, 0.33\dots), \dots. So, it seems that V can be described as a function of \vec{r}_2.

But this conclusion would be wrong, because the first charge has been kept fixed all along in this procedure. However, its position can be varied too. If you now begin moving the first charge as well, then the same \vec{r}_2 value will give you different values for V. Thus, V can be defined only as a function of the separation vector \vec{s} = \vec{r}_2 - \vec{r}_1.
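The procedure just described is easy to replay numerically. In the same rescaled setting (unit positive charges, proportionality constant 1), the check below shows that the same \vec{r}_2 yields different V values once \vec{r}_1 moves, while equal separations always yield equal V.

```python
# Two unit positive charges on a 1D line, constant of proportionality 1:
def V(r1, r2):
    return 1.0 / abs(r2 - r1)

# Keeping r1 fixed at 0.0 reproduces the series from the text:
print([V(0.0, r2) for r2 in (1.0, 2.0, 3.0)])  # [1.0, 0.5, 0.333...]

# But the same r2 gives a different V once r1 moves:
print(V(0.0, 2.0), V(1.0, 2.0))    # different values for the same r2

# Equal separations give equal V, wherever the pair sits:
print(V(0.0, 2.0) == V(5.0, 7.0))  # True
```

This is the 1:\infty trouble in miniature: a position alone does not fix V, but a separation does.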

If there are more than two particles, i.e. in the general case, the multi-particle Schrodinger equation of N particles uses that form of V which has N(N-1) pairs of separation vectors forming its argument. Here we list some of them: \vec{r}_2 - \vec{r}_1, \vec{r}_3 - \vec{r}_1, \vec{r}_4 - \vec{r}_1, \dots, \vec{r}_1 - \vec{r}_2, \vec{r}_3 - \vec{r}_2, \vec{r}_4 - \vec{r}_2, \dots, \vec{r}_1 - \vec{r}_3, \vec{r}_2 - \vec{r}_3, \vec{r}_4 - \vec{r}_3, \dots, \dots. Using the index notation:

V = \dfrac{1}{2}\sum\limits_{i=1}^{N}\sum\limits_{j\neq i, j=1}^{N} V(\vec{s}_{ij}),

where \vec{s}_{ij} = \vec{r}_j - \vec{r}_i.

Of course, there is a certain redundancy here, because s_{ij} = |\vec{s}_{ij}| = |\vec{s}_{ji}| = s_{ji}. The electrostatic potential energy function depends only on s_{ij}, not on \vec{s}_{ij}. The general sum formula can be re-written in a form that avoids the double listing of the equivalent pairs of separation vectors, but it not only looks a bit more complicated but also makes it somewhat more difficult to understand the issues involved. So, we will continue using the simple form—one which generates all possible N(N-1) terms for the separation vectors.
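The double listing can be verified with a toy configuration. The little check below (my own illustration, with pair terms in the rescaled units) shows that the sum over all N(N-1) ordered pairs comes out to exactly twice the sum over the unordered pairs.

```python
from itertools import combinations

def pair_V(ri, rj, qi=1.0, qj=1.0):
    # Electrostatic pair term in rescaled units; depends only on |rj - ri|.
    return qi * qj / abs(rj - ri)

positions = [0.0, 1.0, 3.0]   # N = 3 point charges on a line
N = len(positions)

# The "simple form": all N(N-1) ordered pairs (i, j), j != i.
double_sum = sum(pair_V(positions[i], positions[j])
                 for i in range(N) for j in range(N) if j != i)

# The non-redundant form: each unordered pair listed once.
unique_sum = sum(pair_V(ri, rj) for ri, rj in combinations(positions, 2))

print(double_sum, unique_sum)   # the ordered sum is exactly twice the other
```

This factor of two is precisely the redundancy noted above: each s_{ij} appears once as \vec{s}_{ij} and once as \vec{s}_{ji}.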

If you try to embed this separation space in the physical 3D space, you will find that it cannot be done. You can’t associate a unique separation vector with each position vector in the physical space, because associated with any point-position there comes to be an infinity of separation vectors, all of which have to be associated with it. For instance, for the position vector \vec{r}_2, there is an infinity of separation vectors \vec{s} = \vec{a} - \vec{r}_2, where \vec{a} is an arbitrary point (standing in for the variable \vec{r}_1). Thus, the mapping from a specific position vector \vec{r}_2 to potential energy values becomes a 1:\infty mapping. Similarly for \vec{r}_1. That’s why V is not a function of the point-positions in the physical space.

Of course, V can still be seen as proper 1:1 mapping, i.e., as a proper function. But it is a function defined on the space formed by all possible separation vectors, not on the physical space.

Homework: Contrast this situation with that of a function of two space variables, e.g., F = F(\vec{x},\vec{y}). Explain why F is a function (i.e. a 1:1 mapping) defined on a space of position vectors, but V can be taken to be a function only if it is seen as being defined on a space of separation vectors. In other words, explain why the use of the separation-vector space makes V go from a 1:\infty mapping to a 1:1 mapping.


5. Wrapping up the problem statement:

If the above seems a quizzical way of looking at the phenomena, well, that precisely is how the multi-particle Schrodinger equation is formulated. Really. The wavefunction \Psi is defined on an abstract 3ND configuration space. Really. The potential energy function V is defined using the more abstract notion of the separation space(s). Really.

If you specify the position coordinates, then you obtain a single number each for the potential energy and the wavefunction. The mainstream QM essentially views them both as aspatial variables. They do capture something about the quantum system, but only as if they were some kind of quantities that applied at once to the global system. They do not have a physical existence in the 3D space—even if the position coordinates from the physical 3D space do determine them.

In contrast, following our new approach, we take the view that such a characterization of quantum mechanics cannot be accepted, certainly not on the grounds as flimsy as: “That’s just how the math of quantum mechanics is! And it works!!” The grounds are flimsy, even if a Nobel laureate or two might have informally uttered such words.

We believe that there is a problem here: In not being able to regard either \Psi or V as referring to some simple ontological entities existing in the physical 3D space.

So, our immediate problem statement becomes this:

To find some suitable quantities defined on the physical 3D space, and to use them in such a way, that our maths would turn out to be exactly the same as given for the mainstream quantum mechanics.


6. A preview of things to come: A bit about the strategy we adopt to solve this problem:

To solve this problem, we begin with what is easiest to us, namely, the simpler, classical-looking, V function. Most of the next post will remain concerned with understanding the V term from the viewpoint of the above-noted problem. Unfortunately, a repercussion would be that our discussion might end up looking a lot like an endless repetition of the issues already seen (and resolved) in the earlier posts from this series.

However, if you ever come to suspect that, I would advise you to set the doubt aside and read the next post when it comes. Though the terms and the equations might look exactly as what was noted earlier, the way they are rooted in the 3D reality and combined together is new. New enough that it directly shows a way to regard even the \Psi field as a physical 3D field.

Quantum physicists always warn you that achieving such a thing—a 3D space-based interpretation for the system-\Psi—is impossible. A certain working quantum physicist—an author of a textbook published abroad—had warned me that many people (including he himself) had tried it for years, but had not succeeded. Accordingly, he had drawn two conclusions (if I recall it right from my fallible memory): (i) It would be a very, very difficult problem, if not impossible. (ii) Therefore, he would be very skeptical if anyone makes the claim that he does have a 3D-based interpretation, that the QM \Psi “lives” in the same ordinary 3D space that we engineers routinely use.

Apparently, therefore, what you would be reading here in the subsequent posts would be something like a brand-new physics. (So, keep your doubts, but hang on nevertheless.)

If valid, our new approach would have brought the \Psi field from its 3N-dimensional Platonic “heaven” to the ordinary physical space of 3 dimensions.

“Bhageerath” (भगीरथ) [^] ? … Well, I don’t think in such terms. “Bhageerath” must have been an actual historical figure, but his deeds obviously have got shrouded in the subsequent mysticism and mythology. In any case, we don’t mean to invite any comparisons in terms of the scale of achievements. He could possibly serve as an inspiration—for the scale of efforts. But not as an object of comparison.

All in all, “Bhageerath”’s deeds were his, and they anyway lie in the distant—even hazy—past. Our understanding is our own, and we must expend our own efforts.

But yes, if found valid, our approach will have extended the state of the art concerning how to understand this theory. Reason good enough to hang around? You decide. For me, the motivation simply has been to understand quantum mechanics right; to develop a solid understanding of its basic nature.

Bye for now, take care, and sure join me the next time—which should be soon enough.


A song I like:

[The official music director here is SD. But I do definitely sense a touch of RD here. Just like for many songs from the movie “Aaraadhanaa”, “Guide”, “Prem-Pujari”, etc. Or, for that matter, music for most any one of the movies that the senior Burman composed during the late ’60s or early ’70s. … RD anyway was listed as an assistant for many of SD’s movies from those times.]

(Hindi) “aaj ko junali raat maa”
Music: S. D. Burman
Singer: Lata Mangeshkar, Mohammad Rafi
Lyrics: Majrooh Sultanpuri


History:
— First published 2019.10.13 14:10 IST.
— Corrected typos, deleted erroneous or ill-formed passages, and improved the wording on home-work (in section 4) on the same day, by 18:29 IST.
— Added the personal comment in the songs section on 2019.10.13 (same day) 22:42 IST.


A series of posts on a few series of tweets (by me) on (my research on foundations of) QM—1

0. Initial remarks:

OK. It’s been a little while since I wrote my last post here.

Actually, it so happened that for a while after my last post I didn’t find anything well suited for writing a blog-post. I was also busy studying topics from Data Science. It’s true that during this time I did make a few comments at others’ blogs, but these were pretty context-specific. I couldn’t easily think of making a (more general-purpose) post out of them.

At the same time, some of the things that I read on QM—whether in pop-sci books or at others’ blogs—did prompt me to note a few comments. These were very brief points. They were better fitting only as tweets—as side-remarks made in the passing. So, I tweeted them. My twitter page is here [^].

… I now realize that quite a few of such tweets (on QM) have got accumulated. So it’s high time that these occasional notings got moved here too, together with some explanation to go with them. That’s precisely what I am going to do now, in this series of posts.

Most of these points (from the tweets) refer to my Outline document on QM which was posted at iMechanica about 6 months ago [^]. The tweets wouldn’t make any sense to someone if he hasn’t thoroughly gone through this document first. So, I do assume this context here.

In fact, most of these tweets are rather direct implications of what I had already noted in the Outline document. These points (from the tweets) were quite clear to me even back then, when I wrote the document.

However, while writing that document, my purpose was, first and foremost, to state the most salient building blocks and points of the theory and to focus on the overall way in which they connect together. Thus, what I wanted to give, via that document, was a definitive sense of the overall framework—hopefully in a logically complete manner. I was in fact worried a bit that some parts of these complex considerations might slip out of my mind once again, as they had done in the past (before I wrote that document!). [In retrospect, I think that on this count, I did a pretty good job in the Outline document. I haven’t been able to think of a really essential part of the framework which I had in mind and which inadvertently got left out of it.]

Another reason I didn’t go into detailed implications right in that document was this: I also thought that anyone who knows the mainstream QM well, and also “gets” the logic given in my document well, would be able to very easily reach these further inferences completely on his own—for instance, my position on the wave-particle duality. So, I didn’t separately mention such points in that document, even if I knew that points like these would be of much greater interest to the layman. The Outline, although it looks very simple, was definitely not written for the layman. (I tried to keep the exposition as simple as possible, in part because I didn’t care to be seen as a respectable physicist anyway. All that I was concerned about was QM, and the new conceptual framework.)

So, all in all, it’s not an accident that I should be touching on many points like the wave-particle duality only later on, first via tweets! These really are only implications / consequences.

Anyway, here in this series we now go with these tweets of mine (made over the past month). While reproducing them here, I have expanded the short-forms or abbreviations, and have also added a few additional bits of content, just to get more streamlined sentences. Each tweet is then followed by some explanation, which very rapidly became very long—long enough that I couldn’t possibly compress all the QM-related material (tweets and my explanations of them) into a single post. So, I have no choice but to make a series of them!


1. Schemes for nonlinear QM proposed by others:

I tweeted on 12 July 2019 to this effect:

“‘Schrödinger eqn. revisited’ by Schleich et al. [(.PDF) ^]. Yes, it presents a nonlinearity. But no, it doesn’t even consider the physical fact that all the potentials in reality come about only from superpositions of the singular potentials of individual electrons and protons. See my Outline document.”

Indeed, what I said here applies to each and every nonlinearity-based argument (except for mine!) which has ever been offered by way of attempting a resolution to the riddles of QM—in particular, the measurement problem.

To quote from Ian Stewart’s book: “Does God Play Dice? (2/e)”, several people have proposed nonlinear theories, including:

“L. Diosi, N. Gisin, G. C. Ghirardi, R. Grassi, P. Pearle, A. Rimini, and I. Percival.”

I had very briefly gone through some of these proposals. Actually, I had mostly got to know about their proposals by reading descriptions and remarks on them as made by other commentators. However, at times, I also went rapidly browsing through some of their arXiv papers. I had come to the conclusion that what they were putting forth wasn’t anything like my ideas (later mentioned in the Outline document). To quote Stewart here,

“In all of these theories the interaction of a quantum system with its environment produces an irreversible change that turns the quantum state into an eigenstate. However, all of these theories are probabilistic: the initial quantum state undergoes a kind of random diffusion which ultimately leads to an eigenstate.” (ibid.)

To be honest, I am not sure whether all these proposals could be characterized as involving random diffusion or not. I don’t know these theories to the required level of detail to be able to confirm or deny Stewart’s characterization. However, there certainly is this element of an initial quantum state getting collapsed precisely to the measured eigenstate, which appears in all of them—and I don’t accept that idea in the first place (as explicitly put forth in my Outline document).

In a slightly different context, Stewart also notes:

“There is some interest among physicists in what they call `quantum chaos’, but quantum chaos is about the relation between non-chaotic quantum systems and chaotic classical approximations—not chaos as a mechanism for quantum indeterminacy.” (ibid.)

OK, this is one conclusion which I very distinctly remember I had reached on my own too. I guess this was in November 2018, when I had googled on “quantum chaos.” Subsequently, I re-checked the matter (just to be sure) in February ’19 (i.e., just days after posting my Outline document).

I agree that Stewart’s characterization is right on target here. IMHO, you don’t need to take the prior studies of “quantum chaos” very seriously if either the QM foundations or the very feasibility of the quantum computer are your concerns.


2. A bit on my PhD-time research:

I made a series of 4 tweets on 18 July 2019. The first two of these dealt with my old, PhD-time approach to photon propagation. Before coming to that approach to QM (which I will address in the next section), let me first clarify what other work I had performed during my PhD.

The first thing to note is that my work on QM had formed only a part (maybe a fourth or so) of all the studies and research I had done during my PhD.

The other parts of my PhD thesis were notably related to the studies of the classical second-order partial-differential equations, and their computational modeling using stochastic processes. The equations on which I thus focused my attention were: the Helmholtz equation, the diffusion equation, and the Poisson-Laplace equation. In addition, I had also picked up a study of elasticity, and had added a conjecture about the possible applicability of some random-walks type of processes for modeling the classical tensor fields (of stresses and strains as used in engineering). Let me go over all these topics in brief.

2.1 Work on the diffusion equation:

I think I have posted many entries at this blog about my work on the diffusion equation. So let me not digress into it all once again. Let me just note that I basically showed that, contrary to what post-graduate texts in maths (published by the AMS) say, the diffusion equation does not necessarily imply an instantaneous action at a distance (IAD).

The IAD in diffusion, I pointed out, was an outcome of the features of the solution theory (Fourier’s theory, and also of Einstein’s analysis of the Brownian movement). But IAD was not necessarily implied either by the local physics of diffusion phenomena, or by the partial differential equation that is the diffusion equation. [Here, remember, a differential equation always, invariably, necessarily, etc., is local in nature—it refers to an infinitesimal CV (control volume) or CM (control mass).]

In particular, I pointed out that the compactness of the support of the solution was the crucial issue here—whether the support was infinite (as in Fourier theory, and in the 2nd half of Einstein’s c. 1905 paper) or finite (as in any subdomain-based numerical method, or in the Brownian movement, i.e., the first half of the same paper by Einstein). In my view of the things, you can always transition from a collection of finite subdomains to an infinity of infinitesimal CVs that are still distributed over only a finite interval, via a suitable limiting process. The finite support, of course, could grow in extent with time.

These observations had never been made in about 200 years of the existence of Fourier’s theory. (Go ahead, hunt for the precedents!) You have to make this distinction between a (local) PDE and its (possibly global) solutions obtained after conducting integration operations, and in this entire process, you have to be careful about not elevating a mere ansatz or an integration method to the high pedestal of “the” (provably unique) solution. That’s in effect what I had argued.
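The finite-support point can be illustrated with a minimal random-walk sketch. (This is entirely my own toy illustration here, not anything from the thesis.) After any finite number of unit steps, no walker can lie outside a finite interval, even though that interval grows with time:

```python
import random

# Toy 1D random-walk model of diffusion. After n_steps unit steps, no
# walker can possibly lie outside [-n_steps, +n_steps]: the support of
# the empirical concentration profile is finite, and it grows with time.
# The Fourier-series (Gaussian) solution, by contrast, is nonzero
# everywhere for any t > 0 -- the IAD feature discussed above.

def diffuse(n_walkers, n_steps, seed=0):
    rng = random.Random(seed)
    positions = []
    for _ in range(n_walkers):
        x = 0
        for _ in range(n_steps):
            x += rng.choice((-1, 1))  # one unit step, left or right
        positions.append(x)
    return positions

pos = diffuse(n_walkers=10_000, n_steps=50)
assert max(abs(x) for x in pos) <= 50   # finite support, always
```

Of course, the Gaussian is recovered only in the limit of vanishing step size; at any finite time the walkers occupy a strictly finite interval.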

2.2 Work on the diffraction phenomenon (Huygens-Fresnel theory):

I also had a neat (though smallish) result concerning the obliquity factor in diffraction. I went through Huygens’, Fresnel’s and Kirchhoff’s analyses of the diffraction phenomenon (involving the Helmholtz equation—i.e., the spatial part of the wave PDE), and then pointed out the reasons why the obliquity factor could not be regarded as an essential characteristic of the diffraction phenomenon itself.

Once again, the obliquity factor turned out to be a feature of how the analysis—specifically, the integration operations—had been set up. It was a feature of the mathematical solution procedure adopted for this problem. In diffraction, there was no fundamental physical process which operated in an anisotropic way, compelling the wavefield to have a greater amplitude in the forward direction and zero in the backward direction.

However, explanations for some 187 years (since Fresnel’s work) had characterized diffraction as an inherently anisotropic phenomenon. Yes, right up to my old copy of Resnick & Halliday. There was a surprise in it for me, because while Fresnel was just a railroad engineer who had taught himself maths, Kirchhoff surely was a master of PDEs and their integration techniques. But this fact had still escaped even Kirchhoff.

I pointed out how, even if you do keep the Huygens wavelets isotropic, then given the geometry of their interaction with the surfaces where the BCs are applied, you would still end up with the same amplitudes as those obtained by Fresnel’s or Kirchhoff’s analyses.

Come to think of it, you could even pick up this line of argument and apply it to any analysis that seeks to derive an expression for a field inside a finite domain by appeal to a pair of forward- and backward-going processes occurring within that domain; e.g., an analysis involving the advanced and retarded waves, or the transactional waves in certain interpretations of QM, etc. You just have to be careful about what BCs and integrals are being set up and how the integration processes are being conducted, that’s all!

2.3 Computational modeling of transient heat conduction:

I then tried to apply the random walks-based approach (RW) to model transients in heat conduction as they occur in a moving-boundary problem, viz., a melting snowman. Since my focus was on conduction, I grossly simplified all the other aspects of this problem. (Having just come out of an illness, I would get easily tired back then.)

Consider a snowman in the form of a vertical right-circular cylinder which is placed on a relatively large block of ice below. The snowman absorbs heat from the atmosphere by radiation and convection at its external surfaces. The absorbed thermal energy then flows through the volume of the cylinder to the relatively large block of ice underneath (which was regarded as infinitely large in the simulations). The temperature gradients of course come to exist. The heat in the atmosphere brings the external surface to the melting point of ice even as the interior portions remain below it. So, the surface melts—phase-transition ensures a constancy of temperature at the surface. The melting is more pronounced at the sharp corners. The resulting water gradually slips down, forming a thin and continuous layer on the external surface. (I ignored the fluid flow in my simulation.) All in all, the sharp cylindrical snowman slowly acquires a thumb-like shape over a period of time, and then still continues to shrink down in size.

I first tried to apply RW for heat conduction in this scenario, but soon found that there was a great deal of noise due to randomness. So, I set up a “conversion” from the particles-based approach of RWs to a local, continuum-based approach, thus ending up with a description which was essentially equivalent to a cellular automata-based one. I then performed the simulations with this CA-based approach (in 3D), compared the changing external contours of the melting snowman with an actual experiment (done at home, for less than Rs. 200/- as the total cost—for thermocouple wires, basically), and presented a paper at an international conference.
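The kind of local, cell-based update rule that such a CA-based description involves can be sketched as follows. This is just a minimal, hypothetical 1D illustration (the standard explicit finite-difference scheme), not the actual 3D thesis code, and the numbers are made up:

```python
# A minimal, hypothetical 1D sketch of the kind of local, cell-based
# update rule a CA-based treatment of transient heat conduction uses:
# each cell exchanges heat only with its immediate neighbours. (The
# actual thesis work was in 3D, with melting; none of that is here.)

def step(T, r):
    """One explicit time step; r = alpha*dt/dx**2 must be <= 0.5 for stability."""
    new = T[:]
    for i in range(1, len(T) - 1):
        new[i] = T[i] + r * (T[i-1] - 2.0*T[i] + T[i+1])
    return new  # the two boundary cells are held fixed (Dirichlet BCs)

# Example: a bar initially at 0 deg, both ends held at 100 deg (say, at
# the melting point); the interior gradually warms towards steady state.
T = [100.0] + [0.0] * 9 + [100.0]
for _ in range(200):
    T = step(T, r=0.25)
```

The point to note is the locality: each cell’s new value depends only on its own and its neighbours’ current values, which is what makes the description equivalent to a cellular automaton.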

This piece of work added the necessary component of “engineering” and “experimentation” to my thesis. While my guide was always happy with my progress, he was also a bit worried that the examiners might look at my thesis and conclude that it was all a useless piece of theoretical, almost “scientific” work—that it had little “practical” component to it, and so couldn’t qualify for a degree in engineering. So, he was quite relieved when I discussed this idea of the snowman with him—he immediately gave me a go-ahead!

2.4 Conjecture for using RWs for modeling tensor fields:

Then, in addition, I also had this conjecture regarding the feasibility of random walks for simulating tensor fields. Since I haven’t spoken at length about it here at this blog, let me note something here.

There were certain rigorous mathematical arguments (coming from Ivy League professors of mechanics as well as from seemingly competent but obscure Russian authors) which had purportedly shown that stochastic processes like random walks could provably not be used for simulating the stress/strain fields.

Yet, I was confident of my conjecture, out of some basic considerations which I had in mind. So I gave a conference presentation on it (in an international conference on mathematics), and also included it in my thesis.

Much later on (after my PhD defence), I grew further confident that this conjecture should definitely come to hold; that it could be proved. That is to say, the earlier (intricate) proofs by reputed mechanicians / mathematicians could be shown to have holes in them. (Not that my argument was flawless either. A professor had spotted a weak link in my argument at that conference, and had brought it to my attention in a most gentle, indirect manner.)

Then, some time still later on, I ran into some “simple” but directly useful work by a young Chinese author (perhaps a PhD student). If I remember it right, he had published this paper while working in China itself. His work was similar to an intermediate step I had in mind, but it was much more complete, even neat. No, he was not concerned with the random walks as such. All that he did was to give a working model for constructing stress/strain fields, by starting with a finite 3D unit cell having an internal structure of a truss and treating it as if it were a finite approximation for an infinitesimal CV of the continuum. I had somewhat similar ideas, and had in fact inserted a couple of screen-shots of the truss-based simulations I had conducted for a preliminary study. But he had gone much further. If I recall his paper right, he had even arrived at the right values for the truss-related parameters (like stiffnesses of the members) if this unit cell was to converge to the continuum equations of elasticity in the limit of vanishing size.

Now, by regarding the process of re-distribution of forces along the truss members as an abstract flow, and by randomizing it (discretizing it in the process), it should be easily possible to arrive at a proof of my conjecture, and also at a neat computational simulation. Of course, the issue is not as simple as it looks on the surface. Free surfaces in a multiply-connected domain pose a tricky issue: they deform freely, and so, uniqueness becomes tricky to handle. Even then, with sufficient care (or an appeal to ideas from CoV), I am sure it can be done.
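The “finite unit cell converging to the continuum” idea itself can be illustrated with a trivial 1D sketch. (This is my own toy example; the actual work discussed above involved 3D truss cells, which are not reproduced here.) A bar modelled as n springs in series recovers the continuum deflection exactly, provided each member’s stiffness is scaled with n:

```python
# Hypothetical 1D illustration of the "unit cell -> continuum" limiting
# process: a bar of length L, modulus E, area A, modelled as n springs
# in series. Scaling each member's stiffness as k = n*E*A/L makes the
# end deflection under a tip load F independent of n, and equal to the
# continuum answer F*L/(E*A), for every n.

def tip_deflection(F, E, A, L, n):
    k_member = n * E * A / L                     # stiffness of one unit cell
    return sum(F / k_member for _ in range(n))   # n springs in series

E, A, L, F = 200e9, 1e-4, 2.0, 1000.0
exact = F * L / (E * A)
for n in (1, 10, 1000):
    assert abs(tip_deflection(F, E, A, L, n) - exact) < 1e-9 * exact
```

In 3D, of course, fixing the member stiffnesses so that the cell converges to the equations of elasticity is the nontrivial part, which is what (as I recall) the Chinese author’s paper had accomplished.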

OK. I will do it some other time in future! (This has been a TBD paper on my list for almost a decade or so by now; I simply don’t run into suitable ME/MTech students for me to guide on this topic! … Anyway, this blog is in copyright, just in case you didn’t notice it…)


3. My PhD-time work on QM (photon propagation):

Alright, finally we come to my PhD-time work on photon propagation. In a series of tweets, I said (on 18 July 2019):

“1/4. My old (PhD-time) approach, then called “new approach” and also as FAQ (Fields As Quanta): I’ve abandoned it; the one in the Outline document replaces it completely. FAQ anyway dealt with only the propagation of only the photons, not their generation or absorption (i.e. it didn’t deal with the creation/annihilation operators). FAQ didn’t deal with the propagation of other particles, viz., electrons, protons, or neutrons either.”

and

“2/4. FAQ still remains valid as an abstract description, as referring to the propagation characteristics of photons in the limit that the medium is continuous (i.e., it is homogenized from discrete and dispersed atomic nuclei), i.e., if the propagation dynamics is diffusive, not ballistic.”

About this second tweet, I subsequently had second thoughts soon after, and so I noted, right on the next day (on 19 July 2019) the following comment (a reply) to it:

“Umm… I am not sure precisely what all considerations should enter into taking the limits (for arriving at the propagation characteristics of photons as conceptualized in my older, PhD-time, approach). Would have to work through how the Schrodinger formalism (and hence my new approach) goes from \Psi and photons to the classical, dynamical EM fields. To be done in future. But yes, FAQ dynamics *was* diffusive, that’s for certain.”

Thus, I first said that FAQ still remains valid, when seen as an abstract description. However, just one day later, I also pointed out the more basic and possibly tricky issues there might be—viz., finding the right kind of limiting processes which start from the Schrodinger formalism and end up at Maxwell’s equations.

I feel confident that people must have thrashed out this topic (TDSE \Rightarrow EM) a long time ago. It’s just that I myself have never studied the topic so far (in fact I haven’t even done the literature search on it), and so, I don’t have a good idea about what all technical issues might get involved in it.

Thus, I will have to first study this topic (going from the mainstream QM to EM). Only then would I be able to understand the mapping well enough to understand the Hertzian waves right in the QM settings. It’s only after this stage that I will be able to say something definitive about the manner in which FAQ can really hold, and if yes, how well. Worrying about the right kind of a limiting procedure would be just a part of it, but an important one. … So yes, you can take these particular tweets with a pinch of salt.


4. How did I get to my old PhD-time approach for photons (i.e. FAQ), in the first place?

OK. Now that we are at it, here is a question that might have arisen in your mind: If I didn’t know QM well back then (during my PhD-studies days), then how could I dare propose this approach (viz. FAQ) so confidently?

Ummm… Let’s leave the daring and the confidence parts aside for now. Let’s focus on the “how” of it—how I got to my ideas. This part is much more interesting. At least to me.

How precisely did I end up at the idea of FAQ?

Well, I began with a kind of a “correspondence principle” (not in the Copenhagen sense of the term; read on). Briefly, the “correspondence” which I had in mind was the fact that single photons one-at-a-time mark only isolated dots on the CCD surface, but in the large-flux situations, their density pattern converges to the continuum interference pattern as described by Young.
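This dots-to-fringes correspondence can be illustrated with a minimal Monte Carlo sketch. (This is entirely my own toy model inserted here; the geometry and all the numbers are illustrative, not from the original work.) Photon landing positions are drawn, one at a time, from the two-slit intensity; individually they are isolated random dots, but their density builds up into the Young pattern:

```python
import cmath, math, random

# Toy two-slit setup: slits separated by d, screen at distance L,
# wavenumber k. Landing positions are drawn (rejection sampling) from
# the classical intensity |exp(i k r1) + exp(i k r2)|^2, which lies
# in [0, 4]. All parameter values here are made up for illustration.

def intensity(x, k=200.0, d=0.1, L=1.0):
    r1 = math.hypot(L, x - d / 2)
    r2 = math.hypot(L, x + d / 2)
    amp = cmath.exp(1j * k * r1) + cmath.exp(1j * k * r2)
    return abs(amp) ** 2

def sample_dots(n, rng):
    dots = []
    while len(dots) < n:
        x = rng.uniform(-0.5, 0.5)
        if rng.uniform(0.0, 4.0) < intensity(x):
            dots.append(x)           # one more isolated "dot" on the CCD
    return dots

rng = random.Random(1)
dots = sample_dots(20_000, rng)      # many dots -> Young's fringe pattern
```

Each accepted `x` is a single random dot; a histogram of `dots` shows the bright and dark fringes emerging only in the large-flux limit.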

So, I imagined a point-source emitting photons. Mind you, photons for me were, back then, spatially discrete particles of light, a la Einstein and Feynman—both their ideas had held a tremendous sway over my thinking back then.

I then imagined an ideal absorber in the form of a spherical surface kept at some distance from the source, somewhat like your usual Gaussian surface from electrostatics, but the difference here was that while the Gaussian surface is imaginary and allows anything to move through it freely, here, it was an actual absorber, albeit imaginary. This spherical surface was centered on the same point source. I asked myself what kind of variations in density should light show, in the continuum description, on this concentric spherical surface if its radius was varied a bit. In essence, I was developing my logic by starting from Gauss’ theorem and the Poisson-Laplace equation.

I then transitioned, in my ideas, to the Helmholtz equation by imagining a time-steady waviness to the field. Now, if the radius of the sphere were constrained to be an integral multiple of the spatial period (i.e. wavelength) of light, then the total quantity of photons being absorbed at the spherical surface should remain the same for a sphere of any such radius. The only rationale which could justify this assumption was: to have a conservation principle in place, by asserting that photons are conserved while they are still in transit through the empty space (i.e. before they get absorbed on the spherical surface). Again, remember, I was using the idea of photons as if they were spatially discrete particles, like grains of mustard seed.

Conservation principles are neat; this I had learnt mostly from the ample evidence I had found for them in the engineering sciences. (Even if I were to know about Noether’s theorem, I would have disregarded it—such was, and still is, my temperament. I think that this theorem is merely a reformulation of a very narrow range of physics—one that is restricted to merely 2nd-order linear PDEs. Anyway, read on…)

If the photon number conservation was to be had in theory (during propagation) at integral multiples of \lambda for the radius of the sphere, then was there any sound reason to give up conservation when the radius was (n+1/2)\lambda? (Here I am assuming that at zero radius, the light has the maximum amplitude.) Couldn’t we explain the complete darkness at these odd radii by positing that the photon was still there—it’s just that the sphere of that particular radius didn’t absorb it? After all, we could always posit a variable called the absorption fraction which would be related to the local amplitude of the spatial wave, right? That’s how I decided to conserve the photon number, and thereby to shift the burden of explaining the variable levels of brightness at the absorber onto a photon-absorption process that varied in efficiency precisely in response to the local wave amplitude associated with the tiny grain that was the photon. (I regarded this grain as a localized condition in the luminiferous aether.)
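A toy parametrization makes the scheme concrete. (This is my own reading of the idea, sketched after the fact; the specific amplitude profile below is just one simple choice consistent with the bright and dark radii described above, not the original maths.)

```python
import math

# Photons are conserved in transit; what varies with the sphere's
# radius r is only the "absorption fraction", taken proportional to
# the square of the local wave amplitude. With the maximum amplitude
# at r = 0, the illustrative profile cos(pi*r/lam) gives full
# absorption at r = n*lam and complete darkness at r = (n + 1/2)*lam.

def absorption_fraction(r, lam):
    return math.cos(math.pi * r / lam) ** 2

lam = 1.0
assert abs(absorption_fraction(3.0 * lam, lam) - 1.0) < 1e-12  # bright radii
assert absorption_fraction(2.5 * lam, lam) < 1e-12             # dark radii
```

The photon count is thus conserved everywhere in transit; only the efficiency of the absorption event varies with the local amplitude.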

Now, the next question was: If the photons had a ballistic dynamics (i.e. a straight-line motion), then the point on the spherical surface where a given photon eventually would land, would have already been determined right at the source point—some internal processes in the emitter material would be responsible for ejecting it at random orientations, which would also determine its landing location. (Dear Bohmians, do you see something familiar? However, please note, this was entirely my own thinking. I had not come across Bohm back then. Please read on.)

I thought that while this was possible, it was also possible that the photons could also undergo random-walks. How did I introduce random walks?

Well, the direct experimental evidence showed that this propagation problem had two essential features: (i) many discrete spots which go in a limit to a continuous pattern of finite densities, and (ii) random locations on the absorber surface where the grainy photons land, i.e., no correlation between the two points where any two successive photons get absorbed.

Since the continuum viewpoint of light (Young’s waves) had to be reached in the limit, it was important to keep it in mind always. It was here that I happened to recall Huygens’ principle. I was also quite at home with the idea of randomly intersecting a 3D surface with a linear probe—I had already studied stereology at the University of Alabama at Birmingham (UAB).

Huygens’ principle involved every point of space as if it were some kind of a “source” for the new (Huygens’) wavelets. The Young pattern could be obtained by superposing all the Huygens’ wavelets. The discrete spots could be had by dividing the surface of the Huygens wavelets and taking the individual surface patches to vanishing size (a la mesh refinement). This satisfactorily addressed the first essential feature noted above (viz. discrete spots). As to the second feature (randomness) it could also be satisfied by randomizing the selection of the spherical patch on the Huygens’ wavelet (a la stereology).

This much of the work I had in fact already completed while still at UAB, completely on my own, though I had never shared the idea with anyone. I guess it was already over before 1992 came to an end.

More than a decade later, now in Pune: starting with Gauss’ theorem, touching on the Huygens process and stereology, and now also throwing in the vector addition rules for ensuring that the right phases appear throughout the propagation (so that the local amplitudes also come out right in the large-flux situation), I could get to my diffusive dynamics for the spatially discrete photons.

I did suspect that this procedure (of randomizing the selection of a point on any of Huygens’ wavelets) meant that the photons would have to be imagined either as (i) getting scattered everywhere during their propagation, or (ii) possibly getting annihilated after travelling even just an infinitesimal distance in empty space, and then, somehow, also getting re-created  (the time lag between the annihilation and the subsequent creation being zero), effectively satisfying the conservation principle. On either count, the photon would keep changing its directions randomly, because the point on the surface of the Huygens wavelet was randomized.

Of course, I could not figure out a good physical reason for such a process.

Scattering of one photon by other photons seemed implausible—though I couldn’t figure out any particular reason why it would be implausible. Anyway, reliance on scattering led to an impossible situation when there was only one photon inside the interference chamber.

There also was no proper physicist who would even so much as be willing to just listen to me. (I tried 15–20 of them.) On the other hand, so many leading ones among them were offering descriptions of QM in terms of a random “quantum foam/froth” which produces and annihilates any particles, even massive ones, anywhere and at any random time, even in empty space. So, I thought that my idea of continuous disappearance and re-appearance, but in a different direction, would not be found too odd.

(Discussion of the foundations of QM has improved by leaps and bounds since engineers started taking an interest in building QC. In fact, recently, a somewhat similar remark also came from Dr. Sabine Hossenfelder on her blog. But I am talking of those days—around 2005.)

Of course, I myself didn’t have even an iota of a physical understanding regarding such virtual annihilation/creation pairs for photons. But they were necessary in my scheme, because I had randomized not the source point but the Huygens surface. So, rather than going full wacko (as most any physicist in my situation would), I did what any graduate student of engineering would do: I simply refrained from mentioning any such implications for a possible physics of it, and instead chose to phrase my description of the process in terms which relied heavily on the well-established, well-reputed, classical principle of Huygens’.

No one ever asked any questions on this part either. Neither at the conference, nor at the PhD defence, nor even after sharing my papers with physicists (some of whom had on their own requested my papers). So, it kindaa went through!

Phewww…. All the hoops that a hapless PhD student has to jump through, just to get to his degree! (In my case, it was even worse: these were the closed surfaces of the Huygens wavelets, not mere closed curves as in the hoops.)

So, that’s how I had arrived at my PhD-time approach. I did it by randomizing the spherical surfaces employed in the Huygens’ process, and by imagining a spatially discrete particle of the photon at all such locations at each one of the subsequent instants. The movement of the photon, as it goes on cutting the respective surfaces of the freshly generated series of Huygens’ wavelets, with the cutting randomized, obviously forms the simplest kind of a Wiener process—it’s the direct counterpart of random walks, but for wave-fields.

People right from Ulam et al. had proposed and used random walks (aka Monte Carlo) for diffusive and potential fields for 50+ years. However, no one had added just some more calculations with the wave- and displacement-vectors to account for the phases, and thereby generalized random walks to handle wavefields too. That was another neat thing to know. (Yes, please, do go ahead! Do hunt for the precedents!!)

Anyway, that’s how the FAQ dynamics came to be diffusive.

And all said and done, it did come to reproduce a seemingly same kind of a transition from a pattern of random dots to the Young interference pattern as experiments had shown!

One final point. But why did I disregard the ballistic dynamics—which would have all randomness concentrated only in the source and let photons fly straight? Yes, come to think of it, if you do assume a spatially discrete nature for the photon, then there is obviously no good reason to deny such a possibility.

Here, I am not sure, because I don’t remember having written down any note on it. So it’s kindaa hard to tell now, from a distance of years. I will try to reconstruct some possible considerations, starting from some indirect points, and purely from memory.

I seem to recall being apprehensive that what I called “size effects” might come into the picture and make this approach unsound. I mean to say, a perfectly uniform randomness (distributed over the entire emitter surface) was hard to imagine as the emitter surface became ever smaller and reached the natural limit of a single atom. For one thing, the emitted quantity might get affected, I thought. Secondly, single atoms, acting as emitters, had to have some directionality to their emissions, because their orbitals [whatever that meant—I didn’t have a good idea about them back then] aren’t always spherically symmetric. I think I had considered this point.

Did I consider the delayed-choice kind of considerations? I think I did, but in some simple, indirect ways, not very carefully or systematically. I mean to say, I don’t remember going through write-ups on the delayed-choice experiments at all, and then taking any decision. I rather remember thinking in terms like a camera shutter suddenly coming in the way of a photon when it’s still in mid-flight, and all. If the shutter were to be a perfect sink (one that didn’t re-emit the photon), or if it were to re-emit photons from a different location on the shutter surface (after the internal energy underwent some unpredictable oscillations within the shutter material), then it would adversely affect the final pattern on the screen, I had thought. The real-time changes for the propagating photon might get better handled by distributing the randomness over the entire spatial region of the chamber, I had thought.

But I think that all in all, it wasn’t any such careful consideration. I chose the randomized Huygens’ process because I thought it gave good enough an explanation.

In the final analysis, there are too many problems with this entire approach—with just a spatially discrete photon anyway, and all the more so if it comes embedded in a description that has no IAD anywhere in it. Some or the other part of QM will then have to keep getting violated. You just can’t avoid it. So, the best way to understand QM is not to begin with photons but with electrons—and with the Schrodinger formalism. The measurement problem is the only remaining issue then.


5. Homework for the skeptics among you:

Go through my PhD abstract posted at iMechanica even before the defence [(.PDF) ^], and check out if what I wrote above, purely on the fly and purely from memory, matches with what I had officially reported back then, or not. If you find serious discrepancies, please bring them to my notice. Thanks in advance.


Of course, now that I’ve completely abandoned the grainy description of photons as the actual physical reality, all the above doesn’t much matter. FAQ, even if valid, would have to be taken as only a higher-level, abstract description of an entirely different kind of a mechanism.

So, let’s leave this entire PhD-time approach right behind us (forever), and continue with the next tweets in this series. They directly deal with aspects of my latest approach (as in the Outline document)… However, I will pick them up in the next post. This one is almost 5900 words already! Give me a break of at least 10–15 days. Until then, take care and goodbye.


A song I like:

(Marathi) “ambaraatalyaa niLyaa ghanaachee”
Singer: Ramdas Kamat
Music and Lyrics: Veena Chitako

 

Why is the research on the foundations of QM necessary?

Why is the research on the foundations of QM necessary? … This post is meant to hold together some useful links touching on various aspects of this question.


Bob Doyle

He has interests in philosophy, but has a PhD in astrophysics from Harvard. He maintains not just an isolated page on the measurement problem, but a whole compendium of them, which together touch on all the issues related to QM—and these form just a part of his Web site, which also deals with many issues from philosophy proper, like free will, mind, knowledge, values, etc. Added attraction: he also keeps papers of historical relevance (like Schrodinger’s paper on quantum jumps, for instance).

His page on the measurement problem is very fascinating. He mentions all the relevant issues (including giving links to the topics), and summarizes all the important positions in a very accurate manner (quoting passages from historically important papers). You are bound to get just the right kind of a perspective on this problem if you refer to this page and (what is easy to state) “all the references therein”! Here is the page: [^] (which I had noted in my Twitter feed on 25 August 2019).

[This section added on 2019.09.18 07:43 IST]


Sabine Hossenfelder:

See her blog post: “Good Problems in the Foundations of Physics” [^]. Go through the entirety of the first half of the post, and then make sure to check out the paragraph titled “The Measurement Problem” from her list.

Not to be missed: Do check out the comment by Peter Shor, here [^], and Hossenfelder’s reply to it, here [^]. … If you are familiar with the outline of my new approach [^], then it would be very easy to see why I must have instantaneously found her answer to be so absolutely wonderful! … Being a reply to a comment, she must have written it much on the fly. Even then, she not only correctly points out the fact that the measurement process must be nonlinear in nature, she also mentions that you have to give a “bottom-up” model for the Instrument. …Wow! Simply, wow!!

Update (2019.09.18 07:43 IST): Also see a post she wrote a few months later: “The Problem with Quantum Measurements”, [^]. It generated 450 comments, but not many were too inspiring!


Lee Smolin:

Here is one of the most lucid and essence-capturing accounts concerning this topic that I have ever run into [^]. Smolin wrote it in response to the Edge Question, 2013 edition. It wonderfully captures the very essence of the confusions which were created and / or faced by all the leading mainstream physicists of the past—the confusions which none of them could get rid of—with the list including even such Nobel-laureates as Bohr, Einstein, Heisenberg, Pauli, de Broglie, Schrodinger, Dirac, and others. [Yes, in case you read the names too rapidly: this list does include Einstein too!]


Sean Carroll:

He explains at his blog how a lack of good answers on the foundational issues in QM leads to “the most embarrassing graph in modern physics” [^]. This post was further discussed in several other posts in the blogosphere. The survey paper which prompted Carroll’s post can be found at arXiv, here [^]. Check out the concept maps given in the paper, too. Phillip Ball’s coverage in the Nature News of this same paper can be found here [^].


Adrian Kent:

See his pop-sci level paper “Quantum theory’s reality problem,” at arXiv [^]. He originally wrote it for Aeon in 2014, and then revised it in 2018 while posting at arXiv. Also notable is his c. 2000 paper: “Night thoughts of a quantum physicist,” Phil. Trans. R. Soc. Lond. A, vol. 358, 75–87. As to the fifth section (“Postscript”) of this second paper, I am fully confident that no one would have to wait either until the year 2999, or for any one of those imagined extraterrestrial colleagues to arrive on the scene. Further, I am also fully confident that no mechanical “colleagues” are ever going to be around.

[Added on 2019.05.05 15:41 IST]


…What Else?:

What else but the Wiki!… See here [^], and then, also here [^].


OK. This all should make for an adequate response, at least for the time being, to those physicists (or physics professors) who tend to think that the foundational issues do not make for “real” physics, that they are non-issues. … However, for obvious reasons, this post will also remain permanently under updates…

Revision History:

2019.04.15: First published
2019.04.16: Some editing/streamlining
2019.05.05: Added the paper by Prof. Kent.
2019.09.18: Added the section on Bob Doyle. Added a recent post by Sabine Hossenfelder.

 

The self-field, and the objectivity of the classical electrostatic potentials: my analysis

This blog post continues from my last post, and has become overdue by now. I had promised to give my answers to the questions raised last time. Without attempting to explain too much, let me jot down the answers.


1. The rule of omitting the self-field:

This rule arises in electrostatic interactions basically because the Coulombic field has a spherical symmetry. The same rule would also work out in any field that has a spherical symmetry—not just the inverse-separation fields, and not necessarily only the singular potentials, though Coulombic potentials do show both these latter properties too.

It is helpful here to think in terms of not potentials but of forces.

Draw any arbitrary curve. Then, hold one end of the curve fixed at the origin, and sweep the curve through all possible angles around it, to get a 3D field. This 3D field has a spherical symmetry, too. Hence, gradients at the same radial distance on opposite sides of the origin are always equal and opposite.

Now, you know that the negative gradient of a potential gives you a force. Since for any spherically symmetric potential the gradients at equal radial distances are equal and opposite, they cancel out. So, the forces cancel out too.

Realize here that in calculating the force exerted by a potential field on a point-particle (say an electron), the force cannot be calculated in reference to just one point. The very definition of the gradient refers to two different points in space, even if they are only infinitesimally separated. So, the proper procedure is to start with a small sphere centered on the given electron, calculate the gradients of the potential field at all points on the surface of this sphere, sum the forces which these gradients exert on the domain contained inside the spherical surface, and then take the sphere to the limit of vanishing size. The sum of the forces thus obtained is the net force acting on that point-particle.

In the case of the Coulombic potentials, the forces thus calculated on the surface of any sphere (centered on that particle) turn out to sum to zero. This fact holds true for spheres of all radii. It is true that the gradients (and the forces) progressively increase as the size of the sphere decreases—in fact, they increase without bound for singular potentials. However, the aforementioned cancellation holds true at every stage of the limiting process. Hence, it holds true for the entirety of the self-field.
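This cancellation is easy to verify numerically. Here is a minimal sketch of my own (in rescaled units, with a repulsive potential V = 1/r for a unit charge; these choices are mine, made only for illustration): sample points on a small sphere around the charge in antipodal pairs; each sampled force is enormous in magnitude, yet the vector sum comes out to zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def force_on_test_charge(p, center):
    # Negative gradient of the rescaled potential V = 1/r, i.e. the
    # repulsive force r_vec / r**3 on a unit test charge placed at p.
    r_vec = p - center
    r = np.linalg.norm(r_vec)
    return r_vec / r**3

center = np.zeros(3)
R = 0.01  # a small sphere around the charge: each force has magnitude 1/R**2

# Sample directions in antipodal pairs, mirroring the spherical symmetry.
dirs = rng.normal(size=(500, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
points = np.concatenate([center + R * dirs, center - R * dirs])

net = sum(force_on_test_charge(p, center) for p in points)
print(np.linalg.norm(net))  # ~0, even though each term has magnitude 1e4
```

The antipodal pairing mirrors the spherical symmetry exactly; with an asymmetric sampling, the sum would vanish only in the limit of infinitely many points.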

In calculating motions of a given electron, what matters is not whether its self-field exists or not, but whether it exerts a net force on the same electron or not. The self-field does exist (at least in the sense explained later below) and in that sense, yes, it does keep exerting forces at all times, also on the same electron. However, due to the spherical symmetry, the net force that the field exerts on the same electron turns out to be zero.

In short:

Even if you were to include the self-field in the calculations, if the field is spherically symmetric, then the final net force experienced by the same electron would still have no part coming from its own self-field. Hence, to economize on calculations without sacrificing exactitude in any way, we discard it from consideration.

The rule of omitting the self-field is just a matter of economizing calculations; it is not a fundamental law characterizing which fields may objectively be said to exist. If the potential field due to other charges exists, then, in the same sense, the self-field exists too. It’s just that, for the motions of the field-generating electron itself, it is as good as non-existent.

However, the question of whether a potential field physically exists or not, turns out to be more subtle than what might be thought.


2. Conditions for the objective existence of electrostatic potentials:

It once again helps to think of forces first, and only then of potentials.

Consider two electrons in an otherwise empty spatial region of an isolated system. Suppose the first electron (e_1) is at a position \vec{r}_1, and the second electron (e_2) is at a position \vec{r}_2. What Coulomb’s law now says is that the two electrons mutually exert equal and opposite forces on each other. The magnitudes of these forces are proportional to the inverse square of the distance which separates the two. For like charges the force is repulsive, and for unlike charges it is attractive. The magnitudes of the electrostatic forces thus exerted do not depend on mass; they depend only on the amounts of the respective charges.

The potential energy of the system for this particular configuration is given by (i) arbitrarily assigning a zero potential to infinite separation between the two charges, and (ii) imagining as if both the charges have been brought from infinity to their respective current positions.

It is important to realize that the potential energy for a particular configuration of two electrons does not form a field. It is merely a single number.

However, it is possible to imagine that one of the charges (say e_1) is held fixed at a point, say at \vec{r}_1, and the other charge is successively taken, in any order, at every other point \vec{r}_2 in the infinite domain. A single number is thus generated for each pair of (\vec{r}_1, \vec{r}_2). Thus, we can obtain a mapping from the set of positions for the two charges, to a set of the potential energy numbers. This second set can be regarded as forming a field—in the 3D space.
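This procedure is straightforward to make concrete. Below is a sketch (in rescaled units; the 2D grid, the charge values, and the sample indices are arbitrary choices of mine for illustration): hold e_1 fixed at the origin, place e_2 at every node of a grid, and record one potential energy number per node. Normalizing the resulting array by the second charge then gives the electrostatic potential V.

```python
import numpy as np

q1, q2 = -1.0, -1.0            # two electrons, in rescaled units
xs = np.linspace(-5.0, 5.0, 101)
X, Y = np.meshgrid(xs, xs)     # grid of candidate positions for e_2
r = np.sqrt(X**2 + Y**2)       # separation from e_1, fixed at the origin

with np.errstate(divide='ignore'):
    U = q1 * q2 / r            # one energy number per configuration; singular at e_1

V = U / q2                     # potential for a unit test-charge

# At (x, y) = (1, 0): a repulsive pair gives U = +1; normalizing, V = -1.
print(U[50, 60], V[50, 60])
```

Note that the array U is generated only by the procedure of moving e_2 around; no single number in it belongs to e_1 alone, which is exactly the point made above.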

However, notice that thus defined, the potential energy field is only a device of calculations. It necessarily refers to a second charge—the one which is imagined to be at one point in the domain at a time, with the procedure covering the entire domain. The energy field cannot be regarded as a property of the first charge alone.

Now, if the potential energy field U thus obtained is normalized by dividing it by the electric charge of the second charge, then we get the potential energy for a unit test-charge. Another name for the potential energy obtained when a unit test-charge is used for the second charge is: the electrostatic potential (denoted as V).

But still, in classical mechanics, the potential field also is only a device of calculations; it does not exist as a property of the first charge, because the potential energy itself does not exist as a property of that fixed charge alone. What does exist is the physical effect that there are those potential energy numbers for those specific configurations of the fixed charge and the test charge.

This is the reason why the potential energy field, and therefore the electrostatic potential, of a single charge in an otherwise empty space does not exist. Mathematically, it is regarded as zero (though it could have been assigned any other arbitrary, constant value).

Potentials arise only out of interaction of two charges. In classical mechanics, the charges are point-particles. Point-particles exist only at definite locations and nowhere else. Therefore, their interaction also must be seen as happening only at the locations where they do exist, and nowhere else.

If that is so, then in what sense can we at all say that the potential energy (or electrostatic potential) field physically exists?

Consider a single electron in an isolated system, again. Assume that its position remains fixed.

Suppose there were something else in the isolated system—something—some object—every part of which undergoes an electrostatic interaction with the fixed (first) electron. If this second object were to be spread all over the domain, and if every part of it were able to interact with the fixed charge, then we could say that the potential energy field exists objectively—as an attribute of this second object. Ditto, for the electric potential field.

Note three crucially important points, now.

2.1. The second object is not the usual classical object.

You cannot regard the second (spread-out) object as a mere classical charge distribution. The reason is this.

If the second object were actually a classical object, then any given part of it would have to electrostatically interact with every other part of itself too. You couldn’t possibly say that a volume element in this second object interacts only with the “external” electron. But if the second object were also self-interacting, then what would come to exist would not be the simple inverse-distance potential energy field in reference to that single “external” electron. The space would be filled with a very weird field. If the local charge in the second object were admitted the property of motion, then every locally present charge would soon redistribute itself back “to” infinity (if it is negative), or it would all collapse into the origin (if the charge on the second object were positive, because the fixed electron’s field is singular). But if we allow no charge redistributions, and the second object were still classical (i.e. capable of self-interaction), then its field would have to have singularities everywhere. Very weird. That’s why:

If you want to regard the potential field as objectively existing, you have to also posit (i.e. postulate) that the second object itself is not classical in nature.

Classical electrostatics, if it has to regard a potential field as objectively (i.e. physically) existing, must therefore come to postulate a non-classical background object!

2.2. Assuming you do posit such a (non-classical) second object (one which becomes “just” a background object), then what happens when you introduce a second electron into the system?

You would run into another seeming contradiction. You would find that this second electron has no job left to do, as far as interacting with the first (fixed) electron is concerned.

If the potential field exists objectively, then the second electron would have to just passively register the pre-existing potential in its vicinity (because it is the second object which is doing all the electrostatic interactions—all the mutual forcings—with the first electron). So, the second electron would do nothing of consequence with respect to the first electron. It would just become a receptacle for registering the force being exchanged by the background object in its local neighborhood.

But the seeming contradiction here is that as far as the first electron is concerned, it does feel the potential set up by the second electron! It may be seen to do so once again via the mediation of the background object.

Therefore, both electrons have to be simultaneously regarded as being active and passive with respect to each other. They are active as agents that establish their own potential fields, together with an interaction with the background object. But they also become passive in the sense that they are mere point-masses that only feel the potential field in the background object and experience forces (accelerations) accordingly.

The paradox is thus resolved by having each electron set up a field as a result of an interaction with the background object—but have no interaction with the other electron at all.

2.3. Note carefully what agency is assigned to what object.

The potential field has a singularity at the position of that charge which produces it. But the potential field itself is created either by the second charge (by imagining it to be present at various places), or by a non-classical background object (which, in a way, is nothing but an objectification of the potential field-calculation procedure).

Thus, there arises a duality of a kind—a double-agent nature, so to speak. The potential energy is calculated for the second charge (the passive one), in the sense that the potential energy is relevant for calculating the motion of the second charge. That’s because the first charge’s self-field cancels out for all motions of the first charge itself. However,

 The potential energy is calculated for the second charge. But the field so calculated has been set up by the first (fixed) charge. Charges do not interact with each other; they interact only with the background object.

2.4. If the charges do not interact with each other, and if they interact only with the background object, then it is worth considering this question:

Can’t the charges be seen as mere conditions—points of singularities—in the background object?

Indeed, this seems to be the most reasonable approach to take. In other words,

All effects due to point charges can be regarded as field conditions within the background object. Thus, paradoxically enough, a non-classical distributed field comes to represent the classical, massive and charged point-particles themselves. (The mass becomes just a parameter of the interactions of singularities within a 3D field.) The charges (like electrons) do not exist as classical massive particles, not even in the classical electrostatics.


3. A partly analogous situation: The stress-strain fields:

If the above situation seems too paradoxical, it might be helpful to think of the stress-strain fields in solids.

Consider a horizontally lying thin plate of steel with two rigid rods welded to it at two different points. Suppose horizontal forces of mutually opposite directions are applied through the rods (either compressive or tensile). As you know, as a consequence, stress-strain fields get set up in the plate.

From an external viewpoint, the two rods are regarded as interacting with each other (exchanging forces with each other) via the medium of the plate. However, in reality, each of them is interacting only with the object that is the plate. The direct interaction, thus, is only between a rod and the plate. A rod is forced; it interacts with the plate; the plate sets up a stress-strain field everywhere; the local stress field near the second rod interacts with it; and the second rod registers a force—which balances out the force applied at its end. Conversely, the force applied at the second rod can also be seen as getting transmitted to the first rod via the stress-strain field in the plate material.

There is no contradiction in this description, because we attribute the stress-strain field to the plate itself, and always treat this stress-strain field as if it came into existence due to both the rods acting simultaneously.

In particular, we do not try to isolate a single-rod attribute out of the stress-strain field, the way we try to ascribe a potential to the first charge alone.

Come to think of it, if we have only one rod and if we apply force to it, no stress-strain field would result (i.e. neglecting inertia effects of the steel plate). Instead, the plate would simply move in the rigid body mode. Now, in solid mechanics, we never try to visualize a stress-strain field associated with a single rod alone.

It is a fallacy of our thinking that when it comes to electrostatics, we try to ascribe the potential to the first charge, and altogether neglect the abstract procedure of placing the test charge at various locations, or the postulate of positing a non-classical background object which carries that potential.

In the interest of completeness, it must be noted that the stress-strain fields are tensor fields (they are based on the gradients of vector fields), whereas the electrostatic force-field is a vector field (it is based on the gradient of the scalar potential field). A more relevant analogy for the electrostatic field, therefore, might be the forces exchanged by two point-vortices existing in an ideal fluid.
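For the curious, the point-vortex case can be sketched in a few lines (2D ideal flow; the circulations and positions below are arbitrary choices of mine). Each vortex simply takes on the velocity which the flow field of the other vortex induces at its own location, the fluid acting as the mediating object.

```python
import numpy as np

def induced_velocity(at, vortex_pos, gamma):
    # Azimuthal velocity field of a 2D point vortex of circulation gamma:
    # speed gamma / (2*pi*r), direction perpendicular to the radius vector.
    d = at - vortex_pos
    r2 = d @ d
    return gamma / (2.0 * np.pi * r2) * np.array([-d[1], d[0]])

p1, p2 = np.array([-1.0, 0.0]), np.array([1.0, 0.0])
g1, g2 = 1.0, 1.0

v1 = induced_velocity(p1, p2, g2)   # velocity of vortex 1, induced by vortex 2
v2 = induced_velocity(p2, p1, g1)   # and vice versa

print(v1, v2)  # equal circulations: the pair co-rotates about the midpoint
```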


4. But why bother with it all?

The reason I went into all this discussion is that all these issues become important in the context of quantum mechanics. Even in quantum mechanics, when you have two charges interacting with each other, you run into these same issues, because the Schrodinger equation does have a potential energy term in it. Consider the following situation.

If an electrostatic potential is regarded as being set up by a single charge (as is done for the proton in the nucleus of the hydrogen atom), but if it is also to be regarded as an actually existing and spread-out entity (as a 3D field, the way Schrodinger’s equation assumes it to be), then a question arises: What is the role of the second charge (e.g., that of the electron in a hydrogen atom)? What happens when the second charge (the electron) is represented quantum mechanically? In particular:

What happens to the potential field if it represents the potential energy of the second charge, but the second charge itself is now being represented only via the complex-valued wavefunction?

And worse: What happens when there are two electrons, both interacting with each other via electrostatic repulsions, and both required to be represented quantum mechanically—as in the case of the electrons in a helium atom?

Can a charge be regarded as having a potential field as well as a wavefunction field? If so, what happens to the point-specific repulsions as are mandated by the Coulomb law? How precisely is the V(\vec{r}_1, \vec{r}_2) term to be interpreted?
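To make the question concrete, consider a 1D toy model of a helium-like system (my illustration only; the softened cores, i.e. the a^2 terms, and the grid are assumptions made to avoid the singularities). The point to note is that V(\vec{r}_1, \vec{r}_2) assigns one number per configuration (x_1, x_2); it is a field on the configuration space, not on the physical space.

```python
import numpy as np

a = 0.5                                      # softening parameter (an assumption)
xs = np.linspace(-8.0, 8.0, 161)
X1, X2 = np.meshgrid(xs, xs, indexing='ij')  # the configuration space (x1, x2)

V = (-2.0 / np.sqrt(X1**2 + a**2)            # electron 1 <-> nucleus of charge +2
     - 2.0 / np.sqrt(X2**2 + a**2)           # electron 2 <-> nucleus
     + 1.0 / np.sqrt((X1 - X2)**2 + a**2))   # electron-electron repulsion

# One number per (x1, x2) pair: a field on the 2D configuration space.
print(V.shape, V[80, 80])  # (161, 161), and -6.0 at x1 = x2 = 0
```

Notice that the electron-electron term couples the two coordinates: V cannot be split into a sum of two single-particle fields, which is precisely where the interpretive trouble begins.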

I was thinking about these things when these issues occurred to me: the issue of the self-field, and the question of the physical vs. merely mathematical existence of the potential fields of two or more quantum-mechanically interacting charges.

Guess I am inching towards my full answers. In fact, I guess I have already reached my answers, but I need to have them verified by some physicists.


5. The help I want:

As a part of my answer-finding exercises (to be finished by this month-end), I might be contacting a second set of physicists soon enough. The issue I want to learn from them is the following:

How exactly do they do computational modeling of the helium atom using the finite difference method (FDM), within the context of the standard (mainstream) quantum mechanics?

That is the question. Once I understand this part, I would be done with the development of my new approach to understanding QM.

I do have some ideas regarding the highlighted question. It’s just that I want to have these ideas confirmed by some physicists before (or alongside) implementing the FDM code. So, I might be approaching someone—possibly you!

Please note my question once again. I don’t want to do perturbation theory. I would also like to avoid the variational method.

Yes, I am very comfortable with the finite element method, which is based on variational calculus. So, given a good (detailed enough) account of the variational method for the He atom, it should be possible to translate it into FEM terms.

However, ideally, what I would like to do is to implement it as an FDM code.
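For what it is worth, here is my own warm-up sketch of the kind of FDM treatment I have in mind, for the single-particle 1D case only, not for helium (the softened potential and the grid parameters are assumptions, standard in 1D toy models). The 3-point central-difference stencil turns -\frac{1}{2}\psi'' + V\psi = E\psi into a symmetric matrix eigenproblem:

```python
import numpy as np

n, L = 400, 40.0
x = np.linspace(-L / 2, L / 2, n)
h = x[1] - x[0]
V = -1.0 / np.sqrt(x**2 + 1.0)   # softened 1D "hydrogen" well (an assumption)

# 3-point central difference: psi'' ~ (psi[i-1] - 2 psi[i] + psi[i+1]) / h**2
D2 = (np.diag(np.ones(n - 1), -1)
      - 2.0 * np.eye(n)
      + np.diag(np.ones(n - 1), 1)) / h**2

H = -0.5 * D2 + np.diag(V)       # Hamiltonian, with psi = 0 at the domain ends
E, psi = np.linalg.eigh(H)       # eigenvalues come out in ascending order

print(E[:3])  # the lowest few bound-state energies of the toy model
```

The ground-state energy of this softened well comes out near -0.67 in these units. Applying the same stencil on a two-variable (x_1, x_2) grid with a two-electron potential would be the analogous direct treatment of a 1D helium-like model, though the matrix then becomes much larger and a sparse eigensolver becomes necessary.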

So there.

Please suggest good references and / or people working on this topic, if you know any. Thanks in advance.


A song I like:

[… Here I thought that there was no song that Salil Chowdhury had composed and I had not listened to. (Well, at least when it comes to his Hindi songs). That’s what I had come to believe, and here trots along this one—and that too, as a part of a collection by someone! … The time-delay between my first listening to this song, and my liking it, was zero. (Or, it was a negative time-delay, if you refer to the instant that the first listening got over). … Also, one of those rare occasions when one is able to say that any linear ordering of the credits could only be random.]

(Hindi) “mada bhari yeh hawaayen”
Music: Salil Chowdhury
Lyrics: Gulzaar
Singer: Lata Mangeshkar