Yesss! I did it!

Last evening (on 2021.01.13 at around 17:30 IST), I completed the first set of computations for finding the bonding energy of a helium atom, using my fresh new approach to QM.

These calculations are still pretty crude, both in technique and in implementation. Reading through the details given below, any competent computational engineer/scientist would immediately see just how crude they are. However, I also hope that he would see that these initial results may be taken as definitely validating my new approach.

It would be impossible to give all the details right away. So, what I give below are some important details and highlights of the model, the method, and the results.

For that matter, even my Python scripts are currently in a pretty disorganized state. They are held together by duct-tape, so to say. I plan to rearrange and clean up the code, write a document, and upload them both. I think it should be possible to do so within a month’s time, i.e., by mid-February. If not, say due to the RSI, then probably by February-end.

Alright, on to the details. (I am giving some indication about some discarded results/false starts too.)


1. Completion of the theory:

As far as the development of my new theory goes, there were many tricky issues that had surfaced ever since I began trying to simulate my new approach, starting in May–June 2020. The crucially important issues were the following:

  • A quantitatively precise statement on how the mainstream QM’s \Psi, defined as it is over the 3N-dimensional configuration space, relates to the 3-dimensional wavefunctions I had proposed earlier in the Outline document.
  • A quantitatively precise statement on how the wavefunction \Psi makes the quantum particles (i.e. their singularity-anchoring positions) move through the physical space. Think of this as the “force law”, and then note that if a wrong statement is made here, then the entire system dynamics/evolution has to go wrong. Repercussions will exist even in the simplest system having two interacting particles, like the helium atom. The bonding energy calculations of the helium atom are bound to go wrong if the “force law” is wrong. (I don’t actually calculate the forces, but that’s a different matter.)
  • Also to be dealt with was this issue: Ensuring that the anti-symmetry property for the indistinguishable fermions (electrons) holds.

I had achieved a good clarity on all these (and similar other) matters by the evening of 5th January 2021. I also tried to do a few simulations but ran into problems. Both these developments were mentioned via an update at iMechanica on the evening of 6th January 2021, here [^].


2. Simulations in 1D boxes:

By “box” I mean a domain having infinite potential energy walls at the boundaries, and imposition of the Dirichlet condition of \Psi(x,t) = 0 at the boundaries at all times.

I did a rapid study of the problems (mentioned in the iMechanica update). The simulations for this study involved 1D boxes from 5 a.u. to 100 a.u. lengths. (1 a.u. of length = 1 Bohr radius.) The mesh sizes varied from 5 nodes to 3000 nodes. Only regular, structured meshes of uniform cell-sides (i.e., a constant inter-nodal distance, \Delta x) were used, not non-uniform meshes (such as log-based).

I found that the discretization of the potential energy (PE) term indeed was at the root of the problems. Theoretically, the PE field is singular. I have been using FDM. Since an infinite potential cannot be handled using FDM, you have to implement some policy for assigning a finite value to the maximum depth of the PE well.

Initially, I chose the policy of setting the max. depth to that value which would exist at a distance of half the width of the cell. That is to say, V_S \approx V(\Delta x/2), where V_S denotes the PE value at the singularity (theoretically infinite).

The PE was calculated using the Coulomb formula, which is given as V(r) = 1/r when one of the charges is fixed, and as V_1(r_s) = V_2(r_s) = 1/(2r_s) for two interacting and moving charges. Here, r_s denotes the separation between the interacting charges. The rule of half cell-side was used for making the singularity finite. The field so obtained will be referred to as the “hard” PE field.
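In code, this capping policy amounts to a one-liner. A minimal sketch (the grid, the nucleus position, and the sign convention here are only illustrative choices, not necessarily the exact ones from my scripts):

```python
import numpy as np

# Uniform 1D mesh (illustrative values)
L = 20.0                          # box length, a.u.
n = 101                           # number of nodes
x = np.linspace(0.0, L, n)        # node positions
dx = x[1] - x[0]                  # constant cell side

x_nuc = L / 2.0                   # nucleus placed at the box centre
r = np.abs(x - x_nuc)             # distance of each node from the nucleus

# "Hard" Coulomb PE with the half cell-side rule: wherever a node falls on the
# singularity, use the value at dx/2 instead of infinity. (The 1/r above is the
# magnitude; the attractive electron-nucleus term carries a negative sign.)
V = -1.0 / np.maximum(r, dx / 2.0)
```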

Using the “hard” field was, if I recall it right, quite OK for the hydrogen atom. It gave the bonding energies (ground-state) ranging from -0.47 a.u. to -0.49 a.u. or lower, depending on the domain size and mesh refinement (i.e. number of nodes). Note, 1 a.u. of energy is the same as 1 hartree. For comparison, the analytical solution gives -0.5, exactly. All energy calculations given here refer to only the ground-state energies. However, I also computed and checked up to 10 eigenvalues.

Initially, I tried both dense and sparse eigenvalue solvers, but eventually settled on only the sparse solvers. The results were indistinguishable (at least numerically). I used SciPy’s wrappers for the various underlying libraries.
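For whatever it is worth, here is a small, self-contained sketch of what such a 1D run amounts to: the standard three-point FDM Hamiltonian for the H atom in a box (the Dirichlet condition imposed by simply dropping the two boundary nodes), solved once with a dense routine and once with a sparse one. The parameters are again only illustrative.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh

L, n = 20.0, 1001                      # box length (a.u.) and node count (illustrative)
x = np.linspace(0.0, L, n)
dx = x[1] - x[0]
xi = x[1:-1]                           # interior nodes only: psi = 0 at the walls

r = np.abs(xi - L / 2.0)               # nucleus fixed at the box centre
V = -1.0 / np.maximum(r, dx / 2.0)     # "hard" Coulomb PE, half cell-side cap as above

# H = -(1/2) d^2/dx^2 + V, discretized with the usual three-point stencil
main = 1.0 / dx**2 + V
off = -0.5 / dx**2 * np.ones(len(xi) - 1)
H = diags([off, main, off], offsets=[-1, 0, 1], format='csc')

E_sparse = eigsh(H, k=10, which='SA', return_eigenvectors=False)
E_dense = eigh(H.toarray(), eigvals_only=True)[:10]
print(sorted(E_sparse)[0], E_dense[0])  # ground-state energies; the two should agree
```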

I am not quite sure whether using the hard potential was always smooth or not, even for the hydrogen atom. I think not.

However, the hard Coulomb potential always led to problems for the helium atom in a 1D box (being modelled using my new approach/theory). The lowest eigen-value was wrong by more than a factor of 10! I verified that the corresponding eigenvector indeed was an eigenvector. So, the solver was giving a technically correct answer, but it was an answer to the as-discretized system, not to the original physical problem.

I therefore tried using the so-called “soft” Coulomb potential, which was new to me, but which looks like a well-known function. I came to know of its existence via the OctopusWiki [^], when I was searching for some prior code on the helium atom. The “soft” Coulomb potential is defined as:

V(x) = \dfrac{1}{\sqrt{a^2 + x^2}}, where a is an adjustable parameter, often set to 1.
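In code, the hard and the soft potentials differ in just one line. A quick comparison near the singularity (with a = 1 and an illustrative \Delta x) shows how much shallower the “soft” well is at the nucleus:

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 13)     # nodes around a charge at x = 0 (illustrative)
dx = x[1] - x[0]
a = 1.0                            # the usual choice of the softening parameter

V_hard = -1.0 / np.maximum(np.abs(x), dx / 2.0)   # half cell-side cap at the singular node
V_soft = -1.0 / np.sqrt(a**2 + x**2)              # "soft" Coulomb: finite everywhere

print(V_hard.min(), V_soft.min())  # the hard well is several times deeper at the nucleus
```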

I found this potential unsatisfactory for my work, mainly because it gives rise to a more spread-out wavefunction, which in turn implies that the screening effect of one electron for the other electron is not captured well. I don’t recall exactly, but I think that there was this issue of too low ground-state eigenvalues also with this potential (for the helium modeling). It is possible that I was not using the right SciPy function-calls for eigenvalue computations.

Please take the results in this section with a pinch of salt. I am writing about them only after 8–10 days, and I have run so many variations that I’ve lost track of what went wrong in which scenario.

All in all, I thought that 1D box wasn’t working out satisfactorily. But a more important consideration was the following:

My new approach has been formulated in the 3D space. If the bonding energy is to be numerically comparable to the experimental value (and not computed as just a curiosity or a computational artifact), then the potential-screening effect must be captured right. Now, here, my new theory says that the screening effect will be captured quantitatively correctly only in a 3D domain. So, I soon enough switched to the 3D boxes.


3. Simulations of the hydrogen atom in 3D boxes:

For both hydrogen and helium, I used only cubical boxes, not parallelepipeds (“brick”-shaped boxes). The side of the cube was usually kept at 20 a.u. (Bohr radii), which is a length slightly longer than one nanometer (1.05835 nm). However, some of my rapid experimentation also ranged from 5 a.u. to 100 a.u. domain lengths.

Now, to meshing:

The first thing to realize is that with a 3D domain, the total number of nodes M scales cubically with the number of nodes n appearing on a side of the cube. That is to say: M = n^3. Bad thing.

The second thing to note is worse: The discretized Hamiltonian operator matrix now has the dimensions of M \times M. Sparse matrices are now a must. Even then, meshes remain relatively coarse, else computation time increases a lot!

The third thing to note is even worse: My new approach requires computing “instantaneous” eigenvalues at all the nodes. So, the number of times you must give a call to, say eigh() function, also goes as M = n^3. … Yes, I have the distinction of having invented what ought to be, provably, the most inefficient method to compute solutions to many-particle quantum systems. (If you are a QC enthusiast, now you know that I am a completely useless fellow.) But more on this, just a bit later.

I didn’t have to write the 3D code completely afresh though. I re-used much of the backend code from my earlier attempts from May, June and July 2020. At that time, I had implemented vectorized code for building the Laplacian matrix. However, in retrospect, this was an overkill. The system spends more than 99 % of execution time only in the eigenvalue function calls. So, preparation of the discretized Hamiltonian operator is relatively insignificant. Python loops could do! But since the vectorized code was smaller and a bit more easily readable, I used it.

Alright.

The configuration space for the hydrogen atom is small, there being only one particle. It’s “only” M in size. More important, the nucleus being fixed, and there being just one particle, I need to solve the eigenvalue equation only once. So, I first put the hydrogen atom inside the 3D box, and verified that the hard Coulomb potential gives cool results over a sufficiently broad range of domain sizes and mesh refinements.
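To make the 3D setup concrete, here is a minimal sketch of the hydrogen-in-a-box computation. The Laplacian is assembled via Kronecker sums, which is one standard way of vectorizing it; my actual scripts may differ in such details, but the idea is the same. The mesh is kept deliberately coarser than my usual 51 nodes per side, just so the single eigsh() call finishes quickly.

```python
import numpy as np
from scipy.sparse import identity, diags, kron
from scipy.sparse.linalg import eigsh

L, n = 20.0, 31                       # cube side (a.u.) and nodes per side (illustrative)
x = np.linspace(0.0, L, n)[1:-1]      # interior nodes only (Dirichlet walls)
dx = x[1] - x[0]
m = len(x)                            # interior nodes per side; M = m**3 in total

# 1D second-difference operator; the 3D Laplacian as a Kronecker sum
D2 = diags([1.0, -2.0, 1.0], offsets=[-1, 0, 1], shape=(m, m)) / dx**2
I = identity(m)
Lap = (kron(kron(D2, I), I) + kron(kron(I, D2), I) + kron(kron(I, I), D2)).tocsc()

# Hard Coulomb PE of a proton fixed at the cube centre, half cell-side cap
X, Y, Z = np.meshgrid(x, x, x, indexing='ij')
r = np.sqrt((X - L/2)**2 + (Y - L/2)**2 + (Z - L/2)**2)
V = -1.0 / np.maximum(r, dx / 2.0)

H = -0.5 * Lap + diags(V.ravel())
E0 = eigsh(H, k=1, which='SA', return_eigenvectors=False)[0]
print(E0)   # crude-mesh ground-state estimate; with 51 nodes per side I get about -0.48 hartree
```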

However, in comparison with the results for the 1D box, the 3D box algebraically over-estimates the bonding energy. Note the word “algebraically.” What it means is that if the bonding energy for a H atom in a 1D box is -0.49 a.u., then with the same physical domain size (say 20 Bohr radii) and the same number of nodes on the side of the cube (say 51 nodes per side), the 3D model gives something like -0.48 a.u. So, when you use a 3D box, the absolute value of energy decreases, but the algebraic value (including the negative sign) increases.

As any good computational engineer/scientist could tell, such a behaviour is only to be expected.

The reason is this: The discretized PE field is always jagged, and so it only approximately represents a curvy function, especially near the singularity. This is how it behaves in 1D, where the PE field is a curvy line. But in a 3D case, the PE contour surfaces bend not just in one direction but in all the three directions, and the discretized version of the field can’t represent all of them taken at the same time. That’s the hand-waving sort of an “explanation.”

I highlighted this part because I wanted you to note that in 3D boxes, you would expect the helium atom energies to algebraically overshoot too. A bit more on this, later, below.


4. Initial simulations of the helium atom in 3D boxes:

For the helium atom too, the side of the cube was mostly kept at 20 a.u. Reason?

In the hydrogen atom, the space part of the ground state \psi has a finite peak at the center, and its spread is significant over a distance of about 5–7 a.u. (in the numerical solutions). Then, for the helium atom, there is going to be a dent in the PE field due to screening. In my approach, this dent physically moves over the entire domain as the screening electron moves. To accommodate both their spreads plus some extra room, I thought, 20 could be a good choice. (More on the screening effect, later, below.)

As to the mesh: As mentioned earlier, the number of eigenvalue computations required is M, and the time taken by each such call goes up significantly with M. So, initially, I kept the number of nodes per side (i.e. n) at just 23. With the two extreme planes sacrificed to the deity of the boundary conditions, the actual computations took place on a 21 \times 21 \times 21 mesh. That still means a system having 9261 nodes!

At the same time, realize how crude and coarse this mesh is: two neighbouring nodes represent a physical distance of almost one Bohr radius! … Who said theoretical clarity must also come with faster computations? Not when it’s QM. And certainly not when it’s my theory! I love to put the silicon chip to some real hard work!

Alright.

As I said, for the reasons that will become fully clear only when you go through the theory, my approach requires M number of separate eigenvalue computation calls. (In “theory,” it requires M^2 number of them, but some very simple and obvious symmetry considerations reduce the computational load to M.) I then compute the normalized 1-particle wavefunctions from the eigenvector. All this computation forms what I call the first phase. I then post-process the 1-particle wavefunctions to get to the final bonding energy. I call this computation the second phase.

OK, so in my first computations, the first phase involved SciPy’s eigsh() function being called 9261 times. I think it took something like 5 minutes. The second phase is very much faster; it took less than a minute.
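Purely to convey the shape of this two-phase computation (and why the eigsh() call count goes as M), here is a schematic sketch. One loud caveat: the node-dependent potential used below (the nuclear well plus a capped repulsive well anchored at the node) is only a stand-in so that the sketch actually runs; the per-node Hamiltonian that my theory actually prescribes will be given in the forthcoming document.

```python
import numpy as np
from scipy.sparse import identity, diags, kron
from scipy.sparse.linalg import eigsh

# A deliberately tiny mesh so that the M calls finish quickly here;
# the run described above used 23 nodes per side (21^3 = 9261 interior nodes).
L, n = 20.0, 9
x = np.linspace(0.0, L, n)[1:-1]
dx = x[1] - x[0]
m = len(x); M = m**3

D2 = diags([1.0, -2.0, 1.0], offsets=[-1, 0, 1], shape=(m, m)) / dx**2
I = identity(m)
Lap = (kron(kron(D2, I), I) + kron(kron(I, D2), I) + kron(kron(I, I), D2)).tocsc()

X, Y, Z = np.meshgrid(x, x, x, indexing='ij')
r_nuc = np.sqrt((X - L/2)**2 + (Y - L/2)**2 + (Z - L/2)**2)
V_nuc = -2.0 / np.maximum(r_nuc, dx / 2.0)      # He nucleus (charge +2), half cell-side cap
pts = np.column_stack([X.ravel(), Y.ravel(), Z.ravel()])

# Phase one: one sparse eigenvalue solve per interior node, hence M eigsh() calls.
psi_all = np.empty((M, M))
for i in range(M):
    r_el = np.sqrt(((pts - pts[i])**2).sum(axis=1))
    V_i = V_nuc.ravel() + 1.0 / np.maximum(r_el, dx / 2.0)   # STAND-IN, not the theory's formula
    H_i = -0.5 * Lap + diags(V_i)
    E_i, v_i = eigsh(H_i, k=1, which='SA')
    psi_all[i] = v_i[:, 0] / np.linalg.norm(v_i[:, 0])

# Phase two: post-process the per-node 1-particle wavefunctions into the bonding
# energy (the actual reduction, again, belongs to the theory document).
```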

The bonding energy I thus got should have been around -2.1 a.u. However, I made an error while coding the second phase, and got something different (which I no longer remember, but I think I have not deleted the wrong code, so it should be possible to reproduce this wrong result). The error wasn’t numerically very significant, but it was an error all the same. This status was by the evening of 11th January 2021.

The same error continued also on 12th January 2021, but I think that if the errors in the second phase were to be corrected, the value obtained could have been close to -2.14 a.u. or so. Mind you, these are the results with a 20 a.u. box and 23 nodes per side.

In comparison, the experimental value is -2.9033 a.u.

As to computations, Hylleraas, back in the late 1920s, used a hand-operated mechanical calculator, and still got to -2.90363 a.u.! Some nine decades later, his method and work still remain near the top of the accuracy stack.

Why did my method do so badly? Even more pertinent: How could Hylleraas use just a mechanical calculator, not a computer, and still get to such a wonderfully accurate result?

It all boils down to the methods, tricks, and even dirty tricks. Good computational engineers/scientists know them, their uses and limitations, and do not hesitate to build products with them.

But the real pertinent reason is this: The technique Hylleraas used was variational.


5. A bit about the variational techniques:

All variational techniques use a trial function with some undetermined parameters. Let me explain in a jiffy what it means.

A trial function embodies a guess—a pure guess—at what the unknown solution might look like. It could be any arbitrary function.

For example, you could even use a simple polynomial like y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 by way of a trial function.

Now, observe that if you change the values of the a_0, a_1 etc. coefficients, then the shape of the function changes. Just assign some random values and plot the results using MatPlotLib, and you will know what I mean.
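Something like this, for instance (purely illustrative coefficients):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.0, 1.0, 200)
rng = np.random.default_rng(0)

# A few random choices of (a_0, a_1, a_2, a_3); each set gives a different shape
for _ in range(4):
    a0, a1, a2, a3 = rng.uniform(-1.0, 1.0, size=4)
    y = a0 + a1*x + a2*x**2 + a3*x**3
    plt.plot(x, y, label=f"{a0:.2f}, {a1:.2f}, {a2:.2f}, {a3:.2f}")

plt.xlabel("x"); plt.ylabel("y"); plt.legend(title="a_0, a_1, a_2, a_3")
plt.show()
```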

… Yes, you do something similar also in Data Science, but there, the problem formulation is relatively much simpler: You just tweak all the a_i coefficients until the function fits the data. “Curve-fitting,” it’s called.

In contrast, in variational calculus, you don’t do this one-step curve-fitting. You instead take the y function and substitute it into some theoretical equations that have something to do with the total energy of the system. Then you find an expression which tells you how the energy (now expressed as a function of y, which itself is a function of the a_i‘s) varies as these unknown coefficients a_i are varied. So, these a_i‘s basically act as parameters of the model. Note carefully: the y function is the primary unknown, but in variational calculus, you do the curve-fitting with respect to some other equation.

So, the difference between simple curve-fitting and variational methods is the following. In simple curve-fitting, you fit the curve to concrete data values. In variational calculus, you fit an expression derived by substituting the curve into some equations (not data), and then derive some further equations that show how some measure like energy changes with variations in the parameters. You then adjust the parameters so as to minimize that abstract measure.
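A standard textbook illustration of this procedure (not Hylleraas’s helium calculation; just the simplest case I can point to): take a single Gaussian, \psi(r) \propto e^{-\alpha r^2}, as the trial function for the hydrogen atom. The closed-form variational energy for this trial function is E(\alpha) = \dfrac{3\alpha}{2} - 2\sqrt{2\alpha/\pi} (in hartree), and the “fitting” consists of nothing more than adjusting the single parameter \alpha so as to minimize E:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def E(alpha):
    """Variational energy of the H atom for the Gaussian trial function
    psi ~ exp(-alpha r^2), in hartree (standard closed-form result)."""
    return 1.5 * alpha - 2.0 * np.sqrt(2.0 * alpha / np.pi)

res = minimize_scalar(E, bounds=(1e-3, 5.0), method='bounded')
print(res.x, res.fun)   # alpha ~ 8/(9 pi) ~ 0.283, E ~ -0.424 hartree (exact: -0.5)
```

Notice that the minimized energy is in the right ballpark even though a Gaussian has the wrong shape both at the nucleus and in the tail; that is exactly the sense in which a trial function can get the energy roughly right while missing the physics.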

Coming back to the helium atom, there is a nucleus with two protons inside it, and two electrons that go around the nucleus. The nucleus pulls both the electrons, but the two electrons themselves repel each other. (Unlike and like charges.) When one electron strays near the nucleus, it temporarily decreases the effective pull exerted by the nucleus on the other electron. This is called the screening effect. In short, when one electron goes closer to the nucleus, the other electron feels as if the nucleus had discharged a little bit. The effect gets more and more pronounced as the first electron goes closer to the nucleus. The nucleus acts as if it had only one proton when the first electron is at the nucleus. The QM particles aren’t abstractions from the rigid bodies of Newtonian mechanics; they are just singularity conditions in the aetherial fields. So, it’s easily possible that an electron sits at the same place where the two protons of the nucleus are.

One trouble with using the variational techniques for problems like modeling the helium atom is this: the screening effect gets modeled using a numerically reasonable but physically arbitrary trial function. Using this technique can give a very accurate result for the bonding energy, provided that the person building the variational model is smart, as Hylleraas sure was. But the trial function is just guess-work. It can’t be said to capture any physics, as such. Let me give an example.

Suppose that some problem from physics is such that a 5-degree polynomial happens to be the physically accurate form of solution for it. However, you don’t know the analytical solution, not even its form.

Now, the variational technique doesn’t prevent you from using a cubic polynomial as the trial function. That’s because, even if you use a cubic polynomial, you can still get to the same total system energy.

The actual calculations are far more complicated, but just as a fake example to illustrate my main point, suppose for a moment that the area under the solution curve is the target criterion (and not a more abstract measure like energy). Now, by adjusting the coefficients of a cubic polynomial, you can always alter its shape such that it happens to give the right area under the curve. Now, the funny part is this. If the trial function we choose is only cubic, then it is certain to miss, as a matter of general principle, all the information related to the 4th- and 5th-order derivatives. So, the solution will have a lot of high-order physics deleted from itself. It will be a bland solution; something like a ghost of the real thing. But it can still give you the correct area under the curve. If so, it still fulfills the variational criterion.

Coming back to the use of variational techniques in QM, like Hylleraas’ method:

It can give a very good answer (even an arbitrarily accurate answer) for the energy. But the trial function can still easily miss a lot of physics. In particular, it is known that the wavefunctions (actually, “orbitals”) won’t turn out to be accurate; they won’t depict physical entities.

Another matter: These techniques work not in the physical space but in the configuration space. So, the opportunity of taking what properly belongs to Raam and giving it to Shaam is not just ever-present but even more likely.

An even simpler example is this. Suppose you are given 100 bricks and asked to build a wall over a given area on the ground. You can use them to arrange one big tower in the wall, two towers, whatever… There would still be 100 bricks in all, sitting on the same area on the ground. The shapes may differ; the variational technique doesn’t care for the shape. Yet, realize, having accurate atomic orbitals means getting the shape of the wall right too, not just dumping 100 bricks on the same area.


6. Why waste time on yet another method, when a more accurate method has been around for some nine decades?

“OK, whatever” you might say at this point. “But if the variational technique was OK by Hylleraas, and if it’s been OK for the entire community of physicists for all these years, then why do you still want to waste your time and invent just another method that’s not as accurate anyway?”

My answer:

Firstly, my method isn’t an invention; it is a discovery. My calculation method directly follows the fundamental principles of physics through and through. Not a single postulate of the mainstream QM is violated or altered; I merely have added some further postulates, that’s all. These theoretical extensions fit perfectly with the mainstream QM, and using them directly solves the measurement problem.

Secondly, what I talked about was just an initial result, a very crude calculation. In fact, I have already improved the accuracy further; see below.

Thirdly, I must point out a possibility which your question didn’t at all cover. My point is that this actually isn’t an either-or situation. It’s not either the variational technique (like Hylleraas’s) or mine. Indeed, it would very definitely be possible to incorporate the more accurate variational calculations as just parts of my own calculations too. It’s easy to show that. That would mean combining “the best of both worlds”. At a broader level, the method would still follow my approach and thus be physically meaningful. But within a carefully delimited scope, trial functions could still be used in the calculation procedures. …For that matter, even FDM doesn’t represent any real physics either. Another thing: Even FDM itself can be seen as just one—arguably the simplest—kind of a variational technique. So, in that sense, even I am already using a variational technique, but only the simplest and crudest one. The theory could easily make use of both meshless and mesh-requiring variational techniques.

I hope that answers the question.


7. A little more advanced simulation for the helium atom in a 3D box:

With my computational experience, I knew that I was going to get a good result, even if the actual result was only estimated to be about -2.1 a.u.—vs. -2.9033 a.u. for the experimentally determined value.

But rather than increasing accuracy for its own sake, on the 12th and 13th January, I came to focus more on improving the “basic infrastructure” of the technique.

Here, I now recalled the essential idea behind the Quantum Monte Carlo method, and proceeded to implement something similar in the context of my own approach. In particular, rather than going over the entire (discretized) configuration space, I implemented a code to sample only some points in it. This way, I could use bigger (i.e. more refined) meshes, and get better estimates.

I also carefully went through the logic used in the second phase, and corrected the errors.

Then, using a box of 35 a.u. and 71 nodes per side of the cube (i.e., 328,509 nodes in the interior region of the domain), and using just 1000 sampled nodes out of them, I now found that the bonding energy was -2.67 a.u. Quite satisfactory (to me!)
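The node-selection itself is the trivial part; schematically, it can be as simple as the following (the counts match the run just mentioned; any further selection/weighting details belong in the theory document):

```python
import numpy as np

n_interior = 69**3         # 71 nodes per side, minus the two boundary planes: 328,509 nodes
n_samples = 1000

rng = np.random.default_rng(42)
sampled_nodes = rng.choice(n_interior, size=n_samples, replace=False)

# Phase one is then run only over sampled_nodes instead of over all interior nodes,
# and phase two works with this sample accordingly.
```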


8. Finally, a word about the dirty tricks department:

I happened to observe that with some choices of physical box size and computational mesh size, the bonding energy could go as low as -3.2 a.u. or even lower.

What explains such a behaviour? There is this range of results right from -2.1 a.u. to -2.67 a.u. to -3.2 a.u. …Note once again, the actual figure is: -2.90 a.u.

So, the computational results aren’t only on the higher side or only on the lower side. Instead, they form a band of values on both sides of the actual value. This is both good news and bad news.

The good plus bad news is that it’s all a matter of making the right numerical choices. Here, I will mention only 2 or 3 considerations.

As one consideration, to get more consistent results across various domain sizes and mesh sizes, what matters is the physical distance represented by each cell in the mesh. If you keep this in mind, then you can get results that fall in a narrow band. That’s a good sign.

As another consideration, the box size matters. In reality, there is no box and the wavefunction extends to infinity. But a technique like FDM requires having to use a box. (There are other numerical techniques that can work with infinite domains too.) Now, if you use a larger box, then the Coulomb well looks just like the letter `T’. No curvature is captured with any significance. With a lot of physical region where the PE portion looks relatively flat, the role played by the nuclear attraction becomes less significant, at least in numerical work. In short, the atom in a box approaches a free-particle-in-a-box scenario! On the other hand, a very small box implies that each electron is screening the nuclear potential at almost all times. In effect, it’s as if you are modelling a H- ion rather than an He atom!

As yet another consideration: The policy for choosing the depth of the potential energy matters. A concrete example might help.

Consider a 1D domain of, say, 5 a.u. Divide it using 6 nodes (so that \Delta x = 1 a.u.). Put a proton at the origin, and compute the electron’s PE. At the distance of 5 a.u., the PE is 1.0/5.0 = 0.2 a.u. At the node right next to the singularity, the PE is 1 a.u. What finite value should you give to the PE at the nucleus? Suppose, following the half cell-side rule, you give it the value of 1.0/0.5 = 2 a.u. OK.

Now refine the mesh, say by having 11 nodes going over the same physical distance (so that \Delta x = 0.5 a.u.). The physically extreme node retains the same value, viz. 0.2 a.u. But the node next to the singularity now has a PE of 1.0/0.5 = 2 a.u., and the half cell-side rule now gives a value of 1.0/0.25 = 4.0 a.u. at the nucleus.

If you plot the two curves using the same scale, the difference is especially striking. In short, mesh refinement alone (keeping the same domain size) has resulted in keeping the same PE at the boundary but jacking up the PE at the nucleus’ position. Not only that, but the PE field now has a more pronounced curvature over the same physical distance. Eigenvalue problems are markedly sensitive to the curvature in the PE.
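Here is the same example in code; note how the deepest PE value doubles on refinement while the boundary value stays put:

```python
import numpy as np
import matplotlib.pyplot as plt

def hard_pe(n_nodes, length=5.0):
    """Magnitude of the hard Coulomb PE on a uniform grid over [0, length],
    with the proton at the origin and the half cell-side rule at the singular node."""
    x = np.linspace(0.0, length, n_nodes)
    dx = x[1] - x[0]
    return x, 1.0 / np.maximum(x, dx / 2.0)

x_coarse, V_coarse = hard_pe(6)    # dx = 1.0 a.u. -> capped at 1/0.5  = 2 a.u.
x_fine, V_fine = hard_pe(11)       # dx = 0.5 a.u. -> capped at 1/0.25 = 4 a.u.

plt.plot(x_coarse, V_coarse, 'o-', label='6 nodes (dx = 1.0 a.u.)')
plt.plot(x_fine, V_fine, 's-', label='11 nodes (dx = 0.5 a.u.)')
plt.xlabel('x (a.u.)'); plt.ylabel('|V| (a.u.)'); plt.legend()
plt.show()
```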

Now, realize that tweaking this one parameter alone can make the simulation zoom on to almost any value you like (within a reasonable range). I can always choose this parameter in such a way that even a relatively crude model could come to reproduce the experimental value of -2.9 a.u. very accurately—for energy. The wavefunction may remain markedly jagged. But the energy can be accurate.

Every computational engineer/scientist understands such matters, especially those who work with singularities in fields. For instance, all computational mechanical engineers know how the stress values can change by an order of magnitude or more, depending on how you handle the stress concentrators. Singularities form a hard problem of computational science & engineering.

That’s why, what matters in computational work is not only the final number you produce. What matters perhaps even more are such things as: whether the method works well in terms of stability; trends in the accuracy values (rather than their absolute values); whether the method can theoretically accommodate some more advanced techniques easily or not; how it scales with the size of the domain and mesh refinement; etc.

If a method does fine on such counts, then the sheer accuracy number by itself does not matter so much. We can still say, with reasonable certainty, that the very theory behind the model must be correct.

And I think that’s what yesterday’s result points to. It seems to say that my theory works.


9. To wind up…

Despite all my doubts, I always thought that my approach is going to work out, and now I know that it does—nay, it must!

The 3-dimensional \Psi fields can actually be seen to be pushing the particles, and the trends in the numerical results are such that the dynamical assumptions I introduced, for calculating the motions of the particles, must be correct too. (Another reason for having confidence in the numerical results is that the dynamical assumptions are very simple, and so it’s easy to think through how they move the particles!) At the same time, though I didn’t implement it, I can easily see that the anti-symmetry property for at least a 2-particle system definitely comes out directly. The physical fields are 3-dimensional, and the configuration space comes out as a mathematical abstraction from them. I didn’t specifically implement any program to show detection probabilities, but I can see that they are going to come out right—at least for 2-particle systems.

So, the theory works, and that matters.

Of course, I will still have quite some work to do. Working out the remaining aspects of spin, for one thing. A system of three interacting particles would also be nice to work through and to simulate. However, I don’t know which system I could/should pick up. So, if you have any suggestions for simulating a 3-particle system, with some well-known results, then do let me know. Yes, there are chances that I might still need to tweak the theory a little bit here and a little bit there. But the basic backbone of the theory, I am now quite confident, is going to stand as is.

OK. One last point:

The physical fields of \Psi, over the physical 3-dimensional space, have primacy. Due to the normalization constraint, in real systems, there are no Dirac’s delta-like singularities in these wavefunctions. The singularities of the Coulomb field do enter the theory, but only as devices of calculations. Ontologically, they don’t have a primacy. So, what primarily exist are the aetherial, complex-valued, wavefunctions. It’s just that they interact with each other in such a way that the result is as if the higher-level V term were to have a singularity in it. Indeed, what exists is only a single 3-dimensional wavefunction; it is us who decompose it variously for calculational purposes.

That’s the ontological picture which seems to be emerging. However, take this point with a pinch of salt; I still haven’t pursued threads like these; I have been too busy just implementing code, debugging it, and finding and comparing results. …


Enough. I will start writing the theory document some time in the second half of the next week, and will try to complete it by mid-February. Then, everything will become clear to you. The cleaned-up and reorganized Python scripts will also be provided at that time. For now, I just need a little break. [BTW, if in my …err… “exuberance” online last night I have offended someone, my apologies…]

For obvious reasons, I think that I will not be blogging for at least two weeks…. Take care, and bye for now.


A song I like:

(Western, pop): “Lay all your love on me”
Band: ABBA

[A favourite since my COEP (UG) times. I think I looked up the words only last night! They don’t matter anyway. Not for this song, and not to me. I like its other attributes: the tune, the orchestration, the singing, and the sound processing.]


History:
— 2021.01.14 21:01 IST: Originally published
— 2021.01.15 16:17 IST: Very few, minor, changes overall. Notably, I had forgotten to type the powers of the terms in the illustrative polynomial for the trial function (in the section on variational methods), and now corrected it.