There’s something wrong about the diffusion equation—but what exactly is it?

As promised last time, let me try to give you a “layman’s” version of the trouble about the diffusion equation.

1. Physical Situations Involving Diffusion

First of all, we need some good physical situations that illustrate the phenomenon of diffusion, in particular, the simplest linear 1D diffusion equation:

\alpha \dfrac{\partial^2 u}{\partial x^2} = \dfrac{\partial u}{\partial t}

Here is a list of such models:

  • Think of a long, metal railing, which has got cold on a winter morning. [I said winter, and not December. No special treatment for Aussies and others from the southern hemisphere.] Heat the mid-point of the railing using a candle or a soldering iron. The heat propagates in the rod, increasing temperatures at various points, which can be measured using thermocouples. Ignoring higher-order (wave/shock) effects, the conduction of heat can be taken to follow the above-mentioned simple diffusion equation.
  • Think of a container having two compartments separated by a wall which carries a small hole. The entire container is filled with air (say, 1 atm pressure at 25 degrees Celsius), and then, an electromechanical shutter closes down the hole in the internal wall. Then, place an opened bottle of scent in one of the compartments, say, the one on the left-hand side. Allow for some time to elapse so that the scent spreads practically evenly everywhere in that compartment. (If you imagine having a fan in that compartment, you must also imagine it being switched off and the air-flow coming to a standstill on the macro-scale.) Now, open the internal hole, and sense the strength of the scent at various points in the right-hand side compartment, at regular time-intervals. [I was being extra careful in writing this model, because the diffusion here can be directly modelled using the kinetic theory of gases.]
  • Take a kitchen sponge of fine porosity, and dip it into a bucket of water, thus letting it fully soak in the water. Now, keep the sponge on a table. Take a flat piece of transparent glass, and place it vertically next to the sponge, touching it gently. Then, place a drop of ink at a point on the top surface of the sponge, right next to the glass. Observe the flow of ink through the sponge.

Even if this post is meant for “layman” engineers/physicists who have already studied this topic, I deliberately started with concrete physical examples. It helps freshen up the physical thinking better, and thereby, helps ground the mathematical thinking better. (I always believe that by way of logical hierarchy, the physical thought comes before the mathematical thought does. Before you can measure something, you have to know what it is that you are measuring; the what precedes the how.)

2. Mathematical Techniques Available to Solve the Diffusion Equation

Now, on to the mathematical techniques available to solve the above-mentioned diffusion equation. Here is a fairly comprehensive (even if perhaps not exhaustive) list of the usual techniques:

  • Spectral:
    • Analytical: The classical Fourier theory. Expand the initial condition in terms of a Fourier series (or, for an infinitely extended domain, a Fourier integral), and find the time evolution using separation of variables.
    • Numerical: Discretize the domain and the initial condition, and also the time dimension. Use FFT to numerically compute the Fourier evolution. (If you are smart: chuck out the FFT implementation you wrote by yourself, and start using FFTW.)
  • Usual Numerical Methods:
    • FEM: Weak formulation.
    • FVM: Flux-conservation formulation.
    • FDM: Based on the Taylor series expansion. For a 1D structured (uniform) grid, it produces essentially the same system as FEM with linear elements.
  • The “Unusual” Numerical Method (the Local Finite Differences): Discretize the time-axis using the Taylor series expansion (as in FDM). On the space side, it’s slightly different from FDM. Check out p. 15 of Ref [1]. Practically speaking, almost no one models the diffusion equation this way. However, we include it at this place to provide a neat progression in the nature of the techniques. If it helps, note that this technique essentially works as a CA (cellular automaton).
  • The Stochastic Methods
    • Brownian movement: By which, I mean, Einstein’s analysis of it; Ref. [3]. BTW, the original paper is surprisingly easy to understand. In fact, even the best textbook expositions of it (e.g. Huang’s Statistical Physics book) tend to drop a crucial point noted in the original paper. (In fact, even Einstein himself didn’t pay any further attention to it, right in the same paper. It was easier to spot it in the original paper. More on this, below, or later.)
    • The random walk (RW)/Monte Carlo (MC)/Numerical Brownian movement. For our limited purposes (focusing on the simple and the basic things), the three amount to one and the same thing.

The Solution Techniques and the Issue of the Instantaneous Action at a Distance

Now go over the list again, this time figuring out on which side each technique falls, what basic premise it (implicitly or explicitly) assumes: does it fall on the side of a compact support for the solution, or not. Here is the quick run-down:

All the spectral methods and all the usual numerical methods involve solution support extended over the entire domain (finite or infinite). The unusual numerical method of local finite differences involves a compact support. The traditional analysis of Brownian movement is confused about the issue. In contrast, what the numerical techniques of random walk/MC implement is a compact support.
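To make the compact-support claim concrete, here is a minimal sketch of a purely local update. It assumes the scheme is plain nearest-neighbour averaging, u(t + \tau, x) = [u(t, x - h) + u(t, x + h)]/2; this form, the grid size, and the function names are my assumptions for illustration, not a quotation of the scheme on p. 15 of Ref [1].

```python
def local_step(u):
    """One time step of nearest-neighbour averaging on a 1D grid.

    Assumed update (not quoted from Ref [1]):
        u_new[i] = (u[i-1] + u[i+1]) / 2
    Cells outside the grid are treated as zero.
    """
    n = len(u)
    new = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        new[i] = 0.5 * (left + right)
    return new


def support_half_width(u, centre):
    """Largest distance (in cells) from `centre` at which u is nonzero."""
    return max((abs(i - centre) for i, v in enumerate(u) if v != 0.0), default=0)


n_cells = 101
centre = n_cells // 2
u = [0.0] * n_cells
u[centre] = 1.0  # unit pulse at the mid-point

widths = []
for step in range(1, 11):
    u = local_step(u)
    widths.append(support_half_width(u, centre))

print(widths)   # half-width of the nonzero region after each step
print(sum(u))   # total amount of the diffusing quantity
```

The nonzero region widens by exactly one cell per time step, so the disturbance travels at the finite speed h/\tau, while the total amount stays conserved.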

Go over the list again, and make sure you are comfortable with the characterizations. You should be. Except for my assertion that the traditional analysis of Brownian movement is confused.

To explain the confusion, we have to go to Ref. [1] again.

In Ref. [1], on p. 16, the author states that:

“… A simple argument shows that if h^2/\tau \rightarrow 0 or +\infty, x may approach +\infty in finite time, which is physically untenable.”

However, in the same Ref [1], on p. 2 in fact, the author has already stated that:

“It is easily verified that u(t,x) = \dfrac{1}{(2 \pi t)^{d/2}} \exp(-\dfrac{|x|^2}{2 t}) satisfies [the above-mentioned diffusion equation.]”

Here, the author does not provide commentary on the nature of the solution, as far as the issue of instantaneous action at a distance (IAD) is concerned.

For a commentary on the nature of solution, we here make reference to [2], which, on p. 46, simply declares (without a prior or later discussion of the logical antecedents or context, let alone a proof for the declaration in question) that the function \dfrac{1}{(4\pi t)^{n/2}} e^{- \dfrac{|x|^2}{4 t}} (where n is the dimensionality of space) is the fundamental solution to the diffusion equation; and then, on p. 56, goes on to invoke the strong maximum principle to assert infinite speed of propagation. That is contradictory to the above-quoted passage in Ref [1], of course; but notice that the solution being quoted is the same (up to a rescaling of t).
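A quick numerical check of the two quoted formulas, in 1D, may help here. It is only a sketch: the sample point, the step size h, and the function names are my own choices. It estimates the residual u_t - \alpha u_xx by central differences, which suggests that the form in Ref [1] goes with \alpha = 1/2 and the form in Ref [2] with \alpha = 1, and it evaluates the kernel far from the origin at a small time.

```python
import math


def kernel_ref1(x, t):
    """1D form of the solution quoted from Ref [1]: (2 pi t)^(-1/2) exp(-x^2 / 2t)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def kernel_ref2(x, t):
    """1D form of the solution quoted from Ref [2]: (4 pi t)^(-1/2) exp(-x^2 / 4t)."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def pde_residual(u, alpha, x, t, h=1e-3):
    """Central-difference estimate of u_t - alpha * u_xx at the point (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - alpha * u_xx


res_ref1 = pde_residual(kernel_ref1, 0.5, 0.7, 1.0)        # alpha = 1/2
res_ref2 = pde_residual(kernel_ref2, 1.0, 0.7, 1.0)        # alpha = 1
res_mismatch = pde_residual(kernel_ref1, 1.0, 0.7, 1.0)    # wrong alpha on purpose
far_value = kernel_ref2(5.0, 0.1)                          # far away, early time

print(res_ref1, res_ref2)   # both close to zero
print(res_mismatch)         # clearly not zero
print(far_value)            # tiny, but strictly positive
```

The two formulas are the same function with the time variable rescaled: kernel_ref1(x, t) equals kernel_ref2(x, t/2). The strictly positive far-away value at an early time is the infinite speed of propagation in numerical form.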

BTW, the strong maximum principle suspiciously looks as if its native place is the harmonic analysis (which is just another [mathematicians’] name for the Fourier theory). And, this turns out to be true. [^]

So, back to square one. Nice circularity: You first begin with spectral decomposition that first posits domain-wide support for each eigenfunction; you then multiply each eigenfunction by its time-decay term and add the products together so as to get the time evolution predicted by the separation of variables in the diffusion process; and then, somewhere down the line, you allow yourself to be wonder-struck; you declare: wow! There is action at a distance in the diffusion equation, after-all!
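The recipe in the paragraph above can be sketched in a few lines. The set-up is assumed purely for illustration: a rod of unit length with both ends held at zero, \alpha = 1, an initial unit-height box on [0.45, 0.55], and a truncation at 400 sine modes. None of these numbers come from the references.

```python
import math


def sine_coefficients(a, b, length, n_modes):
    """Sine-series coefficients of a unit-height box on [a, b] inside [0, length].

    b_k = (2 / length) * integral_a^b sin(k pi x / length) dx
    """
    coeffs = []
    for k in range(1, n_modes + 1):
        w = k * math.pi / length
        coeffs.append(2.0 / (k * math.pi) * (math.cos(w * a) - math.cos(w * b)))
    return coeffs


def spectral_solution(x, t, coeffs, length, alpha):
    """Sum of eigenfunctions, each multiplied by its own time-decay factor."""
    total = 0.0
    for k, b_k in enumerate(coeffs, start=1):
        w = k * math.pi / length
        total += b_k * math.exp(-alpha * w * w * t) * math.sin(w * x)
    return total


length, alpha = 1.0, 1.0
coeffs = sine_coefficients(0.45, 0.55, length, n_modes=400)

t_small = 0.005
centre_value = spectral_solution(0.5, t_small, coeffs, length, alpha)
far_value = spectral_solution(0.9, t_small, coeffs, length, alpha)

print(centre_value)  # the box has started to flatten
print(far_value)     # already positive at x = 0.9, well outside the initial box
```

Every sine mode is nonzero across the whole rod, so once the modes decay at different rates their cancellation outside the box is lost at every point at once. The domain-wide support was put in at the first step.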

Ok, that’s not a confusion, you might say. It’s just a feature of the Fourier theory. But where is the confusion concerning the Brownian movement which you promised us, you might want to ask at this point.

The Traditional Analysis of the Brownian Movement as Confused w.r.t. IAD

Well, the confusion concerning the Brownian movement is this:

Refer to Einstein’s 1905 paper. In section 4 (“On the irregular movement…”) he says this much:

“Suppose there are altogether n particles suspended in a liquid. In an interval of time \tau the x-Co-ordinates of the single particles will increase by \Delta, where \Delta has a different value (positive or negative) for each particle. For the value of \Delta a certain probability-law will hold; the number dn of the particles which experience in the time interval \tau a displacement which lies between \Delta and \Delta + d\Delta, will be expressed by an equation of the form

dn = n \phi(\Delta) d\Delta

where

\int_{-\infty}^{+\infty} \phi(\Delta) d\Delta = 1

and \phi only differs from zero for very small values of \Delta and fulfils the condition

\phi(\Delta) = \phi( - \Delta) .

We will investigate now how the coefficient of diffusion depends on \phi, confining ourselves again to the case when the number \nu of the particles per unit volume is dependent only on x and t.

Putting for the particles per unit volume \nu = f(x, t), we will calculate the distribution of the particles at a time t + \tau from the distribution at the time t. From the definition of the function \phi(\Delta), there is easily obtained the number of the particles which are located at the time t + \tau between two planes perpendicular to the x-axis, with abscissae x and x + dx. We get

f(x, t + \tau) dx = dx \cdot \int_{\Delta = -\infty}^{\Delta = +\infty} f(x + \Delta) \phi(\Delta) d\Delta


[… we get …]

\dfrac{\partial f}{\partial t} = D \dfrac{\partial^2 f}{\partial x^2} (I)

This is the well known differential equation for diffusion…”

[Bold emphasis mine.]
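For readers who want the step elided in the quotation spelled out, here is the standard expansion, written in the symbols of the quoted passage. Take it as a sketch of the argument, assuming \tau and \Delta are small enough for the truncated Taylor series to hold.

```latex
% Expand the left-hand side in \tau and the integrand in \Delta:
f(x, t + \tau) \approx f(x, t) + \tau \dfrac{\partial f}{\partial t}
f(x + \Delta, t) \approx f(x, t) + \Delta \dfrac{\partial f}{\partial x} + \dfrac{\Delta^2}{2} \dfrac{\partial^2 f}{\partial x^2}

% Integrate against \phi. The normalization \int \phi \, d\Delta = 1 returns f itself,
% and the symmetry \phi(\Delta) = \phi(-\Delta) kills the term linear in \Delta:
f + \tau \dfrac{\partial f}{\partial t} = f + \dfrac{\partial^2 f}{\partial x^2} \int_{-\infty}^{+\infty} \dfrac{\Delta^2}{2} \phi(\Delta) d\Delta

% This is equation (I), with the diffusion coefficient identified as
D = \dfrac{1}{\tau} \int_{-\infty}^{+\infty} \dfrac{\Delta^2}{2} \phi(\Delta) d\Delta
```

Notice that only the second moment of \phi enters D. The derivation never needs \phi to be nonzero at large \Delta.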

In the same paper, Einstein then goes on to say the following:

“Another important consideration can be related to this method of development. We have assumed that the single particles are all referred to the same Co-ordinate system. But this is unnecessary, since the movements of the single particles are mutually independent. We will now refer the motion of each particle to a Co-ordinate system whose origin coincides at the time t = 0 with the position of the centre of gravity of the particles in question; with this difference, that f(x, t)dx now gives the number of the particles whose x Co-ordinate has increased between the time t = 0 and the time t = t, by a quantity which lies between x and x + dx. In this case also the function f must satisfy, in its changes, the equation (I). Further, we must evidently have for x \gtrless 0 and t = 0,

f(x,t) = 0 and \int_{-\infty}^{+\infty} f(x,t) dx = n .

The problem, which accords with the problem of the diffusion outwards from a point (ignoring possibilities of exchange between the diffusing particles) is now mathematically completely defined [his Ref 9]; the solution is:

f(x,t) = \dfrac{n}{\sqrt{4 \pi D}} \dfrac{e^{-\frac{x^2}{4Dt}}}{\sqrt{t}}

The probable distribution of the resulting displacements in a given time t is therefore the same as that of fortuitous error, which was to be expected.”

[Bold emphasis mine]
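The condition \int f dx = n fixes the prefactor of a Gaussian of this shape at n/\sqrt{4 \pi D t}. A small numerical check of that, as a sketch with made-up values for n, D and t and a hand-rolled trapezoid rule:

```python
import math


def einstein_solution(x, t, n, D):
    """Point-source solution: n / sqrt(4 pi D t) * exp(-x^2 / (4 D t))."""
    return n / math.sqrt(4.0 * math.pi * D * t) * math.exp(-x * x / (4.0 * D * t))


def trapezoid(func, lo, hi, steps):
    """Composite trapezoid rule for func on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h


# Illustrative values only; they are not taken from Einstein's paper.
n_particles, D, t = 1000.0, 0.5, 2.0
sigma = math.sqrt(2.0 * D * t)
lo, hi = -12.0 * sigma, 12.0 * sigma

count = trapezoid(lambda x: einstein_solution(x, t, n_particles, D), lo, hi, 20000)
mean_square = trapezoid(
    lambda x: x * x * einstein_solution(x, t, n_particles, D), lo, hi, 20000
) / count

print(count)        # recovers n
print(mean_square)  # recovers 2 D t
```

The integral returns the particle count, and the mean square displacement comes out as 2Dt.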

Contrast the bold portions in the above two passages from Einstein’s paper. Both the passages come from the same section within the paper! The first passage assumes a probability distribution function (PDF) that has compact support, and proceeds, correctly, to derive the diffusion equation. The second passage reiterates that the PDF must obey the same diffusion equation, but proceeds to quote a “known” solution that has x spread all over an infinite domain, thereby simply repeating the error. … To come so close to the truth, and then to lose it all!

Well, you can say: “Wait a minute! He changed the meaning of f somewhere along, didn’t he?”

You are right. He did. In the first passage, f referred to the PDF of particle density at various locations x; in the second passage, it refers to the PDF of particles undergoing various amounts of displacement from their current positions. The difference hardly matters. In either case, if you do not qualify the x variable in any way, and in fact quote the earlier, infinite-domain result for the diffusion, you implicitly adopt the position that the PDF is extended to \infty. You thereby end up getting IAD (instantaneous action at a distance) back into the game.

This back-and-forth jumping of positions concerning compactness of support (or IAD) is exactly what Ref [1] also engages in, as we saw above. The difference is that, once in the stochastic context, Ref [1] is at least explicit in identifying the infinite speed of propagation and denying it physical tenability. Even so, by admitting the classical solution, it must make an inadvertent jump back to the IAD game!

In contrast, the issue is very clear to see in the case of the numerical methods, even if no one discusses IAD in their contexts! The most spectacular failure of the successive authors, IMO, is their failure to distinguish between the local finite differences and the usual FDM. If you grasp this part, everything else becomes much easier to follow. After all, you get the random walk simply out of randomizing the same local-propagational process which is finitely discretized in the local finite differences technique. The difference between RW/MC on the one hand and FDM/FEM/FVM on the other is not just the existence or otherwise of randomness; it is also the compactness of the solution support. … I wish I had the time (or at least the inclination) to implement both these techniques and illustrate the time evolution via some nice graphics. For the time being at least, the matter is left to the reader’s imagination and/or implementation. Here, let me touch on one last point.
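Before that last point, here is a possible starting place for such an implementation: a minimal sketch, without the graphics. It assumes the simplest symmetric walk (one lattice step left or right per time step, each with probability 1/2) and compares it with deterministic nearest-neighbour averaging. The walker count, step count, seed and function names are all arbitrary choices of mine.

```python
import random


def random_walk_positions(n_walkers, n_steps, rng):
    """Final lattice positions of independent symmetric +/-1 random walkers."""
    positions = []
    for _ in range(n_walkers):
        pos = 0
        for _ in range(n_steps):
            pos += rng.choice((-1, 1))
        positions.append(pos)
    return positions


def averaging_distribution(n_steps):
    """Site occupation after n_steps of nearest-neighbour averaging.

    Each step sends half of every site's content to each of its two neighbours.
    """
    probs = {0: 1.0}
    for _ in range(n_steps):
        new = {}
        for site, p in probs.items():
            new[site - 1] = new.get(site - 1, 0.0) + 0.5 * p
            new[site + 1] = new.get(site + 1, 0.0) + 0.5 * p
        probs = new
    return probs


rng = random.Random(12345)  # fixed seed so that runs are repeatable
n_walkers, n_steps = 20000, 20

positions = random_walk_positions(n_walkers, n_steps, rng)
probs = averaging_distribution(n_steps)

max_reach = max(abs(p) for p in positions)
sample_var = sum(p * p for p in positions) / n_walkers
exact_var = sum(site * site * p for site, p in probs.items())
frac_at_origin = positions.count(0) / n_walkers

print(max_reach, max(abs(s) for s in probs))  # neither goes beyond n_steps sites
print(sample_var, exact_var)                  # both near n_steps
print(frac_at_origin, probs[0])               # histogram follows the averaging result
```

No walker, and no part of the averaged profile, gets farther than n_steps sites from the origin, and the variance still grows linearly with the number of steps, as diffusion requires. With lattice spacing h and time step \tau, that corresponds to D = h^2/(2\tau).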

Why This Kolaveri Confusion, Di?

Mathematicians are not idiots. [LOL!] If so, what could possibly be the reason why a matter this “simple” has not been “caught” or “got” or highlighted by any single mathematician so far, or by a mathematical physicist, for that matter? Why do people adopt one mind-set, capable of denying IAD, when in the stochastic realm, and immediately afterwards adopt another mind-set that explicitly admits IAD? Why? Any clue? Can you figure out the reason why? Give it your honest try. As to me, I think I know the answer; at least, I do have a clue which looks pretty decent to me. … And, as indicated above, the answer is not in the nature of the change of mind-set when people approach the problems they regard as “deterministic” vs. the problems they regard as “probabilistic” or “stochastic.” It’s not that…. It’s something different.

Do give it a try, but I also think that it will probably be hard for you to get to the same answer as mine. (Even though, I also think that you will accept my answer as a valid one, when you get to know it.) And the reason why it will be hard for you—or at least it will be so for most people—is that most people don’t think that physics precedes mathematics, but I do. If only you can change that hierarchy, the path to the answer will become much much easier. A whole lot easier.

That precisely is the reason why I included the very first section in this post. It doesn’t just sit there without any purpose. It’s there to help give you a context. Mathematics requires physics for its context. Anyway, I don’t want to overstretch this point. It’s not very important.

What is important is knowing for a fact that two classes of theories, both speaking about the same mathematical equation (one that has been studied for a couple of centuries), have completely different things to say when it comes to an issue as important as IAD.

Important, that is, from the quantum mechanics viewpoint. After all, check out p. 4 of Ref [2]. It lists the Schrödinger equation right after the diffusion equation. And while at that page, notice also the similarities and differences between the two equations, stripped down (i.e. suitably scaled and specialized) to their bare essences.

Resolving the riddles of the quantum entanglement is as close as the Schrödinger equation is to the heat equation—and then as close as resolving the confusions concerning IAD is, in the context of the diffusion equation.

… We must know why physicists and mathematicians have noted the two faces of the diffusion equation, but have remained confused about it. … Think about it.

Maybe another post on this entire topic, some time later, probably sooner than later, giving you my answer to the above question. In the meanwhile, remember to let me know if you can give any additional information/answer to my Maths StackExchange question on this topic [^]. BTW, thanks are due to “Pavel M” at that forum, for pointing out Evans’ book to me. I didn’t know about it, and it seems a good reference to quote.


[1] Varadhan, S. R. S. (1989) “Lectures on Diffusion Problems and Partial Differential Equations,” Notes taken by Pl. Muthuramalingam and Tara R. Nanda, TIFR, Springer-Verlag.
[2] Evans, Lawrence C. (2010) “Partial Differential Equations, 2/e,” Graduate Studies in Mathematics, v. 19, American Mathematical Society
[3] Einstein, A. (1905) “On the movement of small particles suspended in stationary liquids required by the molecular-kinetic theory of heat,” Annalen der Physik, v. 17, pp. 549–560. [English translation (1956) by A. D. Cowper, in “Investigations on the Theory of Brownian Movement,” Dover]

* * * * *   * * * * *   * * * * *

[This post sure would do with a couple of edits in the near future, though the essential ideas are all already there. TBD: I have to think whether to add the “Song I Like” section or not. Sometime later. It already is almost 2700 words, with many latex equations. … Maybe tomorrow or the day after…]


4 thoughts on “There’s something wrong about the diffusion equation—but what exactly is it?

  1. a_scientist, regarding your answer “Wave equation: local disturbance propagates at finite speed; diffusion: local disturbance gets smeared out”: Could one say that the maximum of the diffusion is the average/mean, and that with time the mean decreases as the standard deviation increases? (Assuming that the boundary temperature/concentration is constant.) It is interesting that, since there is no derivative with respect to time in the diffusion equation, it must act on all space (including infinity) instantaneously. Could one say that when there is a derivative with respect to time, energy is stored and released as time progresses?

