BTW, remember: (i) this blog is in copyright, (ii) your feedback is welcome.
A song I like:
(Hindi) “agar main kahoon”
Lyrics: Javed Akhtar
Singers: Alka Yagnik, Udit Narayan
Links, and my comments:
The “pride of place” for this post goes to a link to this book:
Norsen, Travis (2017) “Foundations of Quantum Mechanics: An Exploration of the Physical Meaning of Quantum Theory,” Springer
This book is (i) the best supplementary book for a self-study of QM, and simultaneously, also (ii) the best text-book for a supplementary course on QM, both at the better-prepared UG / beginning PG level.
It is a bit expensive, but an extensive preview is available on Google Books, here [^]. (I plan to buy it once I land a job.)
I was interested in the material from the first three chapters only, more or less. It was a delight even just browsing through these chapters. I intend to read it more carefully soon enough. But even on the first, rapid browsing, I noticed that several pieces of understanding that I had so painstakingly come to develop (over a period of years) are given quite straight-forwardly here, as if they were a matter of well known facts—even if other QM text-books only cursorily mention them, if at all.
For instance, see the explanation of entanglement here. Norsen begins by identifying that there is a single wavefunction, always—even for a multi-particle system. Then after some explanation, he states: “But, as usual in quantum mechanics, these states do not exhaust the possibilities—instead, they merely form a basis for the space of all possible wave functions. …”… Note the emphasis on the word “basis” which Norsen helpfully puts.
Putting this point (which Norsen discusses with a concrete example), but in my words: There is always a single wavefunction, and for a multi-particle system, its basis is bigger; it consists of the components of the tensor product (formed from the components of the basis of the constituent systems). Sometimes, the single wavefunction for the multi-particle system can be expressed as a result of a single tensor-product (in which case it’s a separable state), and at all other times, only as an algebraic sum of the results of many such tensor-products (in which case they all are entangled states).
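To make the separable-vs-entangled distinction concrete, here is a minimal sketch for the simplest case of two two-level systems. It is my own illustration, not Norsen's example: the helper names `tensor` and `is_separable_two_qubit` are made up for this sketch, and the determinant test used here applies only to pure states of two two-level systems.

```python
from itertools import product

def tensor(u, v):
    """Tensor (Kronecker) product of two state vectors given as lists of amplitudes."""
    return [a * b for a, b in product(u, v)]

def is_separable_two_qubit(psi, tol=1e-12):
    """A pure state (a00, a01, a10, a11) is a single tensor product exactly
    when the 2x2 matrix of its amplitudes has zero determinant."""
    a00, a01, a10, a11 = psi
    return abs(a00 * a11 - a01 * a10) < tol

up, down = [1.0, 0.0], [0.0, 1.0]
s = 2 ** -0.5

# Basis of the composite system: tensor products of the constituent basis states.
basis = [tensor(a, b) for a in (up, down) for b in (up, down)]

# A single tensor product: a separable state.
product_state = tensor(up, [s, s])

# A sum of two basis states, (|00> + |11>)/sqrt(2): an entangled state.
bell_state = [s * (x + y) for x, y in zip(basis[0], basis[3])]

print(is_separable_two_qubit(product_state))
print(is_separable_two_qubit(bell_state))
```

Note that both states live in the same four-dimensional space spanned by `basis`; the only difference is whether the single composite state can be factored.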
Notice how there is no false start of going from two separate systems, and then attempting to forge a single system out of them. Notice how, therefore, there is no hand-waving at one electron being in one galaxy, and another electron in another galaxy, and so on, as if to apologize for the very idea of the separable states. Norsen achieves the correct effect by beginning on the right note: the emphasis on the single wavefunction for the system as a whole to begin with, and then clarifying, at the right place, that what the tensor product gives you is only the basis set for the composite wavefunction.
There are many neat passages like this in the text.
I was about to say that Norsen’s book is the Resnick and Halliday of QM, but then came to hesitate saying so, because I noticed something odd even if my browsing of the book was rapid and brief.
Then I ran into
Ian Durham’s review of Norsen’s book, at the FQXi blog,
which is our link # 2 for this post [^].
Durham helpfully brings out the following two points (which I then verified during a second visit to Norsen’s book): (i) Norsen’s book is not exactly at the UG level, and (ii) the book is a bit partial to Bell’s characterization of the quantum riddles as well as to the Bohmian approach for their resolution.
The second point—viz., Norsen’s fascination for / inclination towards Bell and Bohm (B&B for short)—becomes important only because the book is, otherwise, so good: it carries so many points that are not even passingly mentioned in other QM books, is well written (in a conversational style, as if a speech-to-text translator were skillfully employed), easy to understand, thorough, and overall (though I haven’t read even 25% of it, from whatever I have browsed), it otherwise seems fairly well balanced.
It is precisely because of these virtues that you might come out giving more weightage to the B&B company than is actually due to them.
Keep that warning somewhere at the back of your mind, but do go through the book anyway. It’s excellent.
At Amazon, it has got 5 reader reviews, all with 5 stars. If I were to bother doing a review there, I too perhaps would give it 5 stars—despite its shortcomings/weaknesses. OK. At least 4 stars. But mostly 5 though. … I am in an indeterminate state of their superposition.
… But mark my words. This book will have come to shape (or at least to influence) every good exposition of (i.e. introduction to) the area of the Foundations of QM, in the years to come. [I say that, because I honestly don’t expect a better book on this topic to arrive on the scene all that soon.]
Which brings us to someone who wouldn’t assign the stars to this book. Namely, Lubos Motl.
If Norsen has moved in the Objectivist circles, and is partial to the B&B company, Motl has worked in the string theory, and is not just partial to it but even today defends it very vigorously—and oddly enough, also looks at that “supersymmetric world from a conservative viewpoint.” More relevant to us: Motl is not partial to the Copenhagen interpretation; he is all the way into it. … Anyway, being merely partial is something you wouldn’t expect from Motl, would you?
But, of course, Motl also has a very strong grasp of QM, and he displays it well (even powerfully) when he writes a post of the title:
“Postulates of quantum mechanics almost directly follow from experiments.” [^]
Err… Why “almost,” Lubos? 🙂
… Anyway, go through Motl’s post, even if you don’t like the author’s style or some of his expressions. It has a lot of educational material packed in it. Chances are, going through Motl’s posts (like the present one) will come to improve your understanding—even if you don’t share his position.
As to me: No, speaking from the new understanding which I have come to develop regarding the foundations of QM [^] and [^], I don’t think that all of Motl’s objections would carry. Even then, just for the sake of witnessing the tight weaving-in of the arguments, do go through Motl’s post.
Finally, a post at the SciAm blog:
“Coming to grips with the implications of quantum mechanics,” by Bernardo Kastrup, Henry P. Stapp, and Menas C. Kafatos, [^].
The authors say:
“… Taken together, these experiments [which validate the maths of QM] indicate that the everyday world we perceive does not exist until observed, which in turn suggests—as we shall argue in this essay—a primary role for mind in nature.”
No, it didn’t give me shivers or something. Hey, this is QM and its foundations, right? I am quite used to reading such declarations.
Except that, as I noted a few years ago on Scott Aaronson’s blog [I need to dig up and insert the link here], and then, recently, also at
Roger Schlafly’s blog [^],
you don’t need QM in order to commit the error of inserting consciousness into a physical theory. You can accomplish exactly the same thing also by using just the Newtonian particle mechanics in your philosophical arguments. Really.
Yes, I need to take that reply (at Schlafly’s blog), edit it a bit and post it as a separate entry at this blog. … Some other time.
For now, I have to run. I have to continue working on my approach so that I am able to answer the questions raised and discussed by people such as those mentioned in the links. But before that, let me jot down a general update.
A general update:
Oh, BTW, I have taken my previous QM-related post off the top spot.
That doesn’t mean anything. In particular, it doesn’t mean that after reading into materials such as that mentioned here, I have found some error in my approach or something like that. No. Not at all.
All it means is that I made it once again an ordinary post, not a sticky post. I am thinking of altering the layout of this blog, by creating a page that highlights that post, as well as some other posts.
But coming back to my approach: As a matter of fact, I have also written emails to a couple of physicists, one from IIT Bombay, and another from IISER Pune. However, things have not worked out yet—things like arranging for an informal seminar to be delivered by me to their students, or collaborating on some QM-related simulations together. (I could do the simulations on my own, but for the seminar, I would need an audience! One of them did reply, but we still have to shake our hands in the second round.)
In the meanwhile, I go jobless, but I keep myself busy. I am preparing a shortish set of write-ups / notes which could be used as a background material when (at some vague time in future) I go and talk to some students, say at IIT Bombay/IISER Pune. It won’t be comprehensive. It will be a little more than just a white-paper, but you couldn’t possibly call it even just the preliminary notes for my new approach. Such preliminary notes would come out only after I deliver a seminar or two, to physics professors + students.
At the time of delivering my proposed seminar, links like those I have given above, esp. Travis Norsen’s book, should also prove very useful.
But no, I haven’t seen something like my approach being covered anywhere so far, not even in Norsen’s book. There was a vague mention of just a preliminary part of it somewhere on Roger Schlafly’s blog several years ago, only once or so, but I can definitely say that I had already grasped even that point on my own before Schlafly’s post came. And, as far as I know, Schlafly hasn’t come to pursue that thread at all, any time later…
But speaking overall, at least as of today, I think I am the only one who has pursued this (my) line of thought to the extent I have [^].
So, there. Bye for now.
A Song I Like:
(Hindi) “suno gajar kya gaaye…”
Singer: Geeta Dutt
Music: S. D. Burman
Lyrics: Sahir Ludhianvi
[There are two Geeta’s here, and both are very fascinating: Geeta Dutt in the audio, and Geeta Bali in the video. Go watch it; even the video is recommended.]
As usual, some editing even after posting would be inevitable.
Some updates made and some streamlining done on 30 July 2018, 09:10 hrs IST.
A Special Note (added on 17th June 2018): This post is now a sticky post; it will remain, for some time, at the top of this blog.
I am likely to keep this particular post at the top of this blog, as a sticky post, for some time in the future (may be for a few months or so). So, even if posts at this blog normally appear in the reverse chronological order, any newer entries that I may post after this one would be found below this one.
[In particular, right now, I am going through a biography: “Schrodinger: Life and Thought” by Walter Moore [^]. I had bought this book way back in 2011, but had to keep it aside back then, and then, somehow, I came to forget all about it. The book surfaced during a recent move we made, and thus, I began reading it just this week. I may write a post or two about it in the near future (say within a couple of weeks or so) if something strikes me while I am at it.]
A Yawningly Long Preamble:
[Feel free to skip to the sections starting with the “Statement 1” below.]
As you know, I’ve been thinking about foundations of QM for a long, long time, a time running into decades by now.
I thought a lot about it, and then published a couple of papers during my PhD, using a new approach which I had developed. This approach was used for resolving the wave-particle duality, but only in the context of photons. However, I then got stuck when it came to extending and applying this same approach to electrons. So, I kept on browsing a lot of QM-related literature in general. Then, I ran, notably, into the Nobel laureate W. E. Lamb’s “anti-photon” paper [^], and also the related literature (use Google Scholar). I thought a lot about this paper, and also about QM. I began thinking about QM once again from scratch, so to speak.
Eventually, I came to abandon my own PhD-time approach. At the same time, with some vague but new ideas already somewhere at the back of my mind, I once again started studying QM, once again with a fresh mind, but this time around much more systematically. …
… In the process, I came to develop a completely new understanding of QM!… It’s been at least months since I began talking about it [^]. … My confidence in this new understanding has only increased, since then.
Today’s post will be based on this new understanding. (I could call it a new theory, perhaps.)
My findings suggest a few conclusions which I think I should not hold back any longer. Hence this post.
I have been trying to locate the right words for formulating my conclusions—but without much satisfaction. Finally, I’ve decided to go ahead and post an entry here anyway, regardless of whether the output comes out as being well formulated or not.
In other words, don’t try to pin me down with the specific words I use here in this post! Instead, try to understand what I am trying to get at. In still other words: the particular words I use may change, but the intended meaning will, from now on, “always” remain the same—ummm…. more or less the same!
OK, so here are the statements I am making today. I think they are well defensible:
Statement 1: It is possible to explain all quantum mechanical phenomena on the basis of those principles which are already known (or have already been developed) in the context of classical mechanics.
Informal Explanation 1.1: Statement 1 holds true. It’s just that when it comes to explaining the QM phenomena (i.e., when it comes to supplying a physical mechanism for QM), even if the principles do remain the same, the way they are to be combined and applied is different. These differences basically arise because of a reason mentioned in the next Informal Explanation.
Informal Explanation 1.2: Yes, the tradition of 80+ years, involving an illustrious string of Nobel laureates and others, is, in a way, “wrong.” The QM principles are not, fundamentally speaking, very different from those encountered in the CM. It’s just that some of the objects that QM assumes and talks about are different (only partly different) from those assumed in the CM.
Corollary 1 of Statement 1: A quantum computer could “in principle” be built as an “application layer” on top of the “OS platform” supplied by the classical mechanics.
Informal Explanation 1.C1.1: Hierarchically speaking, QM remains the most fundamental or the “ground” layer. The aspects of the physical reality that CM refers to, therefore, indeed are at a layer lying on top of QM. This part does continue to remain the same.
However, what the Corollary 1 now says is that you can also completely explain the workings of QM in terms of a virtual QM machine that is built on top of the well-known principles of CM.
If someone builds a QC on such a basis (which would be a virtual QC on top of CM), then it would be just a classical mechanically functioning simulator—an analog simulator, I should add—that simulates the QM phenomena.
Informal Explanation 1.C1.2: The phrase “in principle” does not always translate into “easily.” In this case, it is in fact quite possible that building a big enough QC of this kind (i.e. the simulating QC) may turn out to be an enterprise that is too difficult to be practically feasible.
Corollary 2 of Statement 1: A classical system can be designed in such a way that it shows all the features of the phenomenon of quantum entanglement (when the classical system is seen from an appropriately high-level viewpoint).
Informal Explanation 1.C2.1: There is nothing “inherently quantum-mechanical” about entanglement. The well-known principles of CM are enough to explain the phenomena of entanglement.
Informal Explanation 1.C2.2: We use our own terms. In particular, when we say “classical mechanics,” we do not mean these words in the same sense in which a casual reader of the QM literature, e.g. of Bell’s writings, may read them.
What we mean by “classical mechanics” is the same as what an engineer who has never studied QM proper means, when he says “classical mechanics” (i.e., the Newtonian mechanics + the Lagrangian and Hamiltonian reformulations including variational principles, as well as the more modern developments such as studies of nonlinear systems and the catastrophe theory).
Statement 2: It can be shown that even if the Corollary 1 above does hold true, the kind of quantum computer it refers to would be such that it will not be able to break a sufficiently high-end RSA encryption (such as what is used in practice today, at the high-end).
Aside 2.1: I wouldn’t have announced Statement 1 unless I was sure—absolutely goddamn sure, in fact—about the Statement 2. In fact, I must have waited for at least half a year just to make sure about this aspect, looking at these things from this PoV, then from that PoV, etc.
Statement 3: Inasmuch as the RSA-beating QC requires a controlled entanglement over thousands of qubits, it can be said, on the basis of the new understanding (the one which lies behind the Statement 1 above), that the goal of achieving even “just” the quantum supremacy seems highly improbable, at least in any foreseeable future, let alone achieving the goal of breaking the high-end RSA encryption currently in use. However, proving these points, esp. that the currently employed higher-end RSA cannot be broken, will require further development of the new theory, particularly a quantitative theory for the mechanism(s) involved in the quantum mechanical measurements.
Informal Explanation 3.1: A lot of funding has already gone into attempts to build a QC. Now, it seems that the US government, too, is considering throwing some funds at it.
The two obvious goal-posts for a proper QC are: (i) first gaining enough computational power to run past the capabilities of the classical digital computers, i.e., achieving the so-called “quantum supremacy,” and then, (ii) breaking the RSA encryption as is currently used in the real-world at the high-end.
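To fix ideas about the second goal-post, here is a toy sketch of what “breaking RSA” amounts to: recovering the private exponent by factoring the public modulus. The tiny primes and the helper `trial_factor` are illustrative assumptions on my part; the keys used in practice at the high-end have moduli that are far beyond the reach of trial division.

```python
def trial_factor(n):
    """Factor n by trial division (feasible only for toy-sized n)."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n is prime")

# Toy key pair. These numbers are for illustration only.
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)

message = 65
cipher = pow(message, e, n)

# An attacker sees only (n, e, cipher). Factoring n yields the private key.
fp, fq = trial_factor(n)
d_recovered = pow(e, -1, (fp - 1) * (fq - 1))
recovered = pow(cipher, d_recovered, n)

print(recovered == message)
```

The whole security of the scheme rests on the factoring step being infeasible for large moduli, which is why that one step is the target.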
The question of whether the QC-related researches will be able to achieve these two goals or not depends on the question of whether there are natural reasons/causes which might make it highly improbable (if not outright impossible) to achieve these two goals.
We have already mentioned that it can be shown that it will not be possible for a classical (analog) quantum simulator (of the kind we have in mind) to break the RSA encryption.
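Thus, we have already made a conclusive statement about this combination of a QC and a goal-post: a classical (analog) quantum simulator breaking the RSA encryption.
We have said that it can be shown (i.e. proved) that the above combination would be impossible to have. (The combination is that extreme.)
However, it still leaves open 3 more combinations of a QC and a goal-post: (i) a classical (analog) quantum simulator achieving quantum supremacy, (ii) a proper QC achieving quantum supremacy, and (iii) a proper QC breaking the RSA encryption.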
As of today, a conclusive statement cannot be made regarding the last three combinations, not even on the basis of my newest approach to the quantum phenomena, because the mathematical aspects which will help settle questions of this kind, have not yet been developed (by me).
Chances are good that such a theory could be developed, at least in somewhat partly-qualitative-and-partly-quantitative terms, or in terms of some quantitative models that are based on some good analogies, sometime in the future (say within a decade or so). It is only when such developments do occur that we will be able to conclusively state something one way or the other in respect of the last three combinations.
However, relying on my own judgment, I think that I can safely state this much right away: The remaining three combinations would be tough, very tough, to achieve. The last combination, in particular, is best left aside, because the combination is far too complex to pose any real threat, at least as of today. I can say this much confidently, based on my new approach. (If you have some other basis to feel confident one way or the other, kindly supply the physical mechanism for the same, please, not just “math.”)
So, as of today, the completely defensible statements are the Statement No. 1 and 2 (with all their corollaries), but not the Statement 3. However, a probabilistic judgment for the Statement 3 has also been given.
A short (say, abstract-like) version:
A physical mechanism to explain QM phenomena has been developed, at least in the bare essential terms. It may perhaps become possible to use such a knowledge to build an analog simulator of a quantum computer. Such a simulator would be a machine based only on the well-known principles of classical mechanics, and using the kind of physical objects that the classical mechanics studies.
However, it can also be easily shown that such a simulator will not be able to break the RSA encryption using an algorithm such as Shor’s. The proof rests on an idealized abstraction of classical objects (just the way the ideal fluid is an abstraction of real fluids).
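For readers who have not seen it, here is a minimal sketch of the number-theoretic skeleton of a Shor-type attack: factoring reduces to finding the order (period) of a number modulo N. In Shor's algorithm the order-finding step is the one handed over to the quantum hardware; in this sketch it is done by brute force, so nothing here is quantum, and nothing here bears on the proof mentioned above. The function names are mine.

```python
from math import gcd

def order(a, n):
    """Smallest r > 0 with a**r % n == 1, found by brute force.
    Assumes gcd(a, n) == 1; otherwise the loop would never terminate."""
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r

def factors_from_order(a, n):
    """Classical post-processing: turn the order of a mod n into factors of n.
    Returns None when this particular choice of a does not work."""
    g = gcd(a, n)
    if g != 1:
        return g, n // g          # lucky guess: a already shares a factor with n
    r = order(a, n)
    if r % 2:
        return None               # odd order: try another a
    h = pow(a, r // 2, n)
    if h == n - 1:
        return None               # trivial square root of 1: try another a
    p = gcd(h - 1, n)
    return p, n // p

print(order(7, 15))
print(factors_from_order(7, 15))
```

The brute-force `order` loop is exactly the part whose cost blows up for large N on a classical digital computer.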
On the basis of the new understanding, it becomes clear that trying to break RSA encryption using a QC proper (i.e. a computer that’s not just a simulator, but is a QC proper that directly operates at the level of the QM platform itself) would be a goal that is next to impossible to achieve. In fact, even achieving just the “quantum supremacy” (i.e., beating the best classical digital computer) itself can be anticipated, on the basis of the new understanding, as a goal that would be very tough to achieve, if at all.
Researches that attempt to build a proper QC may be able to bring about some developments in various related areas such as condensed matter physics, cryogenics, electronics, etc. But it is very highly unlikely that they would succeed in achieving the goal of quantum supremacy itself, let alone the goal of breaking the RSA encryption as it is deployed at the high-end today.
A Song I Like:
(Hindi) “dilbar jaani, chali hawaa mastaanee…”
Music: Laxmikant Pyarelal
Singers: Kishore Kumar, Lata Mangeshkar
Lyrics: Anand Bakshi
Note that, as is usual at this blog, an iterative improvement of the draft is always a possibility. Done.
This post directly continues from my last post. The content here was meant to be an update to my last post, but it grew, and so, I am noting it down as a separate post in its own right.
Thought about it [I mean my last post] a lot last night and this morning. I think here is a plan of action I can propose:
I can deliver a smallish, informally conducted, and yet, “official” sort of a seminar/talk/guest lecture, preferably at an IIT/IISER/IISc/similar institute. No honorarium is expected; just arrange for my stay. (That too is not necessary if it will be IIT Bombay; I can then stay with my friend; he is a professor in an engineering department there.)
Once arranged by mutual convenience, I will prepare some lecture notes (mostly hand-written), and deliver the content. (I guess at this stage, I will not prepare Beamer slides, though I might include some audio-visual content such as simulations etc.)
Questions will be OK, even encouraged, but the format will be that of a typical engineering class-room lecture. Discussions would be perfectly OK, but only after I finish talking about the “syllabus” first.
The talk should preferably be attended also by a couple of PhD students or so (of physics/engineering physics/any really relevant discipline, whether it’s acknowledged as such by UGC/AICTE or not). They should separately take down their notes and show me these later. This will help me understand where and how I should modify my notes. I will then myself finalize my notes, perhaps a few days after the talk, and send these by email. At that stage, I wouldn’t mind the notes getting posted on the ‘net.
Guess I will think a bit more about it, and note about my willingness to deliver the talk also at iMechanica. The bottom-line is that I am serious about this whole thing.
A few anticipated questions and their answers (or clarifications):
More, may be later. I will sure note my willingness to deliver a seminar at an IIT (or at a good University department) or so, at iMechanica also, soon enough. But right now I don’t have the time, and have to rush out. So let me stop here. Bye for now, and take care… (I would add a few more tags to the post-categories later on.)
0. I’ve been too busy in my day-job to write anything at any one of my blogs, but recently, a couple of things happened.
1. I wrote what I think is a “to read” (if not a “must read”) comment, concerning the important issue of causality, at Roger Schlafly’s blog; see here [^]. Here’s the copy-paste of the same:
1. There is a very widespread view among laymen, and unfortunately among philosophers too, that causality requires a passage of time. As just one example: In the domino effect, the fall of one domino leads to the fall of another domino only after an elapse of time.
In fact, all their examples wherever causality is operative, are of the following kind:
“If something happens then something else happens (necessarily).”
Now, they interpret the word “then” to involve a passage of time. (Then, they also go on to worry about physics equations, time symmetry, etc., but in my view all these are too advanced considerations; they are not fundamental or even very germane at the deepest philosophical level.)
2. However, it is possible to show other examples involving causality, too. These are of the following kind:
“When something happens, something else (necessarily) happens.”
Here is an example of this latter kind, one from classical mechanics. When a bat strikes a ball, two things happen at the same time: the ball deforms (undergoes a change of shape and size) and it “experiences” (i.e. undergoes) an impulse. The deformation of the ball and the impulse it experiences are causally related.
Sure, the causality here is blatantly operative in a symmetric way: you can think of the deformation as causing the impulse, or of the impulse as causing the deformation. Yet, just because the causality is symmetric here does not mean that there is no causality in such cases. And, here, the causality operates entirely without the dimension of time in any way entering into the basic analysis.
Here is another example, now from QM: When a quantum particle is measured at a point of space, its wavefunction collapses. Here, you can say that the measurement operation causes the wavefunction collapse, and you can also say that the wavefunction collapse causes (a definite) measurement. Treatments on QM are full of causal statements of both kinds.
3. There is another view, concerning causality, which is very common among laymen and philosophers, viz. that causality necessarily requires at least two separate objects. It is an erroneous view, and I have dealt with it recently in a miniseries of posts on my blog; see https://ajitjadhav.wordpress.com/2017/05/12/relating-the-one-with-the-many/.
4. Notice, the statement “when(ever) something happens, something else (always and/or necessarily) happens” is a very broad statement. It requires no special knowledge of physics. Statements of this kind fall in the province of philosophy.
If a layman is unable to think of a statement like this by way of an example of causality, it’s OK. But when professional philosophers share this ignorance too, it’s a shame.
5. Just in passing, noteworthy is Ayn Rand’s view of causality: http://aynrandlexicon.com/lexicon/causality.html. This view was basic to my development of the points in the miniseries of posts mentioned above. … May be I should convert the miniseries into a paper and send it to a foundations/philosophy journal. … What do you think? (My question is serious.)
Thanks for highlighting the issue though; it’s very deeply interesting.
3. The other thing is that the other day (the late evening of the day before yesterday, to be precise), while entering a shop, I tripped over its ill-conceived steps, and suffered a fall. Got a hairline crack in one of my toes, and also a somewhat injured knee. So, had to take off from “everything” not only on Sunday but also today. Spent today mostly relaxing, trying to recover from those couple of injuries.
This late evening, I naturally found myself recalling this song—and that’s where this post ends.
4. OK. I must add a bit. I’ve been lagging on the paper-writing front, but, don’t worry; I’ve already begun re-writing (in my pocket notebook, as usual, while awaiting my turn in the hospital’s waiting lounge) my forth-coming paper on stress and strain, right today.
OK, see you folks, bye for now, and take care of yourselves…
A Song I Like:
(Hindi) “zameen se hamen aasmaan par…”
Singers: Asha Bhosale and Mohammad Rafi
Music: Madan Mohan
Lyrics: Rajinder Krishan
Update on 18th June 2017:
See the update to the last post; I have added three more diagrams depicting the mathematical abstraction of the problem, and also added a sub-question by way of clarifying the problem a bit. Hopefully, the problem is clearer and also its connection to QM a bit more apparent, now.
Here I partly expand on the problem mentioned in my last post [^]. … Believe me, it will take more than one more post to properly expand on it.
The expansion of an expanding function refers to and therefore requires simultaneous expansions of the expansions in both the space and frequency domains.
The said expansions may be infinite [in procedure].
In the application of the calculus of variations to such a problem [i.e. like the one mentioned in the last post], the most important consideration is the very first part:
Among all the kinematically admissible configurations…
[You fill in the rest, please!]
A Song I Like:
[I shall expand on this bit a bit later on. Done, right today, within an hour.]
(Hindi) “goonji see hai, saari feezaa, jaise bajatee ho…”
Music: Shankar Ahasaan Loy
Singers: Sadhana Sargam, Udit Narayan
Lyrics: Javed Akhtar
Update on 18 June 2017:
Added three diagrams depicting the mathematical abstraction of the problem; see near the end of the post. Also added one more consideration by way of an additional question.
TL;DR: A very brief version of this post is now posted at iMechanica; see here [^].
How I happened to come to formulate this problem:
As mentioned in my last post, I had started writing down my answers to the conceptual questions from Eisberg and Resnick’s QM text. However, as soon as I began doing that (typing out my answer to the first question from the first chapter), almost predictably, something else happened.
Since it anyway was QM that I was engaged with, somehow, another issue from QM (one which I had thought about a bit some time ago) happened to surface in my mind. And it was an interesting issue. Back then, I had not thought of reaching an answer, and even now, I realized, I had no very satisfactory answer to it, not even in just conceptual terms. Naturally, my mind remained engaged in thinking about this second QM problem for a while.
In trying to come to terms with this QM problem (of my own making, not E&R’s), I now tried to think of some simple model problem from classical mechanics that might capture at least some aspects of this QM issue. Thinking a bit about it, I realized that I had not read anything about this classical mechanics problem during my [very] limited studies of the classical mechanics.
But since it appeared simple enough—heck, it was just classical mechanics—I now tried to reason through it. I thought I “got” it. But then, right the next day, I began doubting my own answer—with very good reasons.
… By now, I had no option but to keep aside the more scholarly task of writing down answers to the E&R questions. The classical problem of my own making had begun becoming all interesting by itself. Naturally, even though I was not procrastinating, I still got away from E&R—I got diverted.
I made some false starts even in the classical version of the problem, but finally, today, I could find some way through it—one which I think is satisfactory. In this post, I am going to share this classical problem. See if it interests you.
Consider an idealized string tautly held between two fixed end supports that are a distance $L$ apart; see the figure below. The string can be put into a state of vibrations by plucking it. There is a third support exactly at the middle; it can be removed at will.
Assume all the ideal conditions. For instance, assume perfectly rigid and unyielding supports, and a string that is massive (i.e., one which has a lineal mass density; for simplicity, assume this density to be constant over the entire string length) but having zero thickness. The string also is perfectly elastic and having zero internal friction of any sort. Assume that the string is surrounded by the vacuum (so that the vibrational energy of the string does not leak outside the system). Assume the absence of any other forces such as gravitational, electrical, etc. Also assume that the middle support, when it remains touching the string, does not allow any leakage of the vibrational energy from one part of the string to the other. Feel free to make further suitable assumptions as necessary.
The overall system here consists of the string (sans the supports, whose only role is to provide the necessary boundary conditions).
Initially, the string is stationary. Then, with the middle support touching the string, the left-half of the string is made to undergo oscillations by plucking it somewhere in the left-half only, and immediately releasing it. Denote the instant of the release as, say, $t_0$. After the lapse of a sufficiently long time period, assume that the left-half of the system settles down into a steady-state standing wave pattern. Given our assumptions, the right-half of the system continues to remain perfectly stationary.
The internal energy of the system at $t_0$ is $E$. Energy is put into the system only once, at $t_0$, and never again. Thus, for all times $t > t_0$, the system behaves as a thermodynamically isolated system.
For simplicity, assume that the standing waves in the left-half form the fundamental mode for that portion (i.e., for the length $L/2$, half the distance between the end supports). Denote the frequency of this fundamental mode as $f_1$, and its max. amplitude (measured from the central line) as $A_1$.
Next, at some later instant of time $t_1 > t_0$, suppose that the support in the middle is suddenly removed, taking care not to disturb the string in any way in the process. That is to say, we neither put any more energy into the system nor take any out of it, in the process of removing the middle support.
Once the support is thus removed, the waves from the left-half can now travel to the right-half, get reflected from the right end-support, travel all the way to the left end-support, get reflected there, etc. Thus, they will travel back and forth, in both the directions.
Modeled as a two-point BV/IC problem, assume that the system settles down into a steadily repeating pattern of some kind of standing waves.
The question now is:
What would be the pattern of the standing waves formed in the system at a time long after the middle support has been removed?
The theory suggests that there is no unique answer!:
Here is one obvious answer:
Since the support in the middle was exactly at the midpoint, removing it has the effect of suddenly doubling the length for the string.
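Now, simple maths of the normal modes tells you that the string can vibrate in the fundamental mode for the entire length, which means: the system should show standing waves of half the original frequency, since the fundamental frequency of an ideal string is inversely proportional to its length.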
However, there also are other, theoretically conceivable, answers.
For instance, it is also possible that the system gets settled into the first higher-harmonic mode. In the very first higher-harmonic mode, it will maintain the same frequency as earlier (i.e., that of the fundamental mode of the left half), but being an isolated system, it has to conserve its energy, and so, in this higher harmonic mode, it must vibrate with a lower max. amplitude than before. Thermodynamically speaking, since the energy is conserved also in such a mode, it also should certainly be possible.
In fact, you can take the argument further, and say that any one or all of the higher harmonics (potentially an infinity of them) would be possible. After all, the system does not have to maintain a constant frequency or a constant max. amplitude; it only has to maintain the same energy.
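The energy bookkeeping in the last few paragraphs can be made concrete with a minimal sketch. It assumes the standard result for an ideal string that a single standing-wave mode of amplitude A and angular frequency ω on a length ℓ carries the energy (1/4) μ ω² A² ℓ. The function names, the unit values for the mass density `mu` and wave speed `c`, and the unit string length are all illustrative assumptions, not part of the problem statement.

```python
import math

def mode_energy(amplitude, n, length, mu=1.0, c=1.0):
    """Energy of the n-th standing-wave mode on an ideal string.

    Uses E = (1/4) * mu * omega_n**2 * A**2 * length, omega_n = n*pi*c/length.
    mu (lineal mass density) and c (wave speed) are assumed unit values.
    """
    omega = n * math.pi * c / length
    return 0.25 * mu * omega**2 * amplitude**2 * length

def equal_energy_amplitude(n, energy, length, mu=1.0, c=1.0):
    """Amplitude that mode n on a string of this length needs to carry `energy`."""
    omega = n * math.pi * c / length
    return math.sqrt(4.0 * energy / (mu * omega**2 * length))

L = 1.0                              # distance between the end supports (assumed)
E0 = mode_energy(1.0, 1, L / 2)      # fundamental of the left half, amplitude 1.0

# Amplitude each single full-length mode would need to carry the same energy.
amplitudes = {n: equal_energy_amplitude(n, E0, L) for n in range(1, 5)}
for n, a in amplitudes.items():
    print(f"mode {n} on the full length: amplitude {a:.4f}")
```

Under these assumptions, the required amplitude falls off as 1/n with the mode number n, which is the sense in which every harmonic is energetically admissible.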
OK. That was the idealized model and its maths. Now let’s turn to reality.
Relevant empirical observations show that only a certain answer gets selected:
What do you actually observe in reality for systems that come close enough to the above mentioned idealized description? Let’s take a range of examples to get an idea of what kind of a show the real world puts up….
Consider, say, a violinist’s performance. He can continuously alter the length of the vibrating part of the string with his finger, and thereby produce a continuous spectrum of frequencies. However, at any instant, for any given length for the vibrating part, the most dominant of all such frequencies is, actually, only the fundamental mode for that length.
A real violin does not come very close to our idealized example above. A flute is better, because its spectrum happens to be the purest among all musical instruments. What do we mean by a “pure” tone here? It means this: When a flutist plays a certain tone, say the middle “saa” (i.e. the middle “C”), the sound actually produced by the instrument does not significantly carry any higher harmonics. That is to say, when a flutist plays the middle “saa,” unlike the other musical instruments, the flute does not inadvertently go on to produce also the “saa”s from any of the higher octaves. Its energy remains very strongly concentrated in only a single tone, here, the middle “saa”. Thus, it is said to be a “pure” tone; it is not “contaminated” by any of the higher harmonics. (As to the lower harmonics for a given length, well, they are ruled out because of the basic physics and maths.)
Now, if you take a flute of a variable length (something like a trumpet) and try very suddenly doubling the length of the vibrating air column, you will find that instead of producing a fainter sound of the same middle “saa”, the flute instead produces the next lower “saa”. (If you want, you can try it out more systematically in the laboratory by taking a telescopic assembly of cylinders and a tuning fork.)
Of course, really speaking, despite its pure tones, even the flute does not come close enough to our idealized description above. For instance, notice that in our idealized description, energy is put into the system only once, at , and never again. On the other hand, in playing a violin or a flute we are continuously pumping in some energy; the system is also continuously dissipating its energy to its environment via the sound waves produced in the air. A flute, thus, is an open system; it is not an isolated system. Yet, despite the additional complexity introduced because of an open system, and therefore, perhaps, a greater chance of being drawn into higher harmonic(s), in reality, a variable length flute is always observed to “select” only the fundamental harmonic for a given length.
How about an actual guitar? Same thing. In fact, the guitar comes closest to our idealized description. And if you try out plucking the string once and then, after a while, suddenly removing the finger from a fret, you will find that the guitar too “prefers” to immediately settle down rather in the fundamental harmonic for the new length. (Take an electric guitar so that even as the sound turns fainter and still fainter due to damping, you could still easily make out the change in the dominant tone.)
OK. Enough of empirical observations. Back to the connection of these observations with the theory of physics (and maths).
Thermodynamically, an infinity of tones is perfectly possible. Maths tells you that this infinity of tones is nothing but the set of the higher harmonics (and nothing else). Yet, in reality, only one tone gets selected. What gives?
What is the missing physics which makes the system settle into one and only one option (indeed an extreme option) out of an infinity of them, all of which are, energetically speaking, equally possible?
Update on 18 June 2017:
Here is a statement of the problem in certain essential mathematical terms. See the three figures below:
The initial state of the string is what the following figure (Case 1) depicts. The max. amplitude is 1.0. Though the quiescent part looks longer than half the length, it’s just an illusion of perception:
The following figure (Case 2) is the mathematical idealization of the state in which an actual guitar string tends to settle. Note that the max. amplitude is greater (it’s $\sqrt{2}$) so as to have the energy of this state the same as that of Case 1.
The following figure (Case 3) depicts what mathematically is also possible for the final system state. However, it’s not observed with actual guitars. Note, here, the frequency and the wavelength are the same as those in Case 1, but the vibration extends over the entire length. The max. amplitude for this state is less than 1.0 (it’s $1/\sqrt{2}$) so as to have this state too carry exactly the same energy as in Case 1.
Thus, the problem, in short is:
The transition observed in reality is: Case 1 $\rightarrow$ Case 2.
However, the transition Case 1 $\rightarrow$ Case 3 also is possible by the mathematics of standing waves and thermodynamics (or more basically, by that bedrock on which all modern physics rests, viz., the calculus of variations). Yet, it is not observed.
Why does only Case 1 $\rightarrow$ Case 2 occur? Why not Case 1 $\rightarrow$ Case 3? Or even a transition to a linear combination of both? That’s the problem, in essence.
While attempting to answer it, also consider this: Can an isolated system like the one depicted in Case 1 undergo a transition of modes at all?
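One way to probe the “linear combination” possibility numerically is to project the initial shape onto the normal modes of the full-length string. The sketch below is only an illustration under added assumptions: the middle support is taken to be removed at an instant of maximum displacement (string momentarily at rest), the string length is set to 1, and the function names and the midpoint-rule integration are choices made here, not part of the problem statement.

```python
import math

def initial_shape(x, length):
    """Half-sine of unit amplitude on the left half, zero on the right half."""
    if x <= length / 2:
        return math.sin(2.0 * math.pi * x / length)
    return 0.0

def sine_coefficient(n, length=1.0, steps=20000):
    """Midpoint-rule estimate of b_n = (2/L) * integral of y0(x) sin(n pi x / L) dx."""
    h = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += initial_shape(x, length) * math.sin(n * math.pi * x / length)
    return 2.0 * total * h / length

# Weight of each full-length mode in the initial shape.
coeffs = {n: sine_coefficient(n) for n in range(1, 7)}

# Share of the initial energy carried by mode n, assuming the string starts
# from rest: E_n / E_initial = n**2 * b_n**2 / 2 for this initial shape.
shares = {n: n**2 * b**2 / 2.0 for n, b in coeffs.items()}

for n in coeffs:
    print(f"n={n}: b_n={coeffs[n]:+.4f}, energy share={shares[n]:.4f}")
```

The shares computed this way depend on the assumed instant of removal; a removal at a different phase of the oscillation would need the velocity profile to be projected as well.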
Update on 18th June 2017 is over.
That was the classical mechanics problem I said I happened to think of, recently. (And it was the one which took me away from the program of answering the E&R questions.)
Find it interesting? Want to give it a try?
If you do give it a try and if you reach an answer that seems satisfactory to you, then please do drop me a line. We can then cross-check our notes.
And of course, if you find this problem (or something similar) already solved somewhere, then my request to you would be stronger: do let me know about the reference!
In the meanwhile, I will try to go back to (or at least towards) completing the task of answering the E&R questions. [I do, however, also plan to post a slightly edited version of this post at iMechanica.]
07 June 2017: Published on this blog
8 June 2017, 12:25 PM, IST: Added the figure and the section headings.
8 June 2017, 15:30 hrs, IST: Added the link to the brief version posted at iMechanica.
18 June 2017, 12:10 hrs, IST: Added the diagrams depicting the mathematical abstraction of the problem.
A Song I Like:
(Marathi) “olyaa saanj veli…”
Singers: Swapnil Bandodkar, Bela Shende
Lyrics: Ashwini Shende
In this post, I provide my answer to the question which I had raised last time, viz., about the differences between the $\Delta$, the $d$, and the $\delta$ (the first two, of the usual calculus, and the last one, of the calculus of variations).
Some pre-requisite ideas:
A system is some physical object chosen (or isolated) for study. For continua, it is convenient to select a region of space for study, in which case that region of space (holding some physical continuum) may also be regarded as a system. The system boundary is an abstraction.
A state of a system denotes a physically unique and reproducible condition of that system. State properties are the properties or attributes that together uniquely and fully characterize a state of a system, for the chosen purposes. The state is an axiom, and state properties are its corollary.
State properties for continua are typically expressed as functions of space and time. For instance, pressure, temperature, volume, energy, etc. of a fluid are all state properties. Since state properties uniquely define the condition of a system, they represent definite points in an appropriate, abstract, (possibly) higher-dimensional state space. For this reason, state properties are also called point functions.
A process (synonymous to system evolution) is a succession of states. In classical physics, the succession (or progression) is taken to be continuous. In quantum mechanics, there is no notion of a process; see later in this post.
A process is often represented as a path in a state space that connects the two end-points of the starting and ending states. A parametric function defined over the length of a path is called a path function.
A cyclic process is one that has the same start and end points.
During a cyclic process, a state function returns to its initial value. However, a path function does not necessarily return to the same value over every cyclic change; it depends on which particular path is chosen. For instance, if you take a round trip from point $A$ to point $B$ and back, you may spend some amount of money if you take one route but another amount if you take another route. In both cases you do return to the same point, viz. $A$, but the amount you spend is different for each route. Your position is a state function, and the amount you spend is a path function.
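The round-trip illustration can be written out as a small sketch. The place labels, the route costs, and the `Leg` / `round_trip` names are all hypothetical; the only point is that the final position is the same for both routes while the accumulated cost is not.

```python
from dataclasses import dataclass

@dataclass
class Leg:
    start: str
    end: str
    cost: float   # money spent on this leg (made-up figures)

def round_trip(legs, origin):
    """Follow the legs in order; return (final position, total money spent)."""
    position, spent = origin, 0.0
    for leg in legs:
        if leg.start != position:
            raise ValueError("legs must connect end to end")
        position = leg.end          # state: depends only on where we are now
        spent += leg.cost           # path quantity: accumulates along the route
    return position, spent

route_1 = [Leg("A", "B", 120.0), Leg("B", "A", 120.0)]
route_2 = [Leg("A", "B", 300.0), Leg("B", "A", 80.0)]

end_1, spent_1 = round_trip(route_1, "A")
end_2, spent_2 = round_trip(route_2, "A")
print(end_1, spent_1)
print(end_2, spent_2)
```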
[I may make the above description a bit more rigorous later on (by consulting a certain book which I don’t have handy right away (and my notes of last year are gone in the HDD crash)).]
The $\Delta$, the $d$, and the $\delta$:
The $\Delta$ denotes a sufficiently small but finite, and locally existing difference in different parts of a system. Typically, since state properties are defined as (continuous) functions of space and time, what the $\Delta$ represents is a finite change in some state property function that exists across two different but adjacent points in space (or two nearby instants in time), for a given system.
The $\Delta$ is a local quantity, because it is defined and evaluated around a specific point of space and/or time. In other words, an instance of $\Delta$ is evaluated at a fixed $x$ or $t$. The $\Delta x$ simply denotes a change of position; it may or may not mean a displacement.
The $d$ (i.e. the infinitesimal) is nothing but the $\Delta$ taken in some appropriate limiting process to the vanishingly small limit.
Since $\Delta$ is locally defined, so is the infinitesimal (i.e. $d$).
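As a minimal sketch of the finite, local difference and its limiting value, the following evaluates the difference of a sample function around one fixed point for ever smaller steps. The choice of the sine function, the point `x0 = 1.0`, and the step sizes are illustrative assumptions only.

```python
import math

def finite_difference(f, x, dx):
    """Finite, local change of f between the adjacent points x and x + dx."""
    return f(x + dx) - f(x)

x0 = 1.0                      # the fixed point around which the change is evaluated
exact_slope = math.cos(x0)    # known derivative of sin at x0, for comparison

ratios = []
for k in range(1, 7):
    dx = 10.0 ** (-k)
    ratios.append((dx, finite_difference(math.sin, x0, dx) / dx))

# How far each finite ratio is from the limiting (infinitesimal) value.
errors = [abs(r - exact_slope) for _, r in ratios]
for (dx, r), e in zip(ratios, errors):
    print(f"dx={dx:.0e}  ratio={r:.8f}  error={e:.2e}")
```

Every quantity here is evaluated around the single point `x0`; nothing about the function far away from that point enters the calculation.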
The $\delta$ of CoV is completely different from the above two concepts.
The $\delta$ is a sufficiently small but global difference between the states (or paths) of two different, abstract, but otherwise identical views of the same physically existing system.
Considering the fact that an abstract view of a system is itself a system, $\delta$ also may be regarded as a difference between two systems.
Though differences in paths are not only possible but also routinely used in CoV, in this post, to keep matters simple, we will mostly consider differences in the states of the two systems.
In CoV, the two states (of the two systems) are so chosen as to satisfy the same Dirichlet (i.e. field) boundary conditions separately in each system.
The state function may be defined over an abstract space. In this post, we shall not pursue this line of thought. Thus, the state function will always be a function of the physical, ambient space (defined in reference to the extensions and locations of concretely existing physical objects).
Since a state of a system of nonzero size can only be defined by specifying its values for all parts of a system (of which it is a state), a difference between states (of the two systems involved in the variation $\delta$) is necessarily global.
In defining $\delta$, both the systems are considered only abstractly; it is presumed that at most one of them may correspond to an actual state of a physical system (i.e. a system existing in the physical reality).
The idea of a process, i.e. the very idea of a system evolution, necessarily applies only to a single system.
What the $\delta$ represents is not an evolution because it does not represent a change in a system, in the first place. The variation, to repeat, represents a difference between two systems satisfying the same field boundary conditions. Hence, there is no evolution to speak of. When compressed air is passed into a rubber balloon, its size increases. This change occurs over certain time, and is an instance of an evolution. However, two rubber balloons already inflated to different sizes share no evolutionary relation with each other; there is no common physical process connecting the two; hence no change occurring over time can possibly enter their comparative description.
Thus, the “change” denoted by $\delta$ is incapable of representing a process or a system evolution. In fact, the word “change” itself is something of a misnomer here.
Text-books often stupidly try to capture the aforementioned idea by saying that $\delta$ represents a small and possibly finite change that occurs without any elapse of time. Apart from the mind-numbing idea of a finite change occurring over no time (or equally stupefying ideas which it suggests, viz., a change existing at literally the same instant of time, or, alternatively, a process of change that somehow occurs to a given system but “outside” of any time), what they, in a way, continue to suggest also is the erroneous idea that we are working with only a single, concretely physical system, here.
But that is not the idea behind $\delta$ at all.
To complicate the matters further, no separate symbol is used when the variation is made vanishingly small.
In the primary sense of the term variation (or $\delta$), the difference it represents is finite in nature. The variation is basically a function of space (and time), and at every value of $x$ (and $t$), the value of $\delta$ is finite, in the primary sense of the word. Yes, these values can be made vanishingly small, though the idea of the limits applied in this context is different. (Hint: Expand each of the two state functions in a power series and relate each of the corresponding power terms via a separate parameter. Then, put the difference in each parameter through a limiting process to vanish. You may also use the Fourier expansion.)
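The hint about expansions can be sketched as follows, using a Fourier sine expansion on the unit interval. This is a simplification made here for illustration: a single parameter `eps` scales all the sine terms (the hint suggests a separate parameter per term), and the reference state `x * (1 - x)` and the three modes are arbitrary choices.

```python
import math

def base_state(x):
    """Reference state on 0 <= x <= 1 (an arbitrary, hypothetical choice)."""
    return x * (1.0 - x)

def varied_state(x, eps, modes=(1, 2, 3)):
    """Comparison state: the base plus eps-scaled sine terms.

    Every sine term vanishes at x = 0 and x = 1, so both states satisfy
    the same boundary values.
    """
    return base_state(x) + sum(eps / k * math.sin(k * math.pi * x) for k in modes)

xs = [i / 100 for i in range(101)]

def max_variation(eps):
    """Largest pointwise difference between the two states over the whole domain."""
    return max(abs(varied_state(x, eps) - base_state(x)) for x in xs)

sizes = {eps: max_variation(eps) for eps in (1e-1, 1e-2, 1e-3)}
for eps, size in sizes.items():
    print(f"eps={eps:.0e}  max |variation| = {size:.6f}")
```

The difference is a whole function over the domain, finite for any nonzero `eps`, and it shrinks everywhere at once as the parameter is taken to zero.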
The difference represented by $\delta$ is between two abstract views of a system. The two systems are related only in an abstract view, i.e., only in (the mathematical) thought. In the CoV, they are supposed as connected, but the connection between them is not concretely physical because there are no two separate physical systems concretely existing, in the first place. Both the systems here are mathematical abstractions; they first have been abstracted away from the real, physical system actually existing out there (of which there is only a single instance).
But, yes, there is a sense in which we can say that $\delta$ does have a physical meaning: it carries the same physical units as the state functions of the two abstract systems.
An example from biology:
Here is an example of the differences between two different paths (rather than two different states).
Plot the height of a growing sapling at different times, and connect the dots to yield a continuous graph of the height as a function of time. The difference in the heights of the sapling at two different instants is $\Delta h$ (with $h$ denoting the height). But if you consider two different saplings planted at the same time, and assuming that they grow to the same final height at the end of some definite time period (just pick some moment where their graphs cross each other), and then, abstractly regarding them as some sort of imaginary plants, if you plot the difference between the two graphs, that is the variation or $\delta h$ in the height-function of either. The variation itself is a function (here of time); it has the units, of course, of m.
The $\Delta$ is a local change inside a single system, and $d$ is its limiting value, whereas the $\delta$ is a difference across two abstract systems differing in their global states (or global paths), and there is no separate symbol to capture this object in the vanishingly small limit.
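The sapling example can be put into numbers with a short sketch. Both growth curves below are invented for illustration (one linear, one quadratic), chosen only so that they start together and meet again at the end of the period; the period `T` and final height `H` are likewise assumed values.

```python
def height_a(t, T=10.0, H=2.0):
    """Sapling A: steady growth to height H (in m) at time T (hypothetical curve)."""
    return H * t / T

def height_b(t, T=10.0, H=2.0):
    """Sapling B: slow start, faster later, same end points (hypothetical curve)."""
    return H * (t / T) ** 2

times = [i * 0.5 for i in range(21)]          # 0, 0.5, ..., 10

# A difference within ONE sapling, between two instants: a local, finite change.
growth_a = height_a(times[-1]) - height_a(times[0])

# A difference ACROSS the two saplings, taken over the whole period: a function of t.
variation = [height_b(t) - height_a(t) for t in times]

print("growth of sapling A over the period:", growth_a, "m")
print("variation at start, middle, end:", variation[0], variation[10], variation[-1])
```

The variation vanishes at both ends of the period, where the two graphs coincide, and is nonzero in between.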
Consider one period of the function , say over the interval ; is a small, real-valued, constant. Now, set . Is the change/difference here a or a ? Why or why not?
Now, take the derivative, i.e., , with once again. Is the change/difference here a or a ? Why or why not?
Which one of the above two is a bigger change/difference?
Also consider this angle: Taking the derivative did affect the whole function. If so, why is it that we said that was necessarily a local change?
An important and special note:
The above exercises, I am sure, many (though not all) of the Officially Approved Full Professors of Mechanical Engineering at the Savitribai Phule Pune University and COEP would be able to do correctly. But the question I posed last time was: Would it be therefore possible for them to spell out the physical meaning of the variation, i.e. $\delta$? I continue to think not. And, importantly, even those who do solve the above exercises successfully wouldn’t be too sure about their own answers. Upon just a little deeper probing, they would just throw up their hands, even though conceptual clarity is required in applications. [Ditto, for many American physicists.]
(I am ever willing and ready to change my mind about it, but doing so would need some actual evidence—just the way my (continuing) position had been derived, in the first place, from actual observations of them.)
The reason I made this special note was because I continue to go jobless, and nearly bank balance-less (and also, nearly cashless). And it all is basically because of folks like these (and the Indians like the SPPU authorities). It is their fault. (And, no, you can’t try to lift what is properly their moral responsibility off their shoulders and then, in fact, go even further, and attempt to place it on mine. Don’t attempt doing that.)
A Song I Like:
[Maybe I have run this song before. If yes, I will replace it with some other song tomorrow or so. No, I had not.]
Hindi: “Thandi hawaa, yeh chaandani suhaani…”
Music and Singer: Kishore Kumar
Lyrics: Majrooh Sultanpuri
[A quick ‘net search on plagiarism tells me that the tune of this song was lifted from Julius La Rosa’s 1955 song “Domani.” I heard that song for the first time only today. I think that the lyrics of the Hindi song are better. As to renditions, I like Kishore Kumar’s version better.]
[Minor editing may be done later on and the typos may be corrected, but the essentials of my positions won’t be. Mostly done right today, i.e., on 06th January, 2017.]
I was looking for a certain book on heat transfer which I had (as usual) misplaced somewhere, and while searching for that book at home, I accidentally ran into another book I had—the one on Classical Mechanics by Rana and Joag [^].
After dusting this book a bit, I spent some time in one typical way, viz. by going over some fond memories associated with a suddenly re-found book…. The memories of how enthusiastic I once was when I had bought that book; how I had decided to finish that book right within weeks of buying it several years ago; the number of times I might have picked it up, and soon later on, kept it back aside somewhere, etc. …
Yes, that’s right. I have not yet managed to finish this book. Why, I have not even managed to begin reading this book the way it should be read—with a paper and pencil at hand to work through the equations and the problems. That was the reason why, I now felt a bit guilty. … It just so happened that it was just the other day (or so) when I was happily mentioning the Poisson brackets on Prof. Scott Aaronson’s blog, at this thread [^]. … To remove (at least some part of) my sense of guilt, I then decided to browse at least through this part (viz., Poisson’s brackets) in this book. … Then, reading a little through this chapter, I decided to browse through the preceding chapters from the Lagrangian mechanics on which it depends, and then, in general, also on the calculus of variations.
It was at this point that I suddenly happened to remember the reason why I had never been able to finish (even the portions relevant to engineering from) this book.
The thing was, the explanation of the $\delta$, i.e., the delta of the variational calculus.
The explanation of what the $\delta$ basically means, I had found right back then (many, many years ago), was not satisfactorily given in this book. The book did talk of all those things like the holonomic constraints vs. the nonholonomic constraints, the functionals, integration by parts, etc. etc. etc. But without ever really telling me, in a forth-right and explicit manner, what the hell this $\delta$ was basically supposed to mean! How this $\delta$ was different from the finite changes ($\Delta$) and the infinitesimal changes ($d$) of the usual calculus, for instance. In terms of its physical meaning, that is. (Hell, this book was supposed to be on physics, wasn’t it?)
Here, I of course fully realize that describing Rana and Joag’s book as “unsatisfactory” is making a rather bold statement, a very courageous one, in fact. This book is extraordinarily well-written. And yet, there I was, many, many years ago, trying to understand the delta, and not getting anywhere, not even with this book in my hand. (OK, a confession. The current copy which I have is not all that old. My old copy is gone by now (i.e., permanently misplaced or so), and so, the current copy is the one which I had bought once again, in 2009. As to my old copy, I think, I had bought it sometime in the mid-1990s.)
It was many years later, I guess some time while teaching FEM to the undergraduates in Mumbai, that the concept had finally become clear enough to me. Most especially, while I was going through P. Seshu’s and J. N. Reddy’s books. [Reflected Glory Alert! Professor P. Seshu was my class-mate for a few courses at IIT Madras!] However, even then, even at that time, I remember, I still had this odd feeling that the physical meaning was still not clear to me, not as clear as it should be. The matter eventually became “fully” clear to me only later on, while musing about the differences between the perspective of Thermodynamics on the one hand and that of Heat Transfer on the other. That was some time last year, while teaching Thermodynamics to the PG students here in Pune.
Thermodynamics deals with systems at equilibria, primarily. Yes, its methods can be extended to handle also the non-equilibrium situations. However, even then, the basis of the approach summarily lies only in the equilibrium states. Heat Transfer, on the other hand, necessarily deals with the non-equilibrium situations. Remove the temperature gradient, and there is no more heat left to speak of. There does remain the thermal energy (as a form of the internal energy), but not heat. (Remember, heat is the thermal energy in transit that appears on a system boundary.) Heat transfer necessarily requires an absence of thermal equilibrium. … Anyway, it was while teaching thermodynamics last year, and only incidentally pondering about its differences from heat transfer, that the idea of the variations (of CoV) had finally become (conceptually) clear to me. (No, CoV does not necessarily deal only with the equilibrium states; it’s just that it was while thinking about the equilibrium vs. the transient that the matter about CoV had suddenly “clicked” for me.)
In this post, let me now note down something on the concept of the variation, i.e., towards understanding the physical meaning of the symbol $\delta$.
Please note, I have made an inline update on 26th December 2016. It makes the presentation of the calculus of variations a bit less dumbed down. The updated portion is clearly marked as such, in the text.
The Problem Description:
The concept of variations is abstract. We would be better off considering a simple, concrete, physical situation first, and only then try to understand the meaning of this abstract concept.
Accordingly, consider a certain idealized system. See its schematic diagram below:
There is a long, rigid cylinder made from some transparent material like glass. The left hand-side end of the cylinder is hermetically sealed with a rigid seal. At the other end of the cylinder, there is a friction-less piston which can be driven by some external means.
Further, there also are a couple of thin, circular, piston-like disks ($D_1$ and $D_2$) placed inside the cylinder, at some positions $x_1$ and $x_2$ along its length. These disks thus divide the cylindrical cavity into three distinct compartments. The disks are assumed to be impermeable, and fitting snugly, they in general permit no movement of gas across their plane. However, they also are assumed to be able to move without any friction.
Initially, all the three compartments are filled with a compressible fluid to the same pressure in each compartment, say 1 atm. Since all the three compartments are at the same pressure, the disks stay stationary.
Then, suppose that the piston on the extreme right end is moved from its initial position to some final position. The final position may be to the left or to the right of the initial position; it doesn’t matter. For the current description, however, let’s suppose that the final position is to the left of the initial one. The effect of the piston movement thus is to increase the pressure inside the system.
The problem is to determine the nature of the resulting displacements that the two disks undergo as measured from their respective initial positions.
There are essentially two entirely different paradigms for conducting an analysis of this problem.
The “Vector Mechanics” Paradigm:
The first paradigm is based on an approach that was put to use so successfully by Newton. Usually, it is called the paradigm of vector analysis.
In this paradigm, we focus on the fact that the forced displacement of the piston with time, $x(t)$, may be described using some function of time that is defined over the interval lying between two instants $t_i$ and $t_f$.
For example, suppose the function is:
$x(t) = x_0 + v t$
where $v$ is a constant (and $x_0$ is set by the initial position of the piston). In other words, the motion of the piston is steady, with a constant velocity, between the initial and final instants. Since the velocity is constant, there is no acceleration over the open interval $(t_i, t_f)$.
However, notice that before the instant $t_i$, the piston velocity was zero. Then, the velocity suddenly became a finite (constant) value. Therefore, if you extend the interval to include the end-instants as well, i.e., if you consider the semi-closed interval $[t_i, t_f)$, then there is an acceleration at the instant $t_i$. Similarly, since the piston comes to a position of rest at $t_f$, there also is another acceleration, equal in magnitude and opposite in direction, which appears at the instant $t_f$.
The existence of these two instantaneous accelerations implies that jerks or pressure waves are sent through the system. We may model them as vector quantities, as impulses. [Side Exercise: Work out what happens if we consider only the open interval $(t_i, t_f)$.]
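The two end-instant accelerations can be seen in a small numerical sketch of the piston motion. The start and end instants, the initial position, and the velocity below are made-up values, and the function names are hypothetical; the second difference is just one simple way to estimate an acceleration from a position function.

```python
def piston_position(t, t_start=1.0, t_end=3.0, x0=0.0, v=-0.5):
    """At rest before t_start, constant velocity v in between, at rest after t_end."""
    if t <= t_start:
        return x0
    if t >= t_end:
        return x0 + v * (t_end - t_start)
    return x0 + v * (t - t_start)

def acceleration(t, dt):
    """Central second-difference estimate of the acceleration at time t."""
    return (piston_position(t + dt) - 2.0 * piston_position(t)
            + piston_position(t - dt)) / dt**2

for dt in (1e-1, 1e-2, 1e-3):
    a_start = acceleration(1.0, dt)   # at the instant the motion begins
    a_mid = acceleration(2.0, dt)     # well inside the open interval
    a_end = acceleration(3.0, dt)     # at the instant the motion stops
    print(f"dt={dt:.0e}  start={a_start:+.1f}  mid={a_mid:+.1e}  end={a_end:+.1f}")
```

Inside the interval the estimate stays at zero, while at the two end instants it grows as `v/dt` with opposite signs, so that acceleration times `dt` stays fixed at the velocity jump. That is the signature of an impulse.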
We can now apply Newton’s 3 laws, based on the idea that shock-waves must have begun at the piston at the instant it started moving. They must have got transmitted through the gas kept under pressure, and they must have affected the disk lying closest to the piston (call it $D_2$), thereby setting this disk into motion. This motion must have passed through the gas in the middle compartment of the system as another pulse in the pressure (generated at the disk $D_2$), thereby setting also the other disk, $D_1$, in a state of motion a little while later. Finally, the pulse must have got bounced off the seal on the left hand side, and in turn, come back to affect the motion of the disk $D_1$, and then of the disk $D_2$. Continuing their travels to and fro, the pulses, and hence the disks, would thus be put in a back and forth motion.
After a while, these transients would move forth and back, superpose, and some of their constituent frequencies would get cancelled out, leaving only those frequencies operative such that the three compartments are put under some kind of stationary states.
In case the gas is not ideal, there would be damping anyway, and after a sufficiently long while, the disks would move through such small displacements that we could easily ignore the ever-decreasing displacements in a limiting argument.
Thus, assume that, after an elapse of a sufficiently long time, the disks become stationary. Of course, their new positions are not the same as their original positions.
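For a feel of where the disks end up, here is a minimal sketch under assumptions that go beyond the description above: the gas in every compartment is taken to be ideal and at the same constant temperature, all positions are measured from the sealed end, and the pressures are equal across the disks both before and after. With that, every compartment shrinks by the same factor. The function names and the sample numbers are hypothetical.

```python
def equilibrium_positions(disk_positions, piston_initial, piston_final):
    """New rest positions of the disks, measured from the sealed end.

    Assumes an isothermal ideal gas and equal pressures in all compartments
    before and after, so every compartment length scales by the same
    factor piston_final / piston_initial.
    """
    if piston_initial <= 0 or piston_final <= 0:
        raise ValueError("piston positions must be positive distances from the seal")
    scale = piston_final / piston_initial
    return [x * scale for x in disk_positions]

def pressure_ratio(piston_initial, piston_final):
    """Final pressure over initial pressure, under the same assumptions."""
    return piston_initial / piston_final

old = [0.3, 0.6]                       # initial disk positions (made-up values)
new = equilibrium_positions(old, piston_initial=1.0, piston_final=0.8)
displacements = [b - a for a, b in zip(old, new)]

print("new positions:", new)
print("displacements:", displacements)
print("pressure ratio:", pressure_ratio(1.0, 0.8))
```

Note that this shortcut looks only at the two equilibrium states; it says nothing about the pulses travelling back and forth in between.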
The problem thus can be modeled as basically a transient one. The new equilibrium state is thus primarily seen as an effect or an end-result of a couple of transient processes which occur in the forward and backward directions. The equilibrium is seen as not a primarily existing state, but as a result of two equal and opposite transient causes.
Notice that throughout this process, Newton’s laws can be applied directly. The nature of the analysis is such that the quantities in question—viz. the displacements of the disks—always are real, i.e., they correspond to what actually is supposed to exist in the reality out there.
The (values of) displacements are real in the sense that the mathematical analysis procedure itself involves only those (values of) displacements which can actually occur in reality. The analysis does not concern itself with some other displacements that might have been possible but don’t actually occur. The analysis begins with the forced displacement condition, translates it into pressure waves, which in turn are used in order to derive the predicted displacements in the gas in the system, at each instant. Thus, at any arbitrary instant of time (in fact, the analysis here runs for times ), the analysis remains concerned only with those displacements that are actually taking place at that instant.
The Method of Calculus of Variations:
The second paradigm follows the energetics program. This program was initiated by Newton himself as well as by Leibnitz. However, it was pursued vigorously not by Newton but rather by Leibnitz, and then by a series of gifted mathematicians-physicists: the Bernoulli brothers, Euler, Lagrange, Hamilton, and others. This paradigm is essentially based on the calculus of variations. The idea here is something like the following.
We do not care for a local description at all. Thus, we do not analyze the situation in terms of the local pressure pulses, their momenta/forces, etc. All that we focus on are just two sets of quantities: the initial positions of the disks, and their final positions.
For instance, focus on the disk . It initially is at the position . It is found, after a long elapse of time (i.e., at the next equilibrium state), to have moved to . The question is: how to relate this change in on the one hand, to the displacement that the piston itself undergoes from to .
To analyze this question, the energetics program (i.e., the calculus of variations) adopts a seemingly strange methodology.
It begins by saying that there is nothing unique to the specific value of the position as assumed by the disk . The disk could have come to a halt at any other (nearby) position, e.g., at some other point , or , or , … etc. In fact, since there are an infinity of points lying in a finite segment of line, there could have been an infinity of positions where the disk could have come to a rest, when the new equilibrium was reached.
Of course, in reality, the disk comes to a halt at none of these other positions; it comes to a halt only at .
Yet, the theory says, we need to be “all-inclusive,” in a way. We need not, just for the aforementioned reason, deny a place in our analysis to these other positions. The analysis must include all such possible positions—even if they be purely hypothetical, imaginary, or unreal. What we do in the analysis, this paradigm says, is to initially include these merely hypothetical, unrealistic positions too on exactly the same footing as that enjoyed by that one position which is realistic, which is given by .
Thus, we take a set of all possible positions for each disk. Then, for each such a position, we calculate the “impact” it would make on the energy of the system taken as a whole.
The energy of the system can be additively decomposed into the energies carried by each of its sub-parts. Thus, focusing on disk , for each one of its possible (hypothetical) final positions, we should calculate the energies carried by both its adjacent compartments. Since a change in ‘s position does not affect the compartment 3, we need not include it. However, for the disk , we do need to include the energies carried by both the compartments 1 and 2. Similarly, for each of the possible positions occupied by the disk , we should include the energies of the compartments 2 and 3, but not of 1.
At this point, to bring simplicity (and thereby better clarity) to this entire procedure, let us further assume that the possible positions of each disk form a finite set. For instance, each disk can occupy only one of the positions that is some or distance-units away from its initial position. Thus, a disk is not allowed to come to a rest at, say, units; it must do so either at or at units. (We will thus perform the initial analysis in terms of only the integer positions, and only later on extend it to any real-valued positions.) (If you are a mechanical engineering student, suggest a suitable mechanism that can ensure only integer relative displacements.)
The change in energy of a compartment is given by
$\Delta E = P \, A \, \Delta x$
where $P$ is the pressure, $A$ is the cross-sectional area of the cylinder, and $\Delta x$ is the change in the length of the compartment.
Now, observe that the energy of the middle compartment depends on the relative distance between the two disks lying on its sides. Yet, for the same reason, the energy of the middle compartment does depend on both these positions. Hence, we must take a Cartesian product of the relative displacements undergone by both the disks, and only then calculate the system energy for each such a permutation (i.e. the ordered pair) of their positions. Let us go over the details of the Cartesian product.
The Cartesian product of the two positions may be stated as a row-by-row listing of ordered pairs of the relative positions of and , e.g., as follows: the ordered pair means that the disk is units to the left of its initial position, and the disk is units to the right of its initial position. Since each of the two positions forming an ordered pair can range over any of the above-mentioned number of different values, there are, in all, number of such possible ordered pairs in the Cartesian product.
For each one of these different pairs, we use the above-given formula to determine what the energy of each compartment is like. Then, we add the three energies (of the three compartments) together to get the value of the energy of the system as a whole.
In short, we get a set of possible values for the energy of the system.
You must have noticed that we have admitted every possible permutation into analysis—all the number of them.
Of course, out of all these number of permutations of positions, it should turn out that number of them have to be discarded because they would be merely hypothetical, i.e. unreal. That, in turn, is because the relative positions of the disks contained in one and only one ordered pair would actually correspond to the final, equilibrium position. After all, if you conduct this experiment in reality, you would always get a very definite pair of the disk-positions, and it is this same pair of relative positions that would be observed every time you conducted the experiment (for the same piston displacement). Real experiments are reproducible, and give rise to the same, unique result. (Even if the system were to be probabilistic, it would have to give rise to an exactly identical probability distribution function.) It can’t be this result today and that result tomorrow, or this result in this lab and that result in some other lab. That simply isn’t science.
Thus, out of all those different ordered-pairs, one and only one ordered-pair would actually correspond to reality; the rest all would be merely hypothetical.
The question now is, which particular pair corresponds to reality, and which ones are unreal. How to tell the real from the unreal. That is the question.
Here, the variational principle says that the pair of relative positions that actually occurs in reality carries a certain definite, distinguishing attribute.
The system-energy calculated for this pair (of relative displacements) happens to carry the lowest magnitude from among all possible number of pairs. In other words, any hypothetical or unreal pair has a higher amount of system energy associated with it. (If two pairs give rise to the same lowest value, both would be equally likely to occur. However, that is not what provably happens in the current example, so let us leave this kind of a “degeneracy” aside for the purposes of this post.)
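The steps described so far (form the Cartesian product, compute one system energy per ordered pair, pick the lowest) can be sketched in a few lines of Python. This is only a toy under loudly stated assumptions: the range of integer displacements, the piston shift, and the spring-like energy of each compartment are all made up for illustration and are not the gas model or the numbers of this post; the names `system_energy`, `d_near` and `d_far` are likewise hypothetical.

```python
from itertools import product

# Hypothetical set-up (none of these numbers come from the post):
# integer displacements from -5 to +5 units for each disk, and a piston
# pushed 3 units to the left (negative = leftwards).
DISPLACEMENTS = range(-5, 6)
PISTON_SHIFT = -3
STIFFNESS = 1.0  # toy constant standing in for the response of the gas


def system_energy(d_near, d_far):
    """Toy system energy for one ordered pair of disk displacements.

    d_near: displacement of the disk next to the piston
    d_far:  displacement of the disk next to the fixed seal
    Each compartment is treated like a linear spring (an assumption).
    """
    length_changes = (
        PISTON_SHIFT - d_near,  # compartment between piston and near disk
        d_near - d_far,         # middle compartment: depends on both disks
        d_far,                  # compartment between far disk and the seal
    )
    return sum(0.5 * STIFFNESS * dl ** 2 for dl in length_changes)


# Cartesian product: every ordered pair gets exactly one energy value.
energy_table = {
    pair: system_energy(*pair) for pair in product(DISPLACEMENTS, repeat=2)
}

# The simple selection rule: the pair with the lowest system energy.
lowest_pair = min(energy_table, key=energy_table.get)
print(len(energy_table), lowest_pair, energy_table[lowest_pair])
```

With these made-up numbers the table has 11 x 11 = 121 rows, and the lowest-energy pair is (-2, -1): both disks shift towards the seal, the one nearer the piston by more.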
(The update on 26 December 2016 begins here:)
Actually, the description given in the immediately preceding paragraph was a bit too dumbed down. The variational principle is more subtle than that. Explaining it makes this post even longer, but let me give it a shot anyway, at least today.
To follow the actual idea of the variational principle (in a not dumbed-down manner), the procedure you have to follow is this.
First, make a table of all possible relative-position pairs, and their associated energies. The table has the following columns: a relative-position pair, the associated energy as calculated above, and one more column which for the time being would be empty. The table may look something like what the following (partial) listing shows:
(0,0) -> say, 115 Joules
(-1,0) -> say, 101 Joules
(-2,0) -> say, 110 Joules
(2,2) -> say, 102 Joules
(2,3) -> say, 100 Joules
(2,4) -> say, 101 Joules
(2,5) -> say, 120 Joules
(5,0) -> say, 135 Joules
(5,5) -> say, 117 Joules
Having created this table (of rows), you then pick each row one by one, and for the picked up -th row, you ask a question: Which other row(s) from this table have relative distance pairs that lie closest to the relative distance pair of this given row? Let me illustrate this question with a concrete example. Consider the row which has the relative-distance pair given as (2,3). Then, the relative distance pairs closest to this one would be obtained by adding or subtracting a distance of 1 to one number in the pair at a time. Thus, the relative distance pairs closest to this one would be: (3,3), (1,3), (2,4), and (2,2). So, you have to pick up those rows which have these four entries in the relative-distance pairs column. Each of these four pairs represents a variation on the chosen state, viz. the state (2,3).
In symbolic terms, suppose for the -th row being considered, the rows closest to it in terms of the differences in their relative distance pairs, are the -th, -th, -th and -th rows. (Notice that the rows which are closest to a given row in this sense would not necessarily be found listed just above or below that given row, because the scheme followed while creating the list or the vector that is the table would not necessarily honor the closest-lying criterion (which necessarily involves two numbers), at least not for all rows in the table.)
OK. Then, in the next step, you find the differences in the energies of the -th row from each of these closest rows, viz., the -th, -th, -th and -th rows. That is to say, you find the absolute magnitudes of the energy differences. Let us denote these magnitudes as: , , and . Suppose the minimum among these values is . So, against the -th row, in the last column of the table, you write the value .
Having done this exercise separately for each row in the table, you then ask: Which row has the smallest entry in the last column (the one for ), and you pick that up. That is the distinguished (or the physically occurring) state.
In other words, the variational principle asks you to select not the row with the lowest absolute value of energy, but that row which shows the smallest difference of energy from one of its closest neighbours—and these closest neighbours are to be selected according to the differences in each number appearing in the relative-distance pair, and not according to the vertical place of rows in the tabular listing. (It so turns out that in this example, the row thus selected following both criteria—lowest energy as well as lowest variation in energy—are identical, though it would not necessarily always be the case. In short, we can’t always get away with the first, too dumbed down, version.)
Thus, the variational principle is about that change in the relative positions for which the corresponding change in the energy vanishes (or has the minimum possible absolute magnitude, in case the positions form a discretely varying, finite set).
(The update on 26th December 2016 gets over here.)
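The row-by-row procedure of the update can also be sketched in code. The same caveats apply: the displacement range, the piston shift and the spring-like energy are hypothetical stand-ins, not this post's gas model. One deliberate swap should be named plainly: the sketch records, for each row, the largest energy difference from its closest rows rather than the smallest. That choice is an assumption of this sketch; on a toy grid like this one, two neighbouring rows that merely happen to have equal energies would otherwise tie at zero even far from equilibrium.

```python
from itertools import product

DISPLACEMENTS = range(-5, 6)   # hypothetical integer positions
PISTON_SHIFT = -3              # hypothetical piston displacement


def system_energy(d_near, d_far):
    # Toy spring-like energy of the three compartments (an assumption).
    changes = (PISTON_SHIFT - d_near, d_near - d_far, d_far)
    return sum(0.5 * c ** 2 for c in changes)


table = {p: system_energy(*p) for p in product(DISPLACEMENTS, repeat=2)}


def closest_rows(pair):
    """Rows whose pair differs by one unit in exactly one of its two numbers."""
    a, b = pair
    candidates = [(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)]
    return [c for c in candidates if c in table]  # drop pairs off the grid


def energy_variation(pair):
    """Largest |energy difference| between a row and its closest rows."""
    return max(abs(table[n] - table[pair]) for n in closest_rows(pair))


# The extra column of the table: one variation figure per row.
variation_column = {pair: energy_variation(pair) for pair in table}

# The distinguished row is the one whose energy changes least under variation.
selected = min(variation_column, key=variation_column.get)
print(selected, variation_column[selected])
```

Note that `closest_rows((2, 3))` returns exactly the four pairs (3,3), (1,3), (2,4) and (2,2) used in the illustration above, wherever those rows happen to sit in the table. For this toy energy, the row selected by the variation column coincides with the lowest-energy row, (-2, -1).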
And, it turns out that this approach, too, is indeed able to perfectly predict the final disk-positions—precisely as they actually are observed in reality.
If you allow a continuum of positions (instead of the discrete set of only the number of different final positions for one disk, or number of ordered pairs), then instead of taking a Cartesian product of positions, what you have to do is take into account a tensor product of the position functions. The maths involved is a little more advanced, but the underlying algebraic structure—and the predictive principle which is fundamentally involved in the procedure—remains essentially the same. This principle—the variational principle—says:
Among all possible variations in the system configurations, that system configuration corresponds to reality which has the least variation in energy associated with it.
(This is a very rough statement, but it will do for this post and for a general audience. In particular, we don’t look into the issues of what constitute the kinematically admissible constraints, why the configurations must satisfy the field boundary conditions, the idea of the stationarity vs. of a minimum or a maximum, i.e., the issue of convexity-vs.-concavity, etc. The purpose of this post—and our example here—are both simple enough that we need not get into the whole she-bang of the variational theory as such.)
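For the continuum case, the rough statement above can be put in symbols. The notation here is assumed for illustration and is not taken from the post: let $E(x_1, x_2)$ be the system energy as a function of the final positions $x_1$ and $x_2$ of the two disks, and let $\delta x_1$, $\delta x_2$ be small hypothetical shifts of those positions. A sketch of the condition, leaving aside the caveats just listed:

```latex
\delta E
  = \frac{\partial E}{\partial x_1}\,\delta x_1
  + \frac{\partial E}{\partial x_2}\,\delta x_2
  = 0
\quad \text{for all admissible } \delta x_1,\ \delta x_2
\;\;\Longrightarrow\;\;
\frac{\partial E}{\partial x_1} = 0, \qquad
\frac{\partial E}{\partial x_2} = 0 .
```

Since the two shifts can be chosen independently of each other, the first-order change in energy can vanish for every choice only if each partial derivative vanishes separately. That gives two equations for the two unknown disk positions.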
Notice that in this second paradigm, (i) we did not restrict the analysis to only those quantities that are actually taking place in reality; we also included a host (possibly an infinity) of purely hypothetical combinations of quantities too; (ii) we worked with energy, a scalar quantity, rather than with momentum, a vector quantity; and finally, (iii) in the variational method, we didn’t bother about the local details. We took into account the displacements of the disks, but not any displacement at any other point, say in the gas. We did not look into presence or absence of a pulse at one point in the gas as contrasted from any other point in it. In short, we did not discuss the details local to the system either in space or in time. We did not follow the system evolution, at all—not at least in a detailed, local way. If we were to do that, we would be concerned about what happens in the system at the instants and at spatial points other than the initial and final disk positions. Instead, we looked only at a global property—viz. the energy—whether at the sub-system level of the individual compartments, or at the level of the overall system.
The Two Paradigms Contrasted from Each Other:
If we were to follow Newton’s method, it would be impossible, impossible in principle, to predict the final disk positions unless all their motions over all the intermediate transient dynamics (occurring over each moment of time and at each place of the system) were traced. Newton’s (or vectorial) method would require us to follow all the details of the entire evolution of all parts of the system at each point on its evolution path. In the variational approach, the latter is not of any primary concern.
Yet, in following the energetics program, we are able to predict the final disk positions. We are able to do that without worrying about what all happened before the equilibrium gets established. We remain concerned only with certain global quantities (here, system-energy) at each of the hypothetical positions.
The upside of the energetics program, as just noted, is that we don’t have to look into every detail at every stage of the entire transient dynamics.
Its downside is that we are able to talk only of the differences between certain isolated (hypothetical) configurations or states. The formalism is unable to say anything at all about any of the intermediate states—even if these do actually occur in reality. This is a very, very important point to keep in mind.
Now, the question with which we began this post. Namely, what does the delta of the variational calculus mean?
Referring to the above discussion, note that the delta of the variational calculus is, here, nothing but a change in the position-pair, and also the corresponding change in the energy.
Thus, in the above example, the difference of the state (2,3) from the other close states such as (3,3), (1,3), (2,4), and (2,2) represents a variation in the system configuration (or state), and for each such a variation in the system configuration (or state), there is a corresponding variation in the energy of the system. That is what the delta refers to, in this example.
Now, with all this discussion and clarification, would it be possible for you to clearly state what the physical meaning of the delta is? To what precisely does the concept refer? How does the variation in energy differ from both the finite changes ($\Delta$) as well as the infinitesimal changes ($d$) of the usual calculus?
Note, the question is conceptual in nature. And, no, not a single one of the very best books on classical mechanics manages to give a very succinct and accurate answer to it. Not even Rana and Joag (or Goldstein, or Feynman, or…)
I will give my answer in my next post, next year. I will also try to apply it to a couple of more interesting (and somewhat more complicated) physical situations—one from engineering sciences, and another from quantum mechanics!
In the meanwhile, think about it—the delta—the concept itself, its (conceptual) meaning. (If you already know the calculus of variations, note that in my above write-up, I have already supplied the answer, in a way. You just have to think a bit about it, that’s all!)
An Important Note: Do bring this post to the notice of the Officially Approved Full Professors of Mechanical Engineering in SPPU, and the SPPU authorities. I would like to know if the former would be able to state the meaning—at least now that I have already given the necessary context in such great detail.
Ditto, to the Officially Approved Full Professors of Mechanical Engineering at COEP, esp. D. W. Pande, and others like them.
After all, this topic—Lagrangian mechanics—is at the core of Mechanical Engineering, even they would agree. In fact, it comes from a subject that is not taught to the metallurgical engineers, viz., the topic of Theory of Machines. But it is taught to the Mechanical Engineers. That’s why, they should be able to crack it, in no time.
(Let me continue to be honest. I do not expect them to be able to crack it. But I do wish to know if they are able at least to give a try that is good enough!)
Even though I am jobless (and also nearly bank balance-less, and also cashless), what the hell! …
…Season’s greetings and best wishes for a happy new year!
A Song I Like:
[With jobless-ness and all, my mood isn’t likely to stay this upbeat, but anyway, while it lasts, listen to this song… And, yes, this song is like, it’s like, slightly more than 60 years old!]
(Hindi) “yeh raat bhigee bhigee”
Singers: Manna De and Lata Mangeshkar
I realized that it was the end of November the other day, and it somehow struck me that I should check out if there has been any news on the Infosys prizes for this year. I vaguely recalled that they make the yearly announcements sometime in the last quarter of a year.
Turns out that, although academic bloggers whose blogs I usually check out had not highlighted this news, the prizes had already been announced right in mid-November [^].
It also turns out that, yes, I “know” (i.e., have chatted in person, exactly once, with) one of the recipients. I mean Professor Dr. Umesh Waghmare, who received this year’s award for Engineering Sciences [^]. I had run into him in an informal conference once, and have written about it in a recent post, here [^].
Dr. Waghmare is a very good choice, if you ask me. His work is very neat—I mean both the ideas which he picks out to work on, and the execution on them.
I still remember his presentation at that informal conference (where I chatted with him). He had talked about a (seemingly) very simple idea, related to graphene [^]—its buckling.
Here is my highly dumbed down version of that work by Waghmare and co-authors. (It’s dumbed down a lot: Waghmare et al’s work was on buckling, not bending. But it’s OK; this is just a blog, and I guess I have a pretty general sort of a “general readership” here.)
Bending, in general, sets up a combination of tensile and compressive stresses, which results in the setting up of a bending moment within a beam or a plate. All engineers (except possibly for the “soft” branches like CS and IT) study bending quite early in their undergraduate program, typically in the second year. So, I need not explain its analysis in detail. In fact, in this post, I will write only a common-sense level description of the issue. For technical details, look up the Wiki articles on bending [^] and buckling [^] or Prof. Bower’s book [^].
Assuming you are not an engineer, you can always take a longish rubber eraser, hold it so that its longest edge is horizontal, and then bend it with a twist of your fingers. If the bent shape is like an inverted ‘U’, then, the inner (bottom) surface has got compressed, and the outer (top) surface has got stretched. Since compression and tension are opposite in nature, and since the eraser is a continuous body of a finite height, it is easy to see that there has to be a continuous surface within the volume of the eraser, some half-way through its height, where there can be no stresses. That’s because, the stresses change sign in going from the compressive stress at the bottom surface to the tensile stresses on the top surface. For simplicity of mathematics, this problem is modeled as a 1D (line) element, and therefore, in elasticity theory, this actual 2D surface is referred to as the neutral axis (i.e. a line).
The deformation of the eraser is elastic, which means that it remains in the bent state only so long as you are applying a bending “force” to it (actually, it’s a moment of a force).
The classical theory of bending allows you to relate the curvature of the beam and the bending moment applied to it. Thus, knowing the bending moment (or the applied forces), you can tell how much the eraser should bend. Or, knowing how much the eraser has curved, you can tell how big a pair of forces would have to be applied to its ends. The theory works pretty well; it forms the basis of how most buildings are designed anyway.
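In the classical (Euler-Bernoulli) beam theory, this two-way relation is simply: bending moment = (Young’s modulus) x (second moment of area of the cross-section) x (curvature). Here is a minimal sketch in Python. The eraser’s dimensions, its modulus, and the applied moment are rough, made-up values for illustration only, and the function names are hypothetical.

```python
def second_moment_rectangle(width, height):
    """Second moment of area of a rectangular cross-section: b * h^3 / 12."""
    return width * height ** 3 / 12.0


def curvature_from_moment(moment, youngs_modulus, second_moment):
    """Curvature (1/m) produced by a given bending moment (N*m)."""
    return moment / (youngs_modulus * second_moment)


def moment_from_curvature(curvature, youngs_modulus, second_moment):
    """Bending moment (N*m) needed to hold a given curvature (1/m)."""
    return youngs_modulus * second_moment * curvature


# Assumed, illustrative values for a rubber eraser (not measured data).
WIDTH = 0.02         # m
HEIGHT = 0.01        # m, the dimension through which the stress changes sign
E_RUBBER = 5.0e6     # Pa, a rough order-of-magnitude guess
APPLIED_MOMENT = 0.01  # N*m

I_eraser = second_moment_rectangle(WIDTH, HEIGHT)
kappa = curvature_from_moment(APPLIED_MOMENT, E_RUBBER, I_eraser)
print("curvature (1/m):", kappa, "radius of curvature (m):", 1.0 / kappa)
```

The height enters as a cube, which is why the naive expectation for a sheet only one atom thick is that it should offer almost no resistance to bending.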
So far, so good. What happens if you bend, not an eraser, but a graphene sheet?
The peculiarity of graphene is that it is a single atom-thick sheet of carbon atoms. Your usual eraser contains billions and billions of layers of atoms through its thickness. In contrast, the thickness of a graphene sheet is entirely accounted for by the finite size of the single layer of atoms. And, it is found that unlike thin paper, the graphene sheet, even if it is the most extreme case of a thin sheet, actually does offer a good resistance to bending. How do you explain that?
The naive expectation is that something related to the interatomic bonding within this single layer must, somehow, produce both the compressive and tensile stresses—and the systematic variation from the locally tensile to the locally compressive state as we go through this thickness.
Now, at the scale of single atoms, quantum mechanical effects obviously are dominant. Thus, you have to consider those electronic orbitals setting up the bond. A shift in the density of the single layer of orbitals should correspond to the stresses and strains in the classical mechanics of beams and plates.
What Waghmare related at that conference was a very interesting bit.
He calculated the stresses as predicted by (in my words) the changed local density of the orbitals, and found that the forces predicted this way are way smaller than the experimentally reported values for graphene sheets. In other words, the actual graphene is much stiffer than what the naive quantum mechanics-based model shows—even if the model considers those electronic orbitals. What is the source of this additional stiffness?
He then showed a more detailed calculation (i.e. a simulation), and found that the additional stiffness comes from a quantum-mechanical interaction between the portions of the atomic orbitals that go off transverse to the plane of the graphene sheet.
Thus, suppose a graphene sheet is initially held horizontally, and then bent to form an inverted U-like curvature. According to Waghmare and co-authors, you now have to consider not just the orbital cloud between the atoms (i.e. the cloud lying in the same plane as the graphene sheet) but also the orbital “petals” that shoot vertically off the plane of the graphene. Such petals are attached to the nucleus of each C atom; they are a part of the electronic (or orbital) structure of the carbon atoms in the graphene sheet.
In other words, the simplest engineering sketch for the graphene sheet, as drawn in the front view, wouldn’t look like a thin horizontal line; it would also have these small vertical “pins” at the site of each carbon atom, overall giving it an appearance rather like a fish-bone.
What happens when you bend the graphene sheet is that on the compression side, the orbital clouds for these vertical petals run into each other. Now, you know that an orbital cloud can be loosely taken as the electronic charge density, and that the like charges (e.g. the negatively charged electrons) repel each other. This inter-electronic repulsive force tends to oppose the bending action. Thus, it is the petals’ contribution which accounts for the additional stiffness of the graphene sheet.
I don’t know whether this result was already known to the scientific community back then in 2010 or not, but in any case, it was a very early analysis of bending of graphene. Further, as far as I could tell, the quality of Waghmare’s calculations and simulations was very definitely superlative. … You work in a field (say computational modeling) for some time, and you just develop a “nose” of sorts, that allows you to “smell” a superlative calculation from an average one. Particularly so, if your own skills on the calculations side are rather on the average, as happens to be the case with me. (My strengths are in conceptual and computational sides, but not on the mathematical side.) …
So, all in all, it’s a very well deserved prize. Congratulations, Dr. Waghmare!
A Song I Like:
(The so-called “fusion” music) “Jaisalmer”
Artists: Rahul Sharma (Santoor) and Richard Clayderman (Piano)
[As usual, may be one more editing pass…]