Learnability of machine learning is provably an undecidable?—part 3: closure

Update on 23 January 2019, 17:55 IST:

In this series of posts, which was just a step beyond the initial, brain-storming stage, I had come to the conclusion, based on certain epistemological (and metaphysical) considerations, that Ben-David et al.’s conclusion (that learnability can be undecidable) is logically untenable.

However, as explained here [^], I now find that this particular conclusion of mine was erroneous. I stand corrected, i.e., I now consider Ben-David et al.’s result to be plausible. Obviously, it merits a further, deeper study.

However, even while acknowledging the above-mentioned mistake, let me also hasten to clarify that I still stick to my other positions, especially the central theme of this series of posts. The central theme was that there are certain core features of the set theory which make implications such as Godel’s incompleteness theorems possible. These features (of the set theory) demonstrably carry a glaring epistemological flaw, such that applying Godel’s theorem outside of its narrow technical scope in mathematics or computer science is not permissible. In particular, Godel’s incompleteness theorem does not apply to knowledge or its validation in the more general sense of these terms. This theme, I believe, continues to hold as is.

Update over.


Gosh! I gotta get this series out of my hand—and also head! ASAP, really!! … So, I am going to scrap the bits and pieces I had written for it earlier; they would have turned this series into a 4- or 5-part one. Instead, I am going to start entirely afresh, and I am going to approach this topic from an entirely different angle—a somewhat indirect but a faster route, sort of like a short-cut. Let’s get going.


Statements:

Open any article, research paper, book, or post, and what do you find? Basically, all of these consist of sentence after sentence; that is, a series of statements. That’s all. So, let’s get going at the level of statements, from a “logical” (i.e. logic-theoretical) point of view.

Statements are made to propose or to identify (or at least to assert) some (or the other) fact(s) of reality. That’s what their purpose is.


The conceptual-level consciousness as being prone to making errors:

Coming to the consciousness of man, there are broadly two levels of cognition at which it operates: the sensory-perceptual, and the conceptual.

Examples of the sensory-perceptual level consciousness would consist of reaching a mental grasp of such facts of reality as: “This object exists, here and now;” “this object has this property, to this much degree, in reality,” etc. Notice that what we have done here is to take items of perception, and put them into the form of propositions.

Propositions can be true or false. However, at the perceptual level, a consciousness has no choice in regard to the truth-status. If the item is perceived, that’s it! It’s “true” anyway. Rather, perceptions are not subject to a test of truth or falsehood; they themselves are the very base standards for deciding truth or falsehood.

A consciousness—better still, an organism—does have some choice, even at the perceptual level. The choice which it has exists in regard to such things as: what aspect of reality to focus on, with what degree of focus, with what end (or purpose), etc. But we are not talking about such things here. What matters to us here is just the truth-status, that’s all. Thus, keeping only the truth-status in mind, we can say that this very idea itself (of a truth-status) is inapplicable at the purely perceptual level. However, it is very much relevant at the conceptual level. The reason is that at the conceptual level, the consciousness is prone to err.

The conceptual level of consciousness may be said to involve two different abilities:

  • First, the ability to conceive of (i.e. create) the mental units that are the concepts.
  • Second, the ability to connect together the various existing concepts to create propositions which express different aspects of the truths pertaining to them.

It is possible for a consciousness to go wrong in either of the two respects. However, mistakes are much easier to make in the second respect.

Homework 1: Supply an example of going wrong in the first way, i.e., right at the stage of forming concepts. (Hint: Take a concept that is at least somewhat higher-level so that mistakes are easier in forming it; consider its valid definition; then modify its definition by dropping one of its defining characteristics and substituting a non-essential in it.)

Homework 2: Supply a few examples of going wrong in the second way, i.e., in forming propositions. (Hint: I guess almost any logical fallacy can be taken as a starting point for generating examples here.)


Truth-hood operator for statements:

As seen above, statements (i.e. complete sentences that formally can be treated as propositions) made at the conceptual level can, and do, go wrong.

We therefore define a truth-hood operator which, when it operates on a statement, yields the result as to whether the given statement is true or non-true. (Aside: Without getting into further epistemological complexities, let me note here that I reject the idea of the arbitrary, and thus regard non-true as nothing but a sub-category of the false. Thus, in my view, a proposition is either true or it is false. There is no middle (as Aristotle said), or even an “outside” (like the arbitrary) to its truth-status.)

Here are a few examples of applying the truth-status (or truth-hood) operator to a statement:

  • Truth-hood[ California is not a state in the USA ] = false
  • Truth-hood[ Texas is a state in the USA ] = true
  • Truth-hood[ All reasonable people are leftists ] = false
  • Truth-hood[ All reasonable people are rightists ] = false
  • Truth-hood[ Indians have significantly contributed to mankind’s culture ] = true
  • etc.

For ease in writing and manipulation, we propose to give names to statements. Thus, first declaring

A: California is not a state in the USA

and then applying the Truth-hood operator to “A”, is fully equivalent to applying this operator to the entire sentence appearing after the colon (:) symbol. Thus,

Truth-hood[ A ] <==> Truth-hood[ California is not a state in the USA ] = false
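The naming convention and the Truth-hood operator can be sketched in code. What follows is a minimal toy model of my own making (the `truth_hood` function, the fact base, and the statement encodings are all illustrative assumptions, not any standard formalism): statement bodies are predicates over a tiny “world,” and applying the operator to a name is the same as applying it to the body.

```python
# A toy model of named statements and a Truth-hood operator.
# The "world" and all names below are illustrative assumptions.

US_STATES = {"California", "Texas"}

# Each named statement's body is a zero-argument predicate over the world.
statements = {
    "A": lambda: "California" not in US_STATES,  # "California is not a state in the USA"
    "B": lambda: "Texas" in US_STATES,           # "Texas is a state in the USA"
}

def truth_hood(name):
    """Truth-hood[ name ] is fully equivalent to Truth-hood[ body ]."""
    return statements[name]()

print(truth_hood("A"))  # False
print(truth_hood("B"))  # True
```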


Just a bit of the computer languages theory: terminals and non-terminals:

To take a short-cut through this entire theory, we would like to approach the idea of statements from a somewhat abstract perspective. Accordingly, borrowing some terminology from the area of computer languages, we define and use two types of symbols: terminals and non-terminals. The overall idea is this. We regard any program (i.e. a “write-up”) written in any computer language as consisting of a sequence of statements. A statement, in turn, consists of a certain well-defined arrangement of words or symbols. Now, we observe that symbols (or words) can be either terminals or non-terminals.

You can think of a non-terminal symbol in different ways: as a higher-level or more abstract word, or as a “potent” symbol. A non-terminal symbol has a “definition”—i.e., an expansion rule. (In CS, it is customary to call an expansion rule a “production” rule.) Here is a simple example of a non-terminal and its expansion:

  • P => S1 S2

where the symbol “=>” is taken to mean things like: “is the same as,” “is fully equivalent to,” or “expands to.” What we have here is an example of an abstract statement. We interpret this statement as follows. Wherever you see the symbol “P,” you may substitute it with the sequence of the two symbols S1 and S2, written in that order (and without anything else coming in between them).

Now consider the following non-terminals, and their expansion rules:

  • P1 => P2 P S1
  • P2 => S3

The question is: Given the expansion rules for P, P1, and P2, what exactly does P1 mean? What precisely does it stand for?

Answer:

  • P1 => (P2) P S1 => S3 (P) S1 => S3 S1 S2 S1

In the above, we first take the expansion rule for P1. Then, we expand the P2 symbol in it. Finally, we expand the P symbol. When no non-terminal symbol is left to expand, we arrive at our answer that “P1” means the same as “S3 S1 S2 S1.” We could have said the same fact using the colon symbol, because the colon (:) and the “expands to” symbol “=>” mean one and the same thing. Thus, we can say:

  • P1: S3 S1 S2 S1

The left hand-side and the right hand-side are fully equivalent ways of saying the same thing. If you want, you may regard the expression on the right hand-side as a “meaning” of the symbol on the left hand-side.
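The expansion game above is easy to mechanize. Here is a minimal sketch of my own encoding of the three production rules as a dict, with a recursive `expand` function (both names are my choices, not standard terminology):

```python
# Production rules from the text: non-terminals map to their expansions.
productions = {
    "P":  ["S1", "S2"],
    "P1": ["P2", "P", "S1"],
    "P2": ["S3"],
}

def expand(symbol):
    """Recursively expand a symbol until only terminals remain.

    A symbol with no production rule is a terminal and stands for
    itself; a non-terminal is replaced by the expansion of each
    symbol in its rule, in order.
    """
    if symbol not in productions:  # terminal: no further expansion
        return [symbol]
    result = []
    for s in productions[symbol]:
        result.extend(expand(s))
    return result

print(expand("P1"))  # ['S3', 'S1', 'S2', 'S1']
```

This reproduces, mechanically, exactly the hand-worked derivation P1 => S3 S1 S2 S1.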

It is at this point that we are able to understand the terms: terminals and non-terminals.

The symbols which do not have any further expansion for them are called, for obvious reasons, the terminal symbols. In contrast, non-terminal symbols are those which can be expanded in terms of an ordered sequence of non-terminals and/or terminals.

We can now connect our present discussion (which is in terms of computer languages) to our prior discussion of statements (which is in terms of symbolic logic), and arrive at the following correspondence:

The name of every named statement is a non-terminal; and the statement body itself is an expansion rule.

This correspondence works also in the reverse direction.

You can always think of a non-terminal (from a computer language) as the name of a named proposition or statement, and you can think of an expansion rule as the body of the statement.

Easy enough, right? … I think that we are now all set to consider the next topic, which is: liar’s paradox.


Liar’s paradox:

The liar paradox is a topic from the theory of logic [^]. It has been resolved by many people in different ways. We would like to treat it from the viewpoint of the elementary computer languages theory (as covered above).

The simplest example of the liar paradox is, using the terminology of the computer languages theory, the following named statement or expansion rule:

  • A: A is false.

Notice, it wouldn’t be a paradox if the same non-terminal symbol, viz. “A” were not to appear on both sides of the expansion rule.

To understand why the above expansion rule (or “definition”) involves a paradox, let’s get into the game.

Our task will be to evaluate the truth-status of the named statement that is “A”. This is the “A” which comes on the left hand-side, i.e., before the colon.

In symbolic logic, a statement is nothing but its expansion; the two are exactly and fully identical, i.e., they are one and the same. Accordingly, to evaluate the truth-status of “A” (the one which comes before the colon), we consider its expansion (which comes after the colon), and get the following:

  • Truth-hood[ A ] = Truth-hood[ A is false ] = false           (equation 1)

Alright. From this point onward, I will drop explicitly writing down the Truth-hood operator. It is still there; it’s just that to simplify typing out the ensuing discussion, I am not going to note it explicitly every time.

Anyway, coming back to the game, what we have got thus far is the truth-hood status of the given statement in this form:

  • A: “A is false”

Now, realizing that the “A” appearing on the right hand-side itself also is a non-terminal, we can substitute for its expansion within the aforementioned expansion. We thus get to the following:

  • A: “(A is false) is false”

We can apply the Truth-hood operator to this expansion, and thereby get the following: The statement which appears within the parentheses, viz., the “A is false” part, itself is false. Accordingly, the Truth-hood operator must now evaluate thus:

  • Truth-hood[ A ] = Truth-hood[ A is false ] = Truth-hood[ (A is false) is false ] = Truth-hood[ A is true ] = true            (equation 2)

Fun, isn’t it? Initially, via equation 1, we got the result that A is false. Now, via equation 2, we get the result that A is true. That is the paradox.

But the fun doesn’t stop there. It can continue. In fact, it can continue indefinitely. Let’s see how.

If only we were not to halt the expansions, i.e., if only we continue a bit further with the game, we could have just as well made one more expansion, and got to the following:

  • A: ((A is false) is false) is false.

The Truth-hood status of the immediately preceding expansion now is: false. Convince yourself that it is so. Hint: Always expand the inner-most parentheses first.

Homework 3: Convince yourself that what we get here is an indefinitely long alternating sequence of the Truth-hood statuses: A is false, A is true, A is false, A is true, …
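Homework 3 can also be checked mechanically. A small sketch, assuming my own string encoding of the expansions and the observation that each additional “… is false” wrapper flips the truth-status once:

```python
# Sketch of Homework 3: expand "A: A is false" n times and read off
# the alternating truth-status. The encoding is my own illustration.

def expand_liar(n):
    """Return the statement body after n substitutions of A."""
    body = "A is false"
    for _ in range(n):
        body = "(" + body + ") is false"
    return body

def truth_status(n):
    """Each extra "... is false" wrapper flips the truth-status.

    n = 0 gives "A is false", evaluated as false (equation 1);
    each further expansion negates that result once more.
    """
    return n % 2 == 1

for n in range(4):
    print(expand_liar(n), "->", "true" if truth_status(n) else "false")
```

The printed sequence alternates false, true, false, true, … indefinitely, which is exactly the non-unique truth-status discussed below.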

What can we say by way of a conclusion?

Conclusion: The truth-status of “A” is not uniquely decidable.

The emphasis is on the word “uniquely.”

We have used all the seemingly simple rules of logic, and yet have stumbled on to the result that, apparently, logic does not allow us to decide something uniquely or meaningfully.


Liar’s paradox and the set theory:

The importance of the liar paradox to our present concerns is this:

Godel himself believed, correctly, that the liar paradox was a semantic analogue to his Incompleteness Theorem [^].

Go read the Wiki article (or anything else on the topic) to understand why. For our purposes here, I will simply point out what the connection of the liar paradox is to the set theory, and then (more or less) call it a day. The key observation I want to make is the following:

You can think of every named statement as an instance of an ordered set.

What the above key observation does is to tie the symbolic logic of propositions with the set theory. We thus have three equivalent ways of describing the same idea: symbolic logic (the name of a statement and its body), computer languages theory (non-terminals and their expansions to terminals), and set theory (the label of an ordered set and its enumeration).

As an aside, the set in question may have further properties, or further mathematical or logical structures and attributes embedded in itself. But at its minimal, we can say that the name of a named statement can be seen as a non-terminal, and the “body” of the statement (or the expansion rule) can be seen as an ordered set of some symbols—an arbitrarily specified sequence of some (zero or more) terminals and (zero or more) non-terminals.

Two clarifications:

  • Yes, in case there is no sequence in a production at all, it can be called the empty set.
  • When you have the same non-terminal on both sides of an expansion rule, it is said to form a recursion relation.

An aside: It might be fun to convince yourself that the liar paradox cannot be posed or discussed in terms of Venn’s diagram. By the simple intuitive notions we all bring to bear on Venn’s diagram, the “sheet” on which the diagram is drawn cannot carry a “recursion” relation.

Yes, the set theory itself was always “powerful” enough to allow for recursions. People like Godel merely made this feature explicit, and took full “advantage” of it.


Recursion, the continuum, and epistemological (and metaphysical) validity:

In our discussion above, I had merely asserted, without giving even a hint of a proof, that the three ways (viz., the symbolic logic of statements or propositions, the computer languages theory, and the set theory) were all equivalent ways of expressing the same basic idea (i.e. the one which we are concerned about, here).

I will now once again make a few more observations, but without explaining them in detail or supplying even an indication of their proofs. The factoids I must point out are the following:

  • You can start with the natural numbers, and by using simple operations such as addition and its inverse, and multiplication and its inverse, you can reach the real number system. The generalization goes as: Natural to Whole to Integers to Rationals to Reals. Another name for the real number system is: the continuum.
  • You can use the computer languages theory to generate a machine representation for the natural numbers. You can also mechanize the addition etc. operations. Thus, you can “in principle” (i.e. with infinite time and infinite memory) represent the continuum in the CS terms.
  • Generating a machine representation for natural numbers requires the use of recursion.
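To make the third bullet concrete, here is a minimal sketch of a recursive (Peano-style) machine representation of the natural numbers, with addition defined recursively on that representation. The nested-tuple encoding and all names are my own illustrative choices:

```python
# Peano-style naturals: zero plus a recursive successor construction.
ZERO = ()

def succ(n):
    """Successor: wrap n in one more layer of nesting."""
    return (n,)

def add(m, n):
    """Recursive addition: m + 0 = m; m + succ(k) = succ(m + k)."""
    if n == ZERO:
        return m
    return succ(add(m, n[0]))

def to_int(n):
    """Decode the nested-tuple representation back to a Python int."""
    return 0 if n == ZERO else 1 + to_int(n[0])

two = succ(succ(ZERO))
three = succ(two)
print(to_int(add(two, three)))  # 5
```

Note that both the representation (`succ`) and the operation (`add`) are irreducibly recursive, which is the point of the bullet above.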

Finally, a few words about epistemological (and metaphysical) validity.

  • The concepts of numbers (whether natural or real) have a logical precedence, i.e., they come first. The entire arithmetic and the calculus must come before the computer-representation of some of their concepts does.
  • A machine-representation (or, equivalently, a set-theoretic representation) is merely a representation. That is to say, it captures only some aspects or attributes of the actual concepts from maths (whether arithmetic or the continuum hypothesis). This issue is exactly like what we saw in the first and second posts in this series: a set is a concrete collection, unlike a concept which involves a consciously cast unit perspective.
  • If you try to translate the idea of recursion into the usual cognitive terms, you get absurdities such as: you can be your own child, literally speaking. Not in the sense that, using scientific advances in biology, you can create a clone of yourself and regard that clone to be both yourself and your child. No, not that way. (Such a clone would in any case be your twin, not your child.) The idea here is even worse: it is that you can literally father your own self.
  • Aristotle got it right. Look up the distinction between completed processes and the uncompleted ones. Metaphysically, only those objects or attributes can exist which correspond to completed mathematical processes. (Yes, as an extension, you can throw in the finite limiting values, too, provided they otherwise do mean something.)
  • Recursion by very definition involves not just absence of completion but the essence of the very inability to do so.

Closure on the “learnability issue”:

Homework 4: Go through the last two posts in this series as well as this one, and figure out that the only reason that the set theory allows a “recursive” relation is that a set is, by the design of the set theory, a concrete object whose definition does not have to involve an epistemologically valid process—a unit perspective as in a properly formed concept—and so, its name does not have to stand for an abstract mentally held unit. Call this happenstance “The Glaring Epistemological Flaw of the Set Theory” (or TGEFST for short).

Homework 5: Convince yourself that any lemma or theorem that makes use of Godel’s Incompleteness Theorem is necessarily based on TGEFST, and for the same reason, its truth-status is: it is not true. (In other words, any lemma or theorem based on Godel’s theorem is an invalid or untenable idea, i.e., essentially, a falsehood.)

Homework 6: Realize that the learnability issue, as discussed in Prof. Lev Reyzin’s news article (discussed in the first part of this series [^]), must be one that makes use of Godel’s Incompleteness Theorem. Then convince yourself that for precisely the same reason, it too must be untenable.

[Yes, Betteridge’s law [^] holds.]


Other remarks:

Remark 1:

As “asymptotical” pointed out at the relevant Reddit thread [^], the authors themselves say, in another paper posted at arXiv [^] that

While this case may not arise in practical ML applications, it does serve to show that the fundamental definitions of PAC learnability (in this case, their generalization to the EMX setting) is vulnerable in the sense of not being robust to changing the underlying set theoretical model.

What I now remark here is stronger. I am saying that it can be shown, on rigorously theoretical (epistemological) grounds, that the “learnability as undecidable” thesis by itself is, logically speaking, entirely and in principle untenable.

Remark 2:

Another point. My preceding conclusion does not mean that the work reported in the paper itself is, in all its aspects, completely worthless. For instance, it might perhaps come in handy while characterizing some tricky issues related to learnability. I certainly do admit of this possibility. (To give a vague analogy, this is something like running, in a mathematically somewhat novel way, into a known type of mathematical singularity.) Of course, I am not competent enough to judge how valuable the work of the paper(s) might turn out to be, in narrow technical contexts like that.

However, what I can, and will say is this: the result does not—and cannot—bring the very learnability of ANNs itself into doubt.


Phew! First, Panpsychism, and immediately then, Learnability and Godel. … I’ve had to deal with two untenable claims back to back here on this blog!

… My head aches….

… Code! I have to write some code! Or write some neat notes on ML in LaTeX. Only then will, I guess, my head stop aching so much…

Honestly, I just downloaded TensorFlow yesterday, and configured an environment for it in Anaconda. I am excited, and look forward to trying out some tutorials on it…

BTW, I also honestly hope that I don’t run into anything untenable, at least for a few weeks or so…

…BTW, I also feel like taking a break… Maybe I should go visit IIT Bombay or some place in Konkan. … But there are money constraints… Anyway, bye, really, for now…


A song I like:

(Marathi) “hirvyaa hirvyaa rangaachi jhaaDee ghanadaaTa”
Music: Sooraj (the pen-name of “Shankar” from the Shankar-Jaikishan pair)
Lyrics: Ramesh Anavakar
Singers: Jaywant Kulkarni, Sharada


[Any editing would be minimal; guess I will not even note it down separately.] Did an extensive revision by 2019.01.21 23:13 IST. Now I will leave this post in the shape in which it is. Bye for now.


Learnability of machine learning is provably an undecidable?—part 2



In this post, we look into the differences of the idea of sets from that of concepts. The discussion here is exploratory, and hence, not very well isolated. There are overlaps of points between sections. Indeed, there are going to be overlaps of points from post to post too! The idea behind this series of posts is not to present a long thought out and matured point of view; it is much in the nature of jotting down salient points and trying to bring some initial structure to them. Thus the writing in this series is just a step further from the stage of brain-storming, really speaking.

There is no direct discussion in this post regarding the learnability issue at all. However, the points we note here are crucial to understanding Godel’s incompleteness theorem, and in that sense, the contents of this post are crucially important in framing the learnability issue right.

Anyway, let’s get going over the differences of sets and concepts.


A concept as an abstract unit of mental integration:

Concepts are mental abstractions. It is true that concepts, once formed, can themselves be regarded as mental units, and qua units, they can further be integrated together into even higher-level concepts, or possibly sub-divided into narrower concepts. However, regardless of the level of abstraction at which a given concept exists, the concretes being subsumed under it are necessarily required to be less abstract than the single mental unit that is the concept itself.

Using the terms of computer science, the “graph” of a concept and its associated concrete units is not only acyclic and directional (from the concretes to the higher-level mental abstraction that is the concept), its connections too can be drawn if and only if the concretes satisfy the rules of conceptual commensurability.

A concept is necessarily a mental abstraction, and as a unit of mental integration, it always exists at a higher level of abstraction as compared to the units it subsumes.


A set as a mathematical object that is just a concrete collection:

Sets, on the other hand, necessarily are just concrete objects in themselves, even if they do represent collections of other concrete objects. Sets take birth as concrete objects—i.e., as objects that don’t have to represent any act of mental isolation and integration—and they remain that way till the end of their life.

For the same reason, set theory carries absolutely no rules whereby constraints can be placed on combining sets. No meaning is supposed to be assigned to the very act of placing braces around the rule which defines the admissibility of objects as members of a set (or around the enumeration of its member objects).

The act of creating the collection that is a set is formally allowed to proceed even in the absence of any preceding act of mental differentiations and integrations.

This distinction between these two ideas, the idea of a concept, and that of a set, is important to grasp.


An instance of a mental abstraction vs. a membership into a concrete collection:

In the last post in this series, I had used the terminology in a particular way: I had said that there is a concept “table,” and that there is a set of “tables.” The plural form for the idea of the set was not a typo; it was a deliberate device to highlight this same significant point, viz., the essential concreteness of any set.

The mathematical theory of sets didn’t have to be designed this way, but given the way it anyway has actually been designed, one of the inevitable implications of its conception—its very design—has been this difference which exists between the ideas of concepts and sets. Since this difference is extremely important, it may be worth our while to look at it from yet another viewpoint.

When we look at a table and, having already reached the concept of “table,” we affirm that the given concrete table in front of us is indeed a table, this seemingly simple and almost instantaneously completed act of recognition itself implicitly involves a complex mental process. The process includes invoking a previously generated mental integration—an integration which was, sometime in the past, performed in reference to those attributes which actually exist in reality and which make a concrete object a table. The process begins with the availability of this context as a pre-requisite, and now involves an application of the concept. It involves actively bringing forth the pre-existing mental integration, actively “seeing” that yet another concrete instance of a table does indeed, in reality, carry the attributes which make an object a table, and thereby concluding that it is a table.

In other words, if you put the concept table symbolically as:

table = { this table X, that table Y, now yet another table Z, … etc. }

then it is understood that what the symbol on the left hand side stands for is a mental integration, and that each of the concrete entities X, Y, Z, etc. appearing in the list on the right hand-side is, by itself, an instance corresponding to that unit of mental integration.

But if you interpret the same “equation” as one standing for the set “tables”, then strictly speaking, according to the actual formalism of the set theory itself (i.e., without bringing into the context any additional perspective which we by habit do, but sticking strictly only to the formalism), each of the X, Y, Z etc. objects remains just a concrete member of a merely concrete collection or aggregate that is the set. The mental integration which regards X, Y, Z as equally similar instances of the idea of “table” is missing altogether.

Thus, no idea of similarity (or of differences) among the members gets involved at all, because there is no mental abstraction “table” in the first place. There are only concrete tables, and there is a well-specified but concrete object, a collective, which is only formally defined to stand for this concrete collection (of those specified tables).

Grasp this difference, and the incompleteness paradox brought forth by Godel begins to dissolve away.


The idea of an infinite set cuts out the preceding theoretical context:

Since the aforementioned point is complex but important, there is no risk in repeating (though there could be boredom!):

There is no place-holder in the set theory which would be equivalent to saying: “being able to regard concretes as the units of an abstract, singular, mental perspective—a perspective reached in recognition of certain facts of reality.”

The way set theory progresses in this regard is indeed extreme. Here is one way to look at it.

The idea of an infinite set is altogether inconceivable before you have first grasped the concept of infinity. On the other hand, grasping the concept of infinity can be accomplished without any involvement of the set theory anyway—formally or informally. However, since every set you actually observe in concrete reality can only be finite, and since sets themselves are concrete objects, there is no way to conceive of the very idea of an infinite set unless you already know what infinity means (at least in some working, implicit sense). Thus, to generate the concrete members contained in a given infinite set, you of course need the conceptual knowledge of infinite sequences and series.

However, even if the set theory must use this theoretical apparatus of analysis, the actual mathematical object it ends up having still captures only the “concrete-collection” aspect of it—none other. In other words, the set theory drops from its very considerations some of the crucially important aspects of knowledge with which infinite sets can at all be conceived of. For instance, it drops the idea that the infinite set-generating rule is in itself an abstraction. The set theory asks you to supply and use that rule. The theory itself is merely content in being supplied some well-defined entities as the members of a set.
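As a small illustration of this point (my own, not drawn from the set theory itself): in code, an “infinite set” can be held only via its generating rule—an abstraction—while any actually materialized collection is necessarily finite and concrete.

```python
from itertools import islice

def evens():
    """Generating rule for the 'infinite set' of even naturals.

    The rule itself is the abstraction; no complete concrete
    collection of its members can ever exist.
    """
    n = 0
    while True:
        yield n
        n += 2

# Only a finite, concrete portion can ever be materialized:
first_five = list(islice(evens(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```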

It is at places like this that the infamous incompleteness creeps into the theory—I mean, the theory of sets, not the theory that is the analysis as was historically formulated and practiced.


The name of a set vs. the word that stands for a concept:

The name given to a set (the symbol or label appearing on the left hand-side of the equation) is just an arbitrary and concrete label; it is not a theoretical place-holder for the corresponding mental concept—not so long as you remain strictly within the formalism, and therefore the scope of application, of the set theory.

When they introduce you to the set theory in your high-school, they take care to choose each of the examples in only such a way that there always is an easy-to-invoke and well-defined concept; this pre-existing concept can then be put into a 1:1 correspondence with the definition of that particular set.

But if you therefore begin thinking that there is a well-defined concept for each possible instance of a set, then such a characterization is only a figment of your own imagination. An idea like this is certainly not to be found in the actual formalism of the set theory.

Show me the place in the axioms, or their combinations, or theorems, or even just lemmas or definitions in the set theory where they say that the label for a set, or the rule for formation of a set, must always stand for a conceptually coherent mental integration. Such an idea is simply absent from the mathematical theory.

The designers of the set theory, to put it directly, simply didn’t have the wits to include such ideas in their theory.


Implications for the allowed operations:

The reason why the set theory allows for any arbitrary operands (including those which don’t make any sense in the real world) is, thus, not an accident. It is a direct consequence of the fact that sets are, by design, concrete aggregates, not mental integrations based on certain rules of cognition (which in turn must make a reference to the actual characteristics and attributes possessed by the actually existing objects).

Since sets are mere aggregations, not integrations, as a consequence, we no longer remain concerned with the requirement that the concrete objects being put together must share common characteristics, or with the problem of having to pick the most fundamental one among them.

When it comes to sets, there are no such constraints on the further manipulations. Thus arises the possibility of being able to apply any operator, in any which way you feel like, to any given set.


Godel’s incompleteness theorem as merely a consequence:

Given such a nature of the set theory—its glaring epistemological flaws—something like Kurt Godel’s incompleteness theorem had to arrive on the scene, sooner or later. The theorem succeeds only because the set theory (on which it is based) does give it what it needs—viz., a loss of the connection between a word (a set label) and how it is meant to be used (the contexts in which it can be further used, and how).


In the next part, we will reiterate some of these points by looking at the issue of (i) systems of axioms based on the set theory on the one hand, and (ii) the actual conceptual body of knowledge that is arithmetic, on the other hand. We will recast the discussion so far in terms of the “is a” vs. the “has a” types of relationships. The “is a” relationship may be described as the “is an instance of a mental integration or concept of” relationship. The “has a” relationship may be described as “is (somehow) defined (in whatever way) to carry the given concrete” type of a relationship. If you are curious, here is the preview: concepts allow for both types of relationships to exist; however, for defining a concept, the “is an instance or unit of” relationship is crucially important. In contrast, the set theory requires and has the formal place for only the “has a” type of relationships. A necessary outcome is that each set itself must remain only a concrete collection.
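Since the “is a” vs. “has a” distinction also has a familiar counterpart in programming, here is a minimal Python sketch of it (the class names are mine, purely illustrative): inheritance expresses the “is an instance or kind of” relationship, while mere containment expresses “has a”—and a bare container, like a set, carries no commitment at all about what its members conceptually are.

```python
class Furniture:
    """A concept: its instances *are* pieces of furniture."""
    pass

class Table(Furniture):   # "is a": a Table is a kind of Furniture
    pass

class Room:               # "has a": a Room merely contains things
    def __init__(self, contents):
        self.contents = contents  # no constraint on what goes in

t = Table()
print(isinstance(t, Furniture))  # True: the "is a" relationship holds

# A bare container, like a set, accepts anything whatsoever:
r = Room([t, 42, "venture capitalist"])
print(len(r.contents))  # 3
```

The type system enforces the “is a” relation (`isinstance` checks it), but nothing constrains what a `Room`—or a set—may “have.”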

 

Learnability of machine learning is provably an undecidable?—part 1



This one news story has been lying around for about a week on my Desktop:

Lev Reyzin, “Unprovability comes to machine learning,” Nature, vol. 565, pp. 166–167, 10 January 2019 [^]. PDF here: [^]

(I’ve forgotten how I came to know about it though.) The story talks about the following recent research paper:

Ben-David et al., “Learnability can be undecidable,” Nature Machine Intelligence, vol. 1, pp. 44–48, January 2019 [^]. PDF here: [^]

I don’t have the requisite background in the theory of the research paper itself, and so didn’t even try to read through it. However, I did give Reyzin’s news article a try. It was not very successful; I have not been able to finish this story yet. However, here are a few notings which I made as I tried to progress through it. The quotations here all come from Reyzin’s news story.

Before we begin, take a moment to notice that the publisher here is arguably the most reputed one in science, viz., the Nature publishing group. As to the undecidability of learnability, its apparent implications for practical machine learning, artificial intelligence, etc., are too obvious to be pointed out separately.


“During the twentieth century, discoveries in mathematical logic revolutionized our understanding of the very foundations of mathematics. In 1931, the logician Kurt Godel showed that, in any system of axioms that is expressive enough to model arithmetic, some true statements will be unprovable.”

Is it because Godel [^] assumed that any system of axioms (which is expressive enough to model arithmetic) would be based on the standard (i.e. mathematical) set theory? If so, his conclusion would not be all that paradoxical, because the standard set theory carries, from an epistemological angle, certain ill-conceived notions at its core. [BTW, throughout this (short) series of posts, I use Ayn Rand’s epistemological theory; see ITOE, 2e [^][^].]


To understand my position (that the set theory is not epistemologically sound), start with a simple concept like “table”.

According to Ayn Rand’s ITOE, the concept “table” subsumes all possible concrete instances of tables, i.e., all the tables that conceivably exist, might have ever existed, and might ever exist in future, i.e., a potentially infinite number of concrete instances of them. Ditto, for any other concept, e.g., “chair.” Concepts are mental abstractions that stand for an infinite number of concretes of a given kind.

Now, let’s try to run away from philosophy, and thereby come to rest in the arms of, say, a mathematical logician like Kurt Godel [^], or preferably, his predecessors, those who designed the mathematical set theory [^].

The one (utterly obvious) way to capture the fact that there exist tables, but only using the actual terms of the set theory, is to say that there is a set called “tables,” and that its elements consist of all possible tables (i.e., all the tables that might have existed, might conceivably exist, and would ever conceivably exist in future). Thus, the notion again refers to an infinity of concretes. Put into the terms of the set theory, the set of tables is an infinite set.

OK, that seems to work. How about chairs? Once again, you set up a set, now called “chairs,” and proceed to dump within its braces every possible or conceivable chair.

So far, so good. No trouble until now.


The trouble begins when you start applying operators to the sets, say by combining them via unions, or by taking their intersections, and so on—all that Venn’s diagram business [^]. But what is the trouble with the good old Venn diagrams, you ask? Well, the trouble lies not so much with the Venn diagrams as with the basic set theory itself:

The set theory makes the notion of the set so broad that it allows you to combine any sets in any which way you like, and still be able to call the result a meaningful set—meaningful, as seen strictly from within the set theory.

Here is an example. You can not only combine (take the union of) “tables” and “chairs” into a broader set called “furniture”; you are also equally well allowed, by the formalism of the set theory, to absorb into the same set all unemployed but competent programmers, Indian HR managers, and venture capitalists from the San Francisco Bay Area. The set theory does not, by itself, have anything in its theoretical structure, formalism, or even its repertoire of mathematical applications, using which it could possibly so much as raise a finger in such matters. This is a fact. If in doubt, refer to the actual set theory ([^] and links therein), take it strictly on its own terms, and in particular, desist from mixing into it any extra interpretations of your own.

Epistemology, on the other hand, does have theoretical considerations, including explicitly formulated rules at its core, which together allow us to distinguish between proper and improper formulations of concepts. For example, there is a rule that the concrete instances being subsumed under a concept must themselves be conceptually commensurate, i.e., they must possess the same defining characteristics, even if possibly to differing degrees. Epistemology prevents the venture capitalists from the San Francisco Bay Area from being called pieces of furniture because it clarifies that they are people, whereas pieces of furniture are inanimate objects, and for this crucial reason, the two are conceptually incommensurate—they cannot be integrated together into a common concept.

To come back to the set theory: it, however, easily allows for every abstractly conceivable “combination” of every possible operand set (or sets). Whether the operation has any cognitive merit to it or not, whether it results in anything meaningful at all or not, is not at all a consideration—not by the design of the set theory itself (which, many people suppose, is more fundamental than every other theory).

So—and get this right—calling the collection of QC scientists a set of politicians or scoundrels is not at all an abuse of the mathematical structure, content, or meaning of the set theory. The ability to take the intersection of the set of all mathematicians who publish papers and the set of all morons is not a bug; it is very much a basic and core feature of the set theory. There is absolutely nothing in the theory itself which says that the intersection operator cannot be applied here, or that the resulting set has to be empty. None.
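The point is easy to demonstrate in any language with a set type; here is a minimal Python sketch (the element strings are mine, purely illustrative): the union and intersection operators apply to any operands whatsoever, and nothing in the formalism objects.

```python
furniture = {"table", "chair"}
bay_area_vcs = {"VC Alice", "VC Bob"}

# The formalism happily takes the union of conceptually incommensurate sets:
weird = furniture | bay_area_vcs
print(sorted(weird))  # ['VC Alice', 'VC Bob', 'chair', 'table']

# ...and the intersection of any two sets is equally well-defined;
# nothing in set theory says it *must* be empty—here it just happens to be.
print(furniture & bay_area_vcs)  # set()
```

No error is raised, and none can be: the operators are total over all sets, exactly as the formalism intends.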

The set theory very much neglects considerations of the kind of label a set carries, and of the kind of elements which can be added to it.

More on this, later. (This post has already gone past 1000 words.)


The songs section will come at the conclusion of this (short) series of posts, to be completed soon enough; stay tuned…

Some running thoughts on ANNs and AI—1

Go, see if you want to have fun with the attached write-up on ANNs [^] (but please also note the version time carefully—the write-up could change without any separate announcement).

The write-up is more in the nature of the very informal blabber that goes on when people work something out on a research blackboard (or while mentioning something about their research to friends, or during a brain-storming session, or while jotting things on the back of an envelope, or something similar).

 


A “song” I don’t like:

(Marathi) “aawaaj waaDaw DJ…”
“Credits”: Go, figure [^]. E.g., here [^]. Yes, the video too is (very strongly) recommended.


Update on 05 October 2018 10:31 IST:

Psychic attack on 05 October 2018 at around 00:40 IST (i.e. the night between 4th and 5th October, IST).

 

Are the recent CS graduates from India that bad?

In the recent couple of weeks, I had not found much time to check out blogs on a very regular basis. But today I did find some free time, and so I did do a routine round-up of the blogs. In the process, I came across a couple of interesting posts by Prof. Dheeraj Sanghi of IIIT Delhi. (Yes, it’s IIIT Delhi, not IIT Delhi.)

The latest post by Prof. Sanghi is about achieving excellence in Indian universities [^]. He offers valuable insights by taking a specific example, viz., that of the IIIT Delhi. I would like to leave this post for the attention of [who else] the education barons in Pune and the SPPU authorities. [Addendum: Also this post [^] by Prof. Pankaj Jalote, Director of IIIT Delhi.]

Prof. Sanghi’s second (i.e. earlier) post is about the current (dismal) state of the CS education in this country. [^].

As someone who has direct work-experience both in the IT industry and in teaching in mechanical engineering departments in “private” engineering colleges in India, the general impression I have developed seemed to be a bit at odds with what was being reported in this post by Prof. Sanghi (and by his readers, in its comments section). Of course, Prof. Sanghi was restricting himself only to the CS graduates, but still, the comments did hint at the overall trend, too.

So, I began writing a comment at Prof. Sanghi’s blog, but, as usual, my comment soon grew too big. It became big enough that I finally had to convert it into a separate post here. Let me share these thoughts of mine, below.


As compared to the CS graduates in India, and speaking in strictly relative terms, the mechanical engineering students seem to be doing better, much better, as far as the actual learning being done over the 4 UG years is concerned. Not just the top 1–2%, but even the top 15–20% of the mechanical engineering students, perhaps even the top quarter, do seem to be doing fairly OK—even if it could be, perhaps, only at a minimally adequate level when compared to international standards.

… No, even for the top quarter of the total student population (in mechanical engineering, in “private” colleges), their fundamental concepts aren’t always as clear as they need to be. More important, excepting the top (may be) 2–5%, others within the top quarter don’t seem to be learning the art of conceptual analysis of mathematics as such. They probably would not always be able to figure out the meaning of even the simplest variation on an equation they have already studied.

For instance, even after completing a course (or one-half part of a semester-long course) on vibrations, if they are shown the following equation for the classical transverse waves on a string:

\dfrac{\partial^2 \psi(x,t)}{\partial x^2} + U(x,t) = \dfrac{1}{c^2}\dfrac{\partial^2 \psi(x,t)}{\partial t^2},

most of them wouldn’t be able to tell the physical meaning of the second term on the left hand-side—not even if they are asked to work on it purely at their own convenience, at home, and not on-the-fly and under pressure, say during a job interview or a viva voce examination.

However, change the notation used for the second term from U(x,t) to S(x,t) or F(x,t), and then, suddenly, the bulb might flash on—but only for some of the top quarter, not all. … This would be the case even if, in their course on heat transfer, they have been taught the detailed derivation of a somewhat analogous equation: the equation of heat conduction in its most general case, including a possibly non-uniform and unsteady internal heat generation. … I am talking about the top 25% of the graduating mechanical engineers from private engineering colleges in SPPU and the University of Mumbai. Which means, after leaving aside a lot of other top people who go to IITs and other reputed colleges like BITS Pilani, COEP, VJTI, etc.
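For reference, the analogy that the heat-transfer course supplies can be written out side by side; with U(x,t) read as a distributed source term, the two equations share the same structure on the left hand-side (the symbols below follow the usual textbook conventions, not any particular syllabus):

```latex
% Transverse waves on a string, with a distributed forcing term U(x,t):
\frac{\partial^2 \psi(x,t)}{\partial x^2} + U(x,t)
  = \frac{1}{c^2}\,\frac{\partial^2 \psi(x,t)}{\partial t^2}

% 1D heat conduction with (possibly non-uniform, unsteady) internal heat
% generation \dot{q}(x,t), conductivity k, and thermal diffusivity \alpha:
\frac{\partial^2 T(x,t)}{\partial x^2} + \frac{\dot{q}(x,t)}{k}
  = \frac{1}{\alpha}\,\frac{\partial T(x,t)}{\partial t}
```

The analogy is only partial—the wave equation is second-order in time while the heat equation is first-order—which is exactly why I called the two equations “somewhat analogous” above.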

IMO, their professors are more responsible for the lack of developing such skills than are the students themselves. (I was talking of the top quarter of the students.)

Yet, I also think that these students (the top quarter) are at least “passable” as engineers, in some sense of the term, if not better. I mean to say, looking at their seminars (i.e. the independent but guided special studies, mostly on the student-selected topics, for which they have to produce a small report and make a 10–15 minutes’ presentation) and also looking at how they work during their final year projects, sure, they do seem to have picked up some definite competencies in mechanical engineering proper. In their projects, most of the times, these students may only be reproducing some already reported results, or trying out minor variations on existing machine designs, which is what is expected at the UG level in our university system anyway. But still, my point is, they often are seen taking some good efforts in actually fabricating machines on their own, and sometimes they even come up with some good, creative, or cost-effective ideas in their design- or fabrication-activities.

Once again, let me remind you: I was talking about only the top quarter or so of the total students in private colleges (and from mechanical engineering).

The bottom half is overall quite discouraging. The bottom quarter of the degree holders are mostly not even worth being given a post-X-standard, 3 years’ diploma certificate. They wouldn’t be able to write even a 5 page report on their own. They wouldn’t be able to even use the routine metrological instruments/gauges right. … Let’s leave them aside for now.

But the top quarter in the mechanical departments certainly seems to be doing relatively better, as compared to the those from the CS departments. … I mean to say: if these CS folks are unable to write on their own even just a linked-list program in C (using pointers and memory allocation on the heap), or if their final-year projects wouldn’t exceed (independently written) 100+ lines of code… Well, what then is left on this side for making comparisons anyway? … Contrast: At COEP, my 3rd year mechanical engineering students were asked to write a total of more than 100 lines of C code, as part of their routine course assignments, during a single semester-long course on FEM.

… Continuing with the mechanical engineering students: why, even in the decidedly average (or below average) colleges in Mumbai and Pune, some kids (admittedly, may be only about 10% or 15% of them) can be found taking some extra efforts to learn some extra skills from outside our pathetic university system. Learning CAD/CAM/CAE software by attending private training institutes has become a pretty wide-spread practice by now.

No, with these courses, they aren’t expected to become FEM/CFD experts, and they don’t. But at least they do learn to push buttons and put mouse-clicks in, say, ProE/SolidWorks or Ansys. They do learn to deal with conversions between different file formats. They do learn that meshes generated even in the best commercial software could sometimes be not of sufficiently high quality, or that importing mesh data into a different analysis program may render the mesh inconsistent and crash the analysis. Sometimes, they even come to master setting the various boundary condition options right—even if only in that particular version of that particular software. However, they wouldn’t be able to use a research level software like OpenFOAM on their own—and, frankly, it is not expected of them, not at their level, anyway.

They sometimes are also seen taking efforts on their own, in finding sponsorships for their BE projects (small-scale or big ones), sometimes even in good research institutions (like BARC). In fact, as far as the top quarter of the BE student projects (in the mechanical departments, in private engineering colleges) go, I often do get the definite sense that any lacunae coming up in these projects are not attributable so much to the students themselves as to the professors who guide these projects. The stories of a professor shooting down a good project idea proposed by a student simply because the professor himself wouldn’t have any clue of what’s going on, are neither unheard of nor entirely without merit.

So, yes, the overall trend even in the mechanical engineering stream is certainly dipping downwards, that’s for sure. Yet, the actual fall—its level—does not seem to be as bad as what is being reported about CS.

My two cents.


Today is India’s National Science Day. Greetings!


Will stay busy in moving and getting settled in the new job. … Don’t look for another post for another couple of weeks. … Take care, and bye for now.

[Finished doing minor editing touches on 28 Feb. 2017, 17:15 hrs.]

Whassup? “Making room…”

Honestly, for the past quite a few days, I’ve been summarily in that (Marathi) “sun-saan” mood. … Yes, in that mood, and for quite a few days…. Continuously at a stretch, in fact.

Sometime during the initial phase of this mood, somewhere at just the sub-surface level, I did idly think of trying my hand at writing blog posts, just so as to come out of it. Then, exactly at that same sub-surface level, with exactly that same shade of that idle nothing-ness in which I was engulfed, I also saw these thoughts pass me by. …

… It never happens. … I mean, at least with me, it never so happens that I can bring myself to writing something, anything, even just a blog post, when I am trapped in that mood of not wanting to do anything in particular. … I actually end up doing nothing in such times.

No, you can’t call it the writer’s block; it would be too narrow a description. The point is, when it happens, it is “everyone’s and everything’s block.” I mean, at such times, I can’t do even just plain arm-chair thinking. …

Thinking is an active verb, not passive. And, the gloom-some passivity is such that I don’t find myself even thinking about the gloom-some things, even if these go on registering with me. You know, things like the HDD crash, the continuing jobless-ness, etc.

… But, no, nothing happens when I am in that mood. N o t h i n g.

[No, at such times, I am not day-dreaming, even. Not even just hibernating. And, I certainly am not even in that meditative frame either. [I know meditation. I have done it, too.]]

So, all in all, I am being extraordinarily accurate when I say: nothing happens.

This time round, the mood lasted for a few days. Until this morning.

No, no one else had any role to play in my coming out of it. None. None whosoever. I myself did. Rather, I just passively observed myself coming out of it, and then, actually having come out of it. Right this morning. Just a few hours ago.

Yes, before that, I did watch some TV these past few days. But, no, not even retards (or American psychologists) could possibly level the accusation that watching TV lets one “come out” of such moods. Certainly not, when it is me. TV is incapable of affecting me too much, one way or the other. I am being honest here. That’s actually how my bodily constitution is made up like. TV does not affect me too much, for the better or for the worse. It always remains just plain boring, in a mild sort of a way. That’s all.

Anyway, that’s about all I can write about the recent experience, by me, and of that mood.

Now, what is it that I did to come out of… Wrong! Invalid line of thought!!

So, what is it that I did after I came out of it?

I did some search on something and browsed a few URLs. What in particular? I will jot it down right in this post, but before that, allow me a moment to explain the title of this post.

For those among the English-speaking peoples who are fortunate enough to play cricket, there is a peculiar circumstance that used to arise in the one-day, 50-overs cricket matches, about 20–30 years ago. The circumstance would occur once a match progressed to the late 30s in the overs.

… In terms of overs, the game from about the late teens to the late 30s could easily go replicating my mood above. But, somehow, either the bowlers or the batsmen or both would come out of it, sometimes even in a virtual snap of sorts, though it would happen mostly only gradually, once they arrived in the late 30s in the overs. May be, perhaps, as a result of the spin and the medium-pace bowlers being taken off and the fast bowlers being brought back in, for their second (and last) spell.

Then. Suddenly. Zzzoooooom. A good-length ball, left alone by the batsman (almost as a matter of habit); it safely lands in the gloves of the fumbling wicket-keeper, who should have been prepared but is still taken by a bit of a surprise. Zzzooooom. A second ball, now on the off-stump, swinging ever so slightly out the off-stump. Oh yes! There still is some swing left in this wicket! The batsman does something like waving his bat at it, fumbles, but is lucky enough to survive. Then, a very fast-paced short-length ball, in fact a bouncer! The batsman ducks. The wicket-keeper stretches all the way back, but manages to catch it. Finally, the batsman is found adjusting his gear, esp. his helmet. Yet another good-length delivery, somewhat slower in pace, slightly outside the off-stump, again with just so slight an outswing. Well-collected by the wicket-keeper. No changes in the fielding. And then, finally, comes The Ball. This time, it’s a furiously paced one, right on the leg—yorker! Within a split-second, stepping aside on the front-foot…

The cricket-knowing people [whether English-speaking or otherwise] could easily complete the last sentence above.

Among the commonly available options, the one I like to imagine here is this: Dancing down the wicket, leaving all the stumps in the open, the batsman makes room for himself, and hits at the ball hard, with his full strength. The ball connects with the meat of the bat, and the next instant, it is seen racing past the extra-cover boundary. No, you have not been able to catch how or where the ball went, really speaking. All that your visual field has in the meanwhile registered is the whitish figure of that fielder in the covers first rising up in a contorted fashion in the air, with both his hands wildly outstretched out and up, and just when that slim figure of that talented fielder begins—almost as if unbelievably by now—to go down, you instinctively strive to look past beyond him. And then, you see it. The ball has taken the first harsh bounce past the fielder, not caring a whit for the grass, and it is now racing… no, in fact it already has gone past the boundary line, for a four… [To me, such a four is more appealing than an artful but wily hook off a leg-side delivery for a sixer. The latter somehow appears almost meekish, as compared to this brawnishly—even if not very artfully—executed cover drive. That is, when such a cover drive is an answer to a yorker. Even if the yield to the side is only 4, not 6.]

Well, I left watching cricket roughly in the mid-1990s. When someone says “cricket” or “cricketers,” about the only match that I somewhat remember (after a 5–10 second gap or more), or the “last” complete match I probably saw, was the one in which both Rahul Dravid and Sourav Ganguly were either brand new, or at the most only 2–3 matches old. I haven’t watched much cricket after that. May be two or three matches (in full), and some more matches, some half-way through or so. None of these matches do I remember any more. And, I am completely certain that (except for some irritating times when I have only gone to a restaurant for a drink and some good food, and yet, cricket finds a way to pounce on me off a big glaring screen) I have not watched any cricket over the past 10+ years. Whether India was playing Pakistan or not. Whether Sachin Tendulkar was in form or not. … You get the idea.

But still, some visuals and phrases have remained etched in the memory. One of them is: “Making room for himself.”

If the going has not been so good, and yet, it has also not been very particularly bad either, if you have been in that greyish or slumberish sort of a mood of not wanting to do anything, or in an even worse mood: in that (Marathi) “sun-saan” mood, then: Once you find yourself having come out of it, the first thing you gotta do, IMO, is: Make room for yourself.

So, I did. Right this morning. (A few hours ago.)

I re-realized that one application of CFD is in the computer graphics and games programming field. (I had well-realized this point in the past, of course, but all the downloaded materials and sources had gone in that HDD crash.)

So, this morning, I spent some time browsing the ‘net for CFD simulations for computer graphics. … Interesting!

I need to add “Fluids simulation for computer graphics” as one of my active research interests, while updating my resume.

Should I conduct a course on Fluids Simulation for the CS folks, esp. for those working in the IT industry in Pune or Mumbai? Would someone be interested? Drop me a line. I am all ears. And, I am serious. (I will even simplify the maths while presenting the topics, and I will also supply some elementary codes. The students must, however, bring both their laptops and minds to the class.)

Let me know—before I possibly slip back once again into that yawnish/slumberish/worse—(Marathi) “sun-saan” (i.e. in English, roughly, tomb-silent) kind of a mood…

* * * * *   * * * * *   * * * * *

A Song I Like:

(Hindi) “jeenaa ye koi jeenaa to nahin”
Singer: Shailendra Singh
Music: Bappi Lahiri
Lyrics: Gulshan Bawra

[Yes, IMO, here, it’s Shailendra Singh’s version that easily outdoes Lata’s. [But then, honestly, isn’t the tune here better suited to a male voice?]… And, yes, IMO, here (as also on many other under-appreciated occasions), this “RD Clone” has managed to actually deliver the goods! When I was in college, the intellectuals back then had this tendency to smirk at even just a passing mention of Bappi Lahiri’s name. But, even back then, I would think that he didn’t deserve it—even if he indeed was, for much part, an RD (and many others) Clone. Yes, I would also air this opinion of mine, back then. Anyway, this is certainly one of his best songs; see if you like it, too.

And, if you do, notice two points: (i) Consider the tune and the music for the “antaraa” part, esp. near the end of the first half of the “antaraa.” That is, the point where the line “bahon mein jab ho baahein” (in the first “antaraa”), or “gar koi yaar naa ho” (in the second “antaraa”) ends. Now, stop here. You already know the “mukhaDaa”. So, think for a moment, how you would land at the repeating “mukhaDaa,” starting from this point in the “antaraa.” Think about it for a while. You can easily think of connecting these two points in some melodious way, perhaps even in many different ways—the tune is simple enough that even a layman could easily attempt doing that. Or, if you cannot imagine any ways to make the connections, then at least spend some time imagining how most of the well-regarded music directors (including RD) might have habitually connected the two. Then, consider how the transitioning actually occurs in this song. I bet that all the imagined transitionings would be far more direct than what happens to be the actual case here. … The most beautiful path isn’t always the shortest one. … Here, the music takes something of a little detour, choosing to make the transition at the more lingering and meandering “o mere saathi” phrase, rather than at any other possible connecting phrase. …. It’s (at least) this transitioning here—from the half point in the “antaraa” to the appropriate phrase in the “mukhaDaa”—that should have left no doubt even in an intellectual’s mind that, yes (even) Bappi Lahiri is, actually, a gifted composer. (ii) Another point. This is a bit silly, but since I am in a mood today to write at length without saying much anything, let me continue. Try humming the “mukhaDaa” of this song, starting with the “jeenaa ye koi” line, but without using any words. Attempt just humming. (Or, whistling.) You would find that you can easily do it—humming the entire “mukhaDaa” well. Now, try adding words to your mere hummings. 
Then, compare the way you sang the words of the “mukhaDaa,” with the (superlative) way in which Shailendra Singh has actually sung it. In particular, notice how easily, softly—in fact almost imperceptibly—he utters, but swiftly passes over, the word “ke,” while singing the phrase “pyaar ke binaa”. And, how you fumbled at this particular place, when you were asked to sing it aloud. …You mean to say, you had never tried it before? Go ahead, give it a try! It’s fun!]

[An editing touch is sure due; may or may not get effected. Done. Expect more posts of a similarly long-winding and pointless nature, at least in the near future.]

[E&OE]

Translation seen as a probabilistic process

These days, I have been writing what effectively has turned out to be a series of posts on the different broad views of the world implicitly assumed by physicists, over the course of the development of the subject. … The last time I had ended up de facto having a series at this blog—without planning for one—was the one on the diffusion equation: the Brownian movement, Fourier’s theory, and all that; see here [^], [^], [^] and here [^].

That topic has become once more relevant, because of a certain peculiar nature of Fourier’s theory. In the present series, we have seen how the phenomenon of gravity throws a spanner in the neat scheme we might first think of, for the first world. … We had to split our first world (that corresponding to the Newtonian mechanics) into two, only because of this odd man out of gravity. Further, as already noted earlier in this series, Fourier’s theory, too, throws a proverbial spanner into the scheme of our first world, because unlike the rest of it, Fourier’s is a theory that necessarily involves a physical instantaneous action-at-a-distance.

I wanted to write on that topic, i.e., on the nature of the Fourier theory and whether or how it necessitates further splitting up of our neat first world. In fact, I also have a few other posts coming up very soon on the related topics. At least one, or perhaps two, will be on the nature of space, the physical universe, and the relation between the two. For instance, issues like: what is the meaning of the concept of space, the question of whether the space is infinite or not, whether the [physical] universe is infinite or not, etc.

However, certain comments, once again, interfered.

I received a few comments written in what seemed like a hieroglyphic kind of language. For instance, the passage given below. Since I could not make head or tail of it, what I did was to open up the Google Translate page, copy-paste the comment material, and try to make sense of it. Here are the passages:

An excerpt from the original (which is, according to the Google Translate, in Japanese):

“Pythonの冬の数ヶ月の葉で作られたとみなされているレザーベースのトリム識別名ブランドやエンブレム次の裏地を持っている。嵐もうすぐキャンプrrnsideできるだけあなたの特別なバッグ “ウエット”パックでは、家に帰ってくるとすぐに溶液を除去し、空気乾燥しているのに役立ちます。ワイプは、それを維持する前に、すべての残骸を立ち上げました。それは屋外ながら、毎日この問題の世話をする場合に最適であることを起こる。そこにいくつかの犬の所有者があり、すべての立ち上げ、犬の糞袋を分解することができる使用して確立された場合、それらはすべての貢献との違いを作成しません。彼らは写真の犬を散歩されるべき電話番号犬は実際に、それはすべての所有者は、子犬うんち袋を使用されるかどうか、特に、地球に影響を与えることができます。”

The translation provided by the Google Translate:

“You have the lining of the emblem and trim next distinguished name brand of leather -based that is considered to be made ​​of leaves of the winter months of Python. In a special bag ” wet ” your pack as much as possible camping rrnside, helps to remove the solution immediately , to specify that air dry and come home soon storm . Before you can keep it , wipe , launched debris all . It happens to be while outdoors , it is best if you want to take care of this problem every day . If it is established using may have the owner of some dogs there, up to all , to decompose the droppings dog bag , they do not create a difference between all contributions . Phone number dog should be walking the dog in the picture actually , the owner of all , whether you are using the bag poo puppy , in particular , will be able to affect the earth it is they .”

Thank you, Google! … What you did for me was not just to provide a translation but also the impetus for me to pursue some ideas of my own… Your offering has inspired me to pursue a few things I wouldn’t have normally thought of trying, for a blog post. …

… Here, I immediately remembered a certain blog-discussion which I had some time ago at the “Applying philosophy…” blog. That discussion was about, among other things, the philosophic issue of the mind-body dichotomy. … In particular, I remembered that I had supplied a translation of a wonderful piece of poetry by Gulzar. (Since it actually is a song from a Hindi film, technically, it should be called a piece of lyrics. Technically.) The song in question is: “ham ne dekhi hai un aakhon se…”

Here are the lyrics of the song, copy-pasted from the “HindiLyricsPratik” blog, here [^]. Those who know Hindi would know why I call it a poem:

“हमने देखी है उन आँखों की महकती ख़ुशबू
हाथ से छू के इसे रिश्तों का इल्ज़ाम न दो
सिर्फ़ एहसास है ये रूह से महसूस करो
प्यार को प्यार ही रहने दो कोई नाम न दो

प्यार कोई बोल नहीं, प्यार आवाज़ नहीं
एक ख़ामोशी है, सुनती है, कहा करती है
न ये बुझती है, न रुकती है, न ठहरी है कहीं
नूर की बूँद है सदियों से बहा करती है
सिर्फ एहसास…

मुस्कुराहट सी खिली रहती है आँखों में कहीं
और पलकों पे उजाले से झुके रहते हैं
होंठ कुछ कहते नहीं, काँपते होंठों पे मगर
कितने ख़ामोश से अफ़साने रुके रहते हैं
सिर्फ़ एहसास…”

Here is the translation that I had offered during that blog-discussion of ours. In my translation, I had tried, as usual, to be as close to the original Hindi as possible. Here it is. [However, please note, I now have modified my original translation quite a bit further here—I got access to a better transliteration of the lyrics of songs, and also thought of somewhat better words with which to translate them.]

[For those who don’t know Hindi, realize that the speaker here is a lady—the heroine of the film—talking through this song to the hero]:

“I have seen that fragrance emanating from those eyes [of yours],
[And now] with this touch of a hand, don’t lay upon this all the accusation of a relationship,
This is [meant to be] just an awareness, experience it with the soul,
Let love remain love, don’t give it any name.

Love is not some talk, love is not sound [or something said aloud],
It is a silence which listens, and [also] is in the habit of saying [something],
It does not extinguish, nor does it stop, nor has it stayed put in some [one] place,
It’s a drop of light, it has been flowing since ages.
This is [meant to be] just an awareness….

Something like a laughter keeps blossoming somewhere in eyes,
And, at the eyelids, some kind of effulgences stay [back as if] bowed down,
The lips don’t say anything, but still, on those quivering lips,
How many silent stories are kept holding back.
This is [meant to be] just an awareness….”

BTW, if you are interested in another take at translation, see R. S. Khanna’s blog here [^]. In this revision of my translation, I found both Khanna’s transliteration and translation quite helpful (e.g., “quivering” now replaces my original “shivering,” etc.) [My original attempt can still be found intact at the above-mentioned link to the “Applying philosophy…” blog.]

I have a PhD in Mechanical Engineering, and also a Diploma in Advanced Computing, apart from quite a bit of experience (some 10+ years) in software engineering. But, anyway, I am jumping ahead of myself here. First things first. So, here is the Google translation of the original Hindi:

“We have seen those eyes sweet scented fragrance
Let ‘s not blame it touches the hand of relationships
Only to realize Feel the spirit
Give a name to love Let there be love

No Love Lyrics , Love Bleep
A silence , listens , said that
This is not quenched , neither stops nor is staying somewhere
The blob is poured over the centuries Noor
Just realized …

Nowhere is blossoming like a smile in the eyes
Light on the tip of the eyelids
Lips do not say anything , but on the lips quavering
How many stayed silent and live Afsane
Just realized …”

Thank you, Google!

* * * * *   * * * * *   * * * * *

But what was that bit about my educational qualifications and experience that appeared seemingly out of nowhere right in the middle, you ask? … Good question. And, I also have yet to come to the “probabilistic” part of the title.

…Well, the thing is, I do understand something about the probabilistic but mechanical way in which the Google Translate (GT) works: its mechanism. It is not via the usual route of a semantics-based translation, e.g. some NLP (natural language processing) algorithms or the like! Instead, what it does is the following. … Suppose you want to translate a passage from Hindi to English. All that GT does is to first…. Oh well! Wait!! … Geoffrey Pullum does a far better job of explaining it than I could ever hope to do, and so, let me direct you to him right away [^] (HT to Abi [^]).
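Since Pullum’s piece explains the real thing, here is only a deliberately toy sketch of the general phrase-table idea behind pre-neural statistical translation: for each source phrase, independently, pick the target phrase that co-occurred with it most often in a large parallel corpus. All phrases and probabilities below are invented for illustration; the actual system’s models are vastly more elaborate.

```python
# Toy sketch of phrase-table-based statistical translation.
# No grammar, no semantics: just per-phrase co-occurrence statistics.
# All entries and probabilities here are made up for illustration.

# source phrase -> list of (candidate translation, estimated probability)
PHRASE_TABLE = {
    "ham ne": [("I have", 0.7), ("we have", 0.3)],
    "dekhi hai": [("seen", 0.8), ("looked at", 0.2)],
}

def translate(phrases):
    """Pick the most probable candidate for each source phrase,
    independently of its neighbors."""
    out = []
    for p in phrases:
        # Unknown phrases simply pass through untranslated.
        candidates = PHRASE_TABLE.get(p, [(p, 1.0)])
        best, _prob = max(candidates, key=lambda c: c[1])
        out.append(best)
    return " ".join(out)

print(translate(["ham ne", "dekhi hai"]))
```

Because each phrase is translated with no regard for its neighbors, the output can read exactly as disjointedly as the Google renderings quoted in this post.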

* * * * *   * * * * *   * * * * *

But this post is not over… Not yet…

Here is a piece of Hindi poetry that has recently appeared on the “Applying philosophy” blog, here [^]:

“न थी इजाज़त, जो देखने की,
कुछ ऐसे सपने सजाये मैंने,
जो चाहता हूँ, भुला दूं उनको,
बिना इजाज़त सता रहे हैं

न थी इजाज़त, जो बोलने की,
वह बात दिल में बसाई मैंने,
जो चाहता हूँ, निकालूँ उस्को,
बिना इजाज़त बसी हुई है

न थी इजाज़त, जो चाहने की,
वह चाह दिल में जगाई मैंने,
जो चाहता हूँ, न चाहूं उस्को,
बिना इजाज़त जला रही है”

Great poetry, though I haven’t [yet] tried translating it into English. But that’s beside the point. Update: Here is my translation:

“Permission:

there was no permission, to see [some] such [things],
some such dreams I embellished,
what I [now] wish, [is that] I should forget them,
without permission they are annoying [me].

there was no permission, to speak of [some] such [a thing],
[it is] that thing [which] I [let be] inhabited in [my] heart,
what I [now] wish, [is that] I should remove it,
without permission, [it has] stayed on.

there was no permission, to desire [some] such [a thing],
[it is] such a desire [which] I awakened in my heart,
what I [now] wish, [is that] I should not seek it,
without permission, [it] is burning [me].”

But the real point, here, of course, is: the Google translation:

“Did not permit the viewing,
Decorated I had some dreams,
You want me to forget them,
Leave without picking

Was not allowed, speaking of which,
I have settled in the heart of the matter,
Who want to remove Usco,
Without permission is inhabited

Was not allowed, which is wanting;
I want to wake in the heart,
Who want Usco not want,
Without permission is lighting”

…Oh, lines… Oh, that line at the end of the first stanza! … And, oh, that “Usco”!! … And, oh, I almost forgot this: Hey, thanks, really, Google!

And, BTW, that comment in Japanese—the one which provided the spark to write this post in the first place—has now been taken off the spam filter, and been approved already… Thank you, Internet! Thank you, Al Gore!! And, thank you, you all!!!

* * * * *  * * * * *   * * * * *

A Song I Like:

(Hindi) “hum ne dekhi hai…” [And what else did you think it would be?]
Singer: Lata Mangeshkar
Lyrics: Gulzaar
Music: Hemant Kumar

[E&OE]

An idea for the final year student projects in CS departments

In this post, let me jot down an idea that is suitable for a final-year project by CS/IT undergraduates (or, in a suitably expanded form, also by CS/IT post-graduates) in engineering schools, or by students from other similar courses like the MCA, MSc (CS), etc. The idea is (to me) neat and useful, and not at all difficult to implement. Do consider it in place of those usual topics like the railway reservation/airline booking system, the payroll system, etc.

The idea is to create an online searchable database of the history of science topics, that gives out customized timelines as query outputs or reports over the ‘net.

The idea occurred to me while doing literature search for my book. (BTW, idiots in those humanities departments degrade the word “research,” by using it for such things as plain literature search. The management and marketing folks do worse: they call activities like the Gallup polls, “research.”)

So long as a time-line involves only tens of items, it is easy to manage its contents manually. As an example of a small time-line, see my previous blog post. (In fact, I got the items in that time-line in an almost fully ready-made form from Manjit Kumar’s book: “Quantum.”)

However, while preparing for the pre-requisites part of my planned QM book, I found that no suitable time-line existed in a ready-made form for those classical physics topics.

One reason is that the pre-requisites of QM are so very many and so very diverse. For a neat visualization of physics as it existed some 70 years ago, see the “1939 map of physics”[^].

Another reason is that the sources of the detailed historical accounts are, again, so very numerous, and not all of them are equally accurate, authentic, or detailed.

While accuracy and authenticity are important, as far as the design work for this particular project goes, I think that the issue of the non-uniformity of the available details is of much greater importance.  The original sources of information themselves are not equally detailed.  For instance, see the Wiki timeline on the luminiferous aether [^], and ask yourself: would any one of the Encyclopedia Britannica’s timelines concerning the aether ever be as detailed as this one?

Of course, the administrators of the database may not rely only on the existing timelines. They may consult books to feed the data to create more detailed timelines. Once again, issues like the unevenness of details and the matter of ascribing proper dates at a detailed level, come up.

For instance, consider Fourier analysis. Most books mention 1822 as the date of its beginning. It’s wrong.

We don’t have any specific record to answer the question of precisely when Fourier first thought of this idea of spectral decomposition. However, we do have some evidence which indicates that he began working on the problem of heat propagation in solid bodies as early as 1804, and that he had presented a paper on this topic to the Paris Institute as early as 21st December 1807. Though Fourier wanted to publish the paper in print, he didn’t (or perhaps couldn’t) do so, because it ran into some criticism. (Offhand, I would suppose, from Poisson.) Fourier had only the sines, not also the cosines, in that first 1807 paper of his. Though important, relatively speaking, it’s a minor omission. What is more important is the fact that he had already got the big idea right, viz. that any arbitrary function could be expanded as a (possibly infinite) series. Now the funny thing is this. While Fourier didn’t see the necessity of including the cosines, neither did his critics. They were rather against the completeness aspect of his idea, viz., that any arbitrary function could be expanded in this way, even the discontinuous functions. On that count, he was on the right track. However, it was hard to convince the established critics. One way or the other, the paper didn’t get published that same year. He sent a revised draft to a competition of the French Academy in 1811, and won it. The critics, however, still were not satisfied. Fourier ultimately wrote and published the further revised account some 15 years later, in 1822.

Most books will tell you that the Fourier analysis began in 1822. However, unless you take into account the 1804–1807 genesis of this branch of analysis, the overall historical progression does not appear as smooth as it should be.

The question then becomes: in a time-line, which date would you include for reporting? 1804? 1807? 1811? 1822? All four?

Some cases are even more intriguing, and possibly of more than trivial interest to a researcher. For instance, consider Schrodinger’s formulation of quantum mechanics. His very first paper (written in January 1926) is concerned with the famous non-relativistic wave equation, now bearing his name. Yet, even before that very first paper was written, in fact barely a month before, in December 1925, Schrodinger had initially worked on a relativistic wave equation. It didn’t work fully satisfactorily, and so Schrodinger worked out the non-relativistic version before sending it out for publication. But wouldn’t the fact that Schrodinger had those relativistic ideas right in the last month of 1925 be important while tracing the development of the relativistic QM? After all, physicists in that era were unusually open in their communications and collaborations. For example, Pauli had begun working on his paper merely on the basis of a draft of Heisenberg’s paper. (I mean the manuscript draft as well as the detailed physics notes Heisenberg exchanged with him—not even the printer’s proofs, let alone what today we would call the preprint.) Similarly, Pauli’s (1924?) idea regarding a two-valued quantum number was known to Goudsmit and Uhlenbeck when they proposed the electronic spin (October, 1925).

Should such details be entered into the database? Should there be an attribute specifying the granularity of information that is being carried by a database entry? Should the queries allow various levels of granularities to be included in the search results?

And, going back to Fourier’s case, suppose that your database initially includes an entry for only one of the three dates (1804–1807, 1811, and 1822). Someone points out the other dates to you (i.e. to the system administrators populating the database) only later on. How would you allow for the reconciliation of data items like these, in a mechanical sort of way? What software mechanism would you provide?

Would you handle it via some attributes of the individual entries  in the database? Would these attributes be finalized beforehand, say a pre-determined set like: “Initial Idea/Presentation/Publication/Revision” etc? Or would it be possible and desirable to keep the set of such attributes extensible—say, on lines similar to how you apply attributes or tags to your blog posts, complete with an intellisense feature that shows the already applied attributes? Can the overall structure thus be kept a bit fluid? Is a relational database the optimal data structure, given such objectives? How about the queries on it? Should they be sensitive only to the attributes, or to the “full text”?

I think these are just some of the questions that the students will themselves have to handle. Here, all that I am now going to do is to give some idea of what the usage scenarios would look like. In my description here, I will use terms like “database,” etc., but only in a broad sense—it doesn’t necessarily mean a traditional data structure like the relational database.

First, there will be a set of users who will populate the database using appropriate sources. They will have the authority to edit the database, and may be called system administrators. The database may sit on a remote secure Web server. The sys admins should be able to use, say, secure http-based (i.e. https-based) forms to edit the database, i.e. to add/modify/delete the entries.

Then there will be the end-users. The end-users should be able to access the database using plain http. Typically, they will use a user-friendly form to send their queries to the server over plain http, and get answers as reports consisting of Web pages. [More on this, in my next post.]

The database will, at the minimum, have information on the following attributes (if this were an RDBMS, these would be its fields or columns): date, scientist, work, remarks, source reference (for the database entry—not the citation record itself), and the subject areas to which the entry belongs. Many of these attributes are tricky.
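Just to fix ideas, here is a minimal sketch (deliberately a toy, not meant to pre-empt the student’s own design) of what such an entry, and a simple query over a collection of entries, might look like. All field names and sample data here are my own assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    date_text: str           # stored exactly as the original source states it
    scientist: str
    work: str
    remarks: str = ""
    source_ref: str = ""     # where this database entry itself came from
    subjects: list = field(default_factory=list)

# A tiny in-memory stand-in for the real database.
DB = [
    Entry("21 December 1807", "Joseph Fourier",
          "Presents first paper on heat propagation (sine series) to the Paris Institute",
          subjects=["analysis", "heat"]),
    Entry("1822", "Joseph Fourier",
          "Publishes the revised account of his theory of heat",
          subjects=["analysis", "heat"]),
]

def query(db, subject=None, scientist=None):
    """Return all entries matching the given (optional) filters."""
    return [e for e in db
            if (subject is None or subject in e.subjects)
            and (scientist is None or scientist == e.scientist)]

for e in query(DB, subject="analysis"):
    print(e.date_text, "-", e.work)
```

A real implementation would of course put this behind an RDBMS (or some other store) and a web form; the sketch only shows the shape of a record and of a subject- or scientist-based query.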

There are multiple meanings of “date” in a context like this. Some historical sources give dates only in years, others also in months, still others right up to the day and the time. For some advances, the dates are known only vaguely. For example: “He began working on it during the early 1820s, and continued up to 1826-end,” or “fl. 1810s,” or “circa 3rd quarter of 1925.” The calendar systems may have changed. Then, there are those BC dates. (Not all database software handles all the dates very smoothly anyway.) Sometimes, the date on which a scientist sent off his manuscript may be more important than the date on which it was published, but not always. Sometimes, the published accounts may be somewhat misleading: Newton did get the basic ideas of calculus right during that plague-related vacation, around 1666. Yet, it also is a fact that he still was working on many of the important calculus ideas in the few years before the publication in 1687 of the Principia. Newton didn’t have everything ready right in 1666, contrary to many stories you might have heard. (There has been a tendency to ascribe too much of a genius to the years of early youth; witness a dim-wit like Hamming—a dim-wit, that is, when it comes to the age at which scientists created their best original ideas.)

The database must store the precise way the date was stated in the original sources. Yet, when a user sends in a query, regardless of the specific original recording, it should still be possible to sort all the entries, with a certain smartness built into the system. The student will have to resolve the issue of how to put all the entries in (some sort of an) order—vague entries like “in the decade prior to World War I” together with the more precise entries like “in the afternoon, after lunch, on October 7, 1900.” (Does anyone know what’s special about that time? I would be delighted to hear it from you!)

As far as this problem of sorting of entries of varying degrees of precision of time-specification is concerned, I do have an idea how it could be done. However, I would rather let the students try to work on it. [OK, I promise to give some hint in my next post on the topic.]

The “scientist” field may undergo changes. A plain guy like Thomson is the same as (the first Baron/Lord) Kelvin (after whom the temperature scale is named). “Snellius” is the same guy as Snell (of the laws of optics). A real gauntlet was thrown to us in the 21st century by a single Swiss family: There is a Daniel who, being one of the family, carries the same last name as Jacob, who is the same as Jakob, who is the same as Jacques, who is the same as James. It doesn’t end there. Johann is the same as John, who is the same as Jean, but different from James. And then, there are Johann-II and also Johann-III. Worse: Sometimes they had a feud in the open. The famous brachistochrone problem was posed because one brother tried to put the other brother down. (The other brother succeeded, together with Leibniz and his credits-wise adversary, the “lion’s paws”-bearing Newton.) There has been a son in that family who thought his father had taken credit really due to him. The point? Your user should not get confused, and should be given all the relevant information as to which Bernoulli it was.

Some users may be interested in getting a time-line of all the works of just “Newton.” Others may wish to access similar information for “Isaac Newton, Jr.” … Yes, the falling-apple guy was named after his father, though few people know it, and even fewer (if any) ever add the “Jr.” suffix to his name. A more likely search string, especially if originating from a tiny island lying above France and to the west of, say, Finland, might be: “Sir Isaac Newton.”

The “work” field needs to carry both succinct information (so that the idea of a time-line or an Excel-spreadsheet-like report makes sense) and a place for additional informative details. For instance, less known or less advertised factoids like the fact that Leibniz didn’t originate the idea of the “vis viva,” but was inspired by its descriptions in Huygens’ writings. The additional details should perhaps be supplied only in a more verbose report, but the point is that the database itself should be so well designed that there is a systematic and unambiguous way to tell such things if need be.

The subjects attribute is important. It identifies all the subject areas in which a given advance of science falls. It will enable the end-user to run subject-specific queries. For example, a query like: “Give me a time-line of the kinetic theory,” which is slightly different from “Give me a time-line of statistical mechanics,” which is slightly different from “Give me a time-line of thermodynamics.” Contrary to the current pedagogy followed in the engineering schools, the development of statistical mechanics was almost concurrent with that of thermodynamics. … Incidentally, thermodynamics didn’t exactly begin with Joule. It had sprung into action right after Watt’s engine, with Lazare Carnot, the father of Sadi Carnot, having a lot of thoughts related to the energetics program in physics. In case you have ever wondered how come the Second Law of Thermo came on the scene before the First one did, such a bit might be of interest to you.

Thus, the subject attributes may be either coarse-grained or fine-grained, and they may have multiple abstract hierarchies among them. The system administrators may invite subject experts (e.g. professors) to apply these attributes. This task, though complex and time-consuming, should not be too difficult, in a way: the attributes to use may come from certain standardized classification schemes such as the PACS classification [^], the AMS classifications [^], etc. What is more important for this project on the CS side is this requirement: the mechanism for applying subject attributes should be neat enough that the application people—the subject experts—find it very easy to have all the relevant attributes suggested to them in a handy way, when they apply these attributes to the individual entries.
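As a toy illustration of that last requirement (again, only a sketch under my own assumptions; the dotted codes below are made-up stand-ins, not actual PACS or AMS codes), the “handy suggestion” could be as simple as a prefix match over the standardized code list, with recently applied codes floated to the top:

```python
# Hypothetical hierarchical subject codes (made-up stand-ins, not real
# PACS/AMS codes); the dots encode the coarse-to-fine hierarchy.
ALL_CODES = ["phys.mech.classical", "phys.mech.fluid",
             "phys.thermo", "phys.stat-mech", "math.analysis.fourier"]

def suggest(prefix, recently_applied=()):
    """Intellisense-style helper: codes matching the typed prefix,
    with the recently applied codes listed first."""
    matches = [c for c in ALL_CODES if c.startswith(prefix)]
    recent = [c for c in matches if c in recently_applied]
    rest = [c for c in matches if c not in recently_applied]
    return recent + rest

print(suggest("phys.mech", recently_applied=["phys.mech.fluid"]))
```

The floating-to-top behavior is what gives the subject expert the “already applied attributes” hint mentioned earlier in connection with blog-style tagging.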

As indicated above, one really tricky point is this: Some discoveries/inventions span across fine-grained descriptions. Another tricky point is the following: Some discoveries span across many fine-grained descriptions only in the context of a later discovery—not otherwise. Therefore, they should not get included in reports-generation unless the later discovery also is included in the search results. For instance, consider the topic of Fermat’s principle. This is a topic from classical (geometric) optics. It would not have been a very important candidate for inclusion in mechanics-related searches until Schrodinger made it impossible to avoid. … Actually, the idea of looking at Fermat’s principle as a mechanical principle goes centuries back; off-hand, I suppose it was Huygens (again), right in the 17th century, who first linked the mechanics of transfer of momentum (he would call it “the motion”) and Fermat’s principle. Then, in the 19th century, there was Hamilton who got inspired by the same conceptual link. But the point here is, before Schrodinger, not identifying the topic as a mechanical one could, perhaps, have been an excusable omission. After Schrodinger, it is entirely inexcusable.

Each entry should clearly and explicitly identify the source(s) of the information on which it is based. For instance, MacTutor’s, Encl. Brit., a certain book, a certain paper, etc. There could be multiple original sources objectively required for the same entry. For instance, when a subject expert rightly ascribes to Euler the impetus provided for the development of the calculus of variations (CoV), he would need to take it away from Maupertuis, and for the same reason, he would need original sources authored by both.

BTW, this entry (on the impetus to CoV) would be different from another entry that is concerned with the first correct formalization of the CoV, which would involve a 19-year-old Lagrange. And both of these entries would be different from the one for the first-ever CoV problem correctly posed and solved with a correct approach, which would be a credit due to the account of Newton (offhand speaking, then a 50+ year-old), more than a century before Lagrange’s work.

Thus, the individual database entries should carry links to other, related, entries. Such things may come in the Remarks section, or there could be a special “See Also” section.

On second thoughts, there could be an extra attribute (to be carried by a database entry) for specifying the set of pre-requisite entries for it.

Further, on second thoughts, there could also be another, separate attribute that specifies a linkage to the other entries immediately preceding a given entry and constituting its “prior development.” (The prior development is not necessarily the same as a pre-requisite. For instance, the caloric fluid theory of heat is a prior development to the kinetic theory, but it’s not a pre-requisite.)

All the report fields may carry hyperlinks to the resources on the ‘net.

It might be desirable to include other information like the biographical details for scientists: dates (!) of birth and death, places thereof, nationalities (e.g. Prussia, West Germany, United Germany, etc!), educational institutions attended and degrees, if any, received, the career-wise affiliating institutes for each database entry.  (Consider: would the work as a patent clerk qualify as an affiliating institute for a discovery in physics?) Details like these are not so important, and may be taken up during the version 2.0 of the project.

Finally, I would like to jot down a couple of resources. Information from multiple sources like these needs to come together in a single, easily accessible database:

The MacTutor Chronology [^]
The Wiki List of Timelines [^] (We need only the S&T related timelines. For version 1.0, focus only on physics and mathematics. For version 2.0, add: inventions/patents, engineering and technology. For version 3.0, add: biological sciences. For history of the humanities kind—who killed who to grab which power when and had how many wives—my advice to you would be: don’t bother.)

Now, a few things of business, if you are interested in picking this up as a project.

Don’t get in touch with me simply to ask if you can use this idea. Of course you can, so long as you acknowledge the intellectual credit due to me, and so long as you don’t release this idea into the Open Source/CopyLeft sort of a domain, and so long as you are not going to make any money out of it—I reserve all the rights to this idea in the first place.

However, do drop me a line before you begin any work on this idea. To repeat, I do reserve all the rights to this idea in the first place.

Also, if you are a student, don’t get in touch with me to ask if I could become a guide to you. I won’t. What I have done here is to give you the basic idea in general (but in enough detail). The particular fleshing out and the particular implementation is basically between you and your official college guide. This much should have been obvious, but the politics junkies and/or IB and/or CIA etc. have used a trick in the past, simply to harass me. For instance, when I was applying for a job as a professor at a college in Pune, someone from Gadchiroli or Ichalkaranji, claiming to be an undergraduate student, would call me, ask me if I could go to their place to be a guide—but could not give the name of his college principal.

However, once your official guide himself approaches me—which he must, if you are going to work on this project idea (see the above point again)—at that point, if he himself requests me to be an official co-guide, I wouldn’t necessarily mind—I will think about it, and let him know regarding my decision in this respect.

Finally, if you are a student, don’t get in touch with me to ask me if I could implement this project for some money from you. I won’t. It is true that as of today I am jobless, and that, even otherwise, I am always looking for ways to make money. But, I have not, don’t, would not ever sell finished (or even “half”-finished) projects to students for a charge.

I would rather write blogs and go further and harass the Maharashtra CM (any one occupying that post), possibly to (his or her) death, (correctly) blaming him (or her or they) for my joblessness, than start selling student projects for a charge. The first should come easier to me. That’s why.

Now, see if you wish to pick it up as a project. It will be useful. To a great degree. (And, perhaps, for this same reason, it won’t be suitable as a JPBTI project. Though, I couldn’t care less if IIT students pick it up as their project topic.)

As I said, there are a few more things about this project that I could say. I will write another blog post about it.

* * * * *   * * * * *   * * * * *

No “A Song I Like” section, once again. I still go jobless. Keep that in mind—if your mind is not so small as not to have the capacity to carry this factoid.

Prithviraj (the BITS Pilani- and Berkeley-trained CM)? Jairam (the IIT Bombay- and MIT-trained JPBTI)? Others like you? Ashamed of the fact that I still don’t have a job? Or not so? Or, perhaps, not at all so? What say? Not even on a Gandhi Jayanti day? Have borrowed that convenient “maun vrat” from a certain someone from Ralegan Siddhi? Really?

[This is revision 1, published on October 2, 2012, 12:35 PM IST. Initial draft posted on October 1, 2012, 4:10 PM, IST. I think I am done with the additions/clarifications for this post, and probably won’t come back to further update this post (unless a grammatical error is too glaring). The promised related post should appear in a few days’ time.]
[E&OE]

Magic, NP-Complete Problems, and Mathematical Modeling

Let’s begin by taking an illustrative example. You might have seen a magician perform a trick in which he mysteriously makes a coin disappear from one hand and makes it appear on the other. The trick goes something like this:

The magician sits across the table from you. He first asks the audience to observe complete silence. He then performs some mysterious rituals, and then lays down both his hands on the table, with both the palms open and facing upwards. He asks you to take out a coin from your pocket, have a close look at it so that you could identify it later on. He asks you to keep that coin on the open palm of one of his hands, say the right hand. He then slowly closes his eyes, and begins to chant a mantra. Soon his body begins shaking, and even as this is continuing, he turns his palms downwards, clenching his fists in the process. Continuing with chanting of mantras, he now slowly opens his eyes and begins to concentrate hard, alternately on either of his hands, as if commanding something invisible to move something. Even as you are watching intently, he slowly begins to fold his palms even as they are still kept touching the table, and thus begins to form clenched fists. The chanting of the mantra now proceeds even more rapidly, at an almost feverish pace. The look on his face is deep concentration personified. This goes on for a while. Slowly, the pace of the mantra intonation slackens, and soon afterward, he ceases the chanting of the mantra altogether. He then relaxes. After a while, slowly, he opens his palms and holds them open in front of you. Your coin is now visible on the left hand side—not on the right hand side!

Magic!!

In disbelief, to prove him wrong, you take his palms in your hands and make sure there is no such thing as a skin-colored glove or a mask. You check the sleeves of his shirt. You look under the table cloth. Why, you also go and look below the table. … All in vain. Finally, you give up.

Your friends suggest some more possibilities. You are too tired to check them out. So, your friends now take charge. They too go through some of your own checks and then some more. No use. Eventually, you all give up. You are not willing to admit that the coin might have traveled using some supernatural means of transportation—or via some quantum-mechanical principles that can never be understood, as per your favorite physics professor. You are not willing to admit either of these. Or any such similar thing. And yet, you have no solution either—no idea of what’s going on here.

. . . . .

Undoubtedly, you would be aware of the Millennium Problems posed by the Clay Mathematics Institute. A lot has been written on these problems since they were announced at the turn of the millennium… Why, recently, the Institute even offered the very first Millennium Prize, for having successfully solved the Poincare conjecture, to the Russian mathematician Grigori Perelman.[^]

What concerns us here is the as-yet-unsolved problem of P-vs-NP.

This is a problem from the computational complexity theory. A good introduction to this theoretical area might be had by visiting the Wikipedia page on “Computational Complexity Theory” [^], and the one on the “Traveling Salesman Problem” [^].

The introductory material for the description of this problem, as given on the Web page of the Clay Institute itself, includes the following:

In fact, one of the outstanding problems in computer science is determining whether questions exist whose answer can be quickly checked, but which require an impossibly long time to solve by any direct procedure.

Most writers trying to explain this brand of mathematics have used a similar language, emphasizing the inherent asymmetry between solving a problem and checking or verifying the solution.

For example, consider the domain of encryption. Given an encrypted text, it is hard to find what the original text might be. You might try different combinations of letters and symbols as a password, and use each in a trial decryption, but without success. The “brute force” method of trying out every possible password may take forever, simply because there are so many combinations possible. However, if you are given the password—i.e. the solution—then it is very easy to verify that it works for decryption. That is to say, it is very easy to verify that the given solution indeed is a solution. Similarly for the Traveling Salesman Problem: it is easy to verify that a given route has a cost that falls below a given specific value, and yet, it’s difficult to find the routes that would fulfill the given cost constraints.

The popular mathematics writers, therefore, often emphasize this asymmetry in their writing.

While this way of characterizing this kind of mathematics is as good as it gets, it can also give rise to some misconceptions.

For example, consider the magic trick given above. The problem is to determine and explain whether the coin at all goes from one hand to the other, and if yes, how. This problem is extremely difficult to solve for any lay person. … If it weren’t, there would be no element of mystery left in it; there wouldn’t be any magic left in it! Yet, if a solution—an explanation of how the magic trick actually works—is given to you, it would be very easy to verify that that’s what is going on. Let me proceed to tell you just that.

. . . . .

The coin does physically go from the palm on the right to that on the left—there are no optical illusions involved here. However, the coin does not so physically shift as a result of all that mental concentration or the mantra-chanting.

As an aside: This does not mean that consciousness cannot move matter around. We all routinely do so for the matter in our brain when we are doing anything like thinking, lifting an arm, or even dreaming. … Hat tip to Dr. Harry Binswanger [^], for his lectures on the metaphysics of consciousness. [I knew it already, but he puts it far more clearly than I could have.]

Coming back to the magic. Certainly, the coin does not move because of the hard mental concentration of the magician. In fact, it does not even move after he begins to concentrate! The coin has already moved from one palm to another way before that! Read the whole sequence of the trick once again, in particular, this part appearing right at the beginning of the description: “… He then slowly closes his eyes, and begins to chant a mantra. Soon his body begins shaking, and even as this is continuing, he turns his palms downwards, clenching his fists in the process. …” What the magician actually does is to flick the right hand palm a bit too quickly, and a moment too early as compared to the left palm, in the process of his turning the palms downward. In that flick, he is actually tossing the coin—right in front of your eyes—from the right palm to the left. Since the action is so quick—and done so smoothly—you don’t notice it. … The chanting of the mantra and the closing of his own eyes etc. are all diversions designed to attract your attention away from that flick. Another related matter: any object that streaks past your eyes at a close distance is not registered too well. Try that game of rolling a cricket ball over your arm, and giving it a jerk with your elbow. Instead of you catching the ball, have a friend stand nearby, and ask him to catch the ball. It is tough. Not because the ball is moving fast, but because not just the direction of the flick but also its timing is difficult to anticipate. And this happens even if your friend knows in advance that you are going to flick the ball. … What happens when you don’t even know that flicking the coin right through the air above is what the magician is going to do!! It’s next to impossible to tell that the coin has been flicked through the air—if you don’t already know the solution.

. . . . .

So the magic trick seems to fit the description for the NP category of problems. I mean, the parallels seem obvious at first glance. For this magic trick, as for a typical NP-complete/hard problem, the solution is difficult to get at—there are just too many other possibilities that you would have to exhaust one by one. But the solution, if it is already known, is very easy to verify.

Does that mean that this magic trick is in NP?

Nope. Not at all. Not if you are careful in your choice of words.

The problem of P-vs-NP is a very specific problem of mathematics that has been defined in a very careful manner. Just because some characteristics of this problem also apply to magic does not mean that magic is in NP. Indeed, taken by itself, this statement does not even begin to make sense.

To be able to apply the theory of NP problems to magic, one will have to first create an abstract model capturing the essentials of this particular trick. One will have to show that it can be modeled in terms of, e.g., the graph theory, as nodes connected via edges. Each of the different possibilities existing for putting the coin in the other palm would have to be shown as a possible route via certain abstract nodes and edges. Further, one will have to demonstrate that, logically and quantitatively, this graph model of the magic trick satisfies all the essential characteristics displayed by the NP category of problems. It is only then that anyone would be able to do anything mathematically useful with this idea. Otherwise, it is just a surface analog and nothing more.

An aside: Unlike many, perhaps most, mathematicians (and a large number of physicists too), I believe that analogs and analogies by themselves are not bad—they do not represent inexact or sloppy thinking just because they are analogies. The trouble is, analogies often are poor in that they only scratch the surface and do not continue showing analogous relationships at even deeper, more fundamental, levels. Without analogies, you couldn’t gain insight about, say, electricity, starting from, say, ideal fluid flow. Analogies are useful, nay, even important. Provided that the analog continues to hold even in some fundamental way.

. . . . .

We have seen how you have to build an abstract model of a physical situation before it can be mathematized. There is a symmetrical counterpart to this procedure too.

Often times you find mathematics being admired for its “direct” application in physics. For example, chaos theory in fluid turbulence, or even more popularly, hypergeometry in the relativity theory.

I believe that when mathematicians say something like that, many times (though not always) they are actually engaging in a fallacy—what I call the fallacy of Hasty Application. (To the best of my knowledge, such a fallacy does not actually exist. I just made it up. It is meant to be a fallacy which is symmetrical to the existing fallacy of Hasty Generalization. Actually, a closely related fallacy of Hasty Concretization should also be possible.)

What mathematicians ought to be doing in instances like these—if they at all wish to concern themselves with applications (which they should!)—is to carefully spell out the structure and the meaning of the abstract physical model which they have assumed in so applying their favorite mathematical theory. [… It’s too much to ask of today’s mathematicians, I know…]

Abstract models do not always have to be mathematical; they can also be physical. Stronger: For proper integration, when it comes to applying mathematics to physics, you have no choice but to first find the corresponding abstract physical model(s). Today, both the physicists and the mathematicians have abdicated this responsibility. Yet, by the grace of reality—i.e. the nature of the human consciousness—explicit identification of the abstract physical model is a necessity. It is a necessity for the proper acquisition, validation, or application of knowledge.

A physical theory cannot be “done” just by giving some equations, say together with their units. A theory is not another name for a “cheat-sheet.”… Ultimately, truth is not established in reference to mere mental constructs; correspondence has to be established to the perceptual facts of reality.

That is what is involved in applying a mathematical theory or idea to some aspect of the physical reality. Too many mathematicians—and mathematics-lovers—stumble on this part.

If you don’t spell out the sense in which you are using the terms “light” and “matter,” if you don’t specify what you are already assuming towards the nature of the interconnections between these in your theory (or completely abstain from ever mentioning that anything of this sort is at all being taken for granted in such a BVP/IVP view of reality), if you don’t specify what the referents for your notions of space, time, or “spacetime” are, and then, without poring over any such a matter, if you brazenly go ahead and say that your favorite theory of hypergeometry is important because it is “directly” applicable to Einstein’s relativity, you are merely confessing either your ignorance or your incompetence in the matters of what it takes for a piece of mathematics to be applicable to an aspect of reality. And worse. It takes no brains to tell that you are using hypergeometry precisely because it helps you forward your favorite nonsensical, bizarre, convoluted ideas of nothingness wrapping and turning within itself as being physically real.

Even if mathematicians were to simply provide the meaning of all the essential concepts in a theory, say hypergeometry and relativity, I would be happy. I wouldn’t ask them—though this part, too, is necessary—to begin with hypergeometry’s relation to ordinary geometry and then do the same for the physical theory and demonstrate how the contents of that physical theory actually neither contradict nor surprise the common-sense. I wouldn’t expect them to do this second part, primarily because if the physical meaning of the concepts is known, if the physical model is well identified in the first place, then almost anyone could do the second part. It is the first part which is difficult. And it is the first part where mathematicians (and physicists, not to mention irrationalists) play all their magician’s tricks.

. . . . .

Though I don’t have the patience left to type out any more, think about this in the meanwhile:

Taking a typical example of the category of problems that are well-modeled as NP-complete/hard problems (e.g., the Traveling Salesman problem), it is very easy to see that there indeed are a large number of alternative paths possible. These problems, as we have seen, satisfy the difficult-to-solve-but-easy-to-verify criterion.

Then, as we saw for the case of the magic trick, it too satisfies the same difficult-to-solve-but-easy-to-verify criterion. Yet, observe that, in this problem, the number of alternatives (or possible solutions) is very small.

Indeed, even though it is known that this problem is difficult to solve, the said difficulty does not readily manifest itself as a large number of possibilities.

Further, even the alternative paths are not easy to isolate or conceive of, given just the physical observation of the trick. Each formulation of a hypothesis for the solution by itself requires a great degree of acuteness of observation, and an ability to form coherent and reasonable hypotheses… All in all, the scientific kind of creativity.

In contrast, in the mathematical NP problems, stating the alternatives available is as simple as mechanically enumerating them—it involves no creativity.

Observe that today’s mathematicians (and, unfortunately also today’s physicists, theoretical engineers, etc.) lionize the second but not the first.

…May be, more on some of these aspects, later…

– – – – –

Songs I Like [More or less at random]

1. (Hindi) “nayi manzil, nayi raahen…”
Music: Hemant Kumar
Singers: Lata Mangeshkar and Hemant Kumar
Lyrics: S. H. Bihari

2. (Hindi) “teraa meraa saath rahe…”
Music: Ravindra Jain
Singer: Lata Mangeshkar
Lyrics: Ravindra Jain (?)

– – – – –

PS: As usual, to be improved/streamlined…

Some loose-talk on the P and NP categories of problems

Do you know what the algorithmic complexity classes P and NP mean? If not, then look them up on the Internet (say, at Wikipedia). Better still, pick up Dennis Shasha’s book titled “Out of Their Minds.” (It was a great holiday read about a decade back, and it remains so even today.)

Given below is a very brief (and very roughly written) outline of what the terms P and NP mean.

Computer algorithms (and, by implication, the problems they solve) can be broadly classified into two categories: P, short for polynomial-time, and NP, short for non-deterministic polynomial-time. (Mathematicians do know that every problem in the P class also belongs to NP; what they do not know for sure is whether the two classes are co-extensive (or identical) or not, that is, whether NP contains problems which lie strictly outside of P. …) One way to approach understanding this classification scheme is the following.

When it comes to estimating computational costs of problems, two things help us characterize the situation: (i) the basic (or the “inherent” or the “intrinsic”) size of the computational problem itself, and (ii) the time required for solving that problem using a given algorithm. These two things are separate. Let us look at each one of them in some detail below.

Just to take an example, suppose your problem involves FEM (finite element method) as in computational solid mechanics, or FVM/FDM (finite volume or finite difference methods) as in CFD (computational fluid dynamics). FEM, FDM, FVM, etc. all ultimately translate into a matrix equation of the form [A]{x} = {b}. (Not so, directly, always, but certainly so for the static analysis case anyway.) Which means, they translate into a problem of simultaneous equations, really speaking. If there are N equations with N unknown variables, then the order of the square matrix [A] would also be N. If the mesh you use for your CFD modeling is made finer, the order of the [A] matrix, N, increases. Thus, one can say that N characterizes the basic size of the problem.
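To make this concrete, here is a minimal sketch in Python (a hypothetical 1D example, not any particular FEM/CFD code; the function name and the model problem -u'' = f are my own illustrative choices): discretizing by central differences yields exactly such an [A]{x} = {b} system, and refining the mesh increases the order N of [A].

```python
def assemble_1d_fdm(n, f=2.0):
    """Assemble [A]{x} = {b} for -u'' = f on (0, 1), u(0) = u(1) = 0,
    using central differences on n interior mesh nodes."""
    h = 1.0 / (n + 1)                      # mesh spacing
    # A is tridiagonal: 2/h^2 on the diagonal, -1/h^2 on the off-diagonals
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0 / h**2
        if i > 0:
            A[i][i - 1] = -1.0 / h**2
        if i < n - 1:
            A[i][i + 1] = -1.0 / h**2
    b = [f] * n                            # right-hand side {b}
    return A, b

# Refining the mesh (halving h) roughly doubles the order N of [A]:
A_coarse, b_coarse = assemble_1d_fdm(10)   # h = 1/11
A_fine, b_fine = assemble_1d_fdm(21)       # h = 1/22: N goes from 10 to 21
```

The point of the sketch is only the last two lines: the basic size N of the problem is set by the mesh, before any solution algorithm enters the picture.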

There is another consideration: Suppose we are given the particular values of the matrices [A] and {b}, and then are asked to solve the matrix equation [A] {x} = {b}. In short, we have to find what values the elements of the unknown column matrix {x} must possess so that the aforementioned matrix equation would be satisfied.

In finding the solution, we may use, for example, Gaussian elimination, or some other technique. All these techniques require certain multiplications and divisions to be carried out among the elements of [A] and {b}. Now, it is a well-known fact that the computer takes a much longer time (it consumes a greater number of chip-cycles) to do floating point multiplications and divisions, as compared to doing additions or subtractions. Additions and subtractions are extremely fast (even if done with double precision); multiplications and divisions (and powers and exponentials) are not. Hence, one important way to predict how long a particular solution algorithm would take (say, to solve the equation [A]{x} = {b}) is to ask: how many steps involving multiplication/division are carried out in executing that algorithm? You see, not all algorithms require an equal number of multiplications/divisions. Thus, they differ in terms of how many simple steps each has.

It is useful to express the number of (multiplication/division) steps taken in terms of a measure called “Computational Complexity”, denoted as O(N). The capital letter “O” stands for the word “Order” (of complexity). In finding out the values of O(N) for different algorithms, we are not interested in the details of the exact number of steps; only in a rough estimate of the number of steps.

Now, the actual number of steps necessary to solve the problem scales with the intrinsic size of the problem. Hence, it is useful to express the number of steps as a function of the size of the problem (N).

Thus, in general, we have an expression of the following form:

O(N) = f(N)

where f(N) is some function of N.

For instance, it is well known that the Gaussian elimination algorithm is numerically fairly stable (even if it is not as stable as so many people routinely suppose), and that its complexity scales as N^3. (The expression for its complexity also includes certain lower-order terms like those involving N^2, but these, not being dominant, are ignored in getting to the *order* of the complexity.)

The aforementioned two factors (N and O(N)) together determine how long a given computational problem would take.

Now, it is easy to tell what the terms P and NP refer to. Well, kind of.

The whole idea is to examine the nature of the expression for f(N). Is it linear in N? That is, does it carry the form: f(N) = k N, where k is some algebraic constant which is independent of N? Even if not, is it at least a polynomial in N? Or is it a really bad expression involving something like e^N or N^N, etc?

If f(N) is at most a polynomial in N, then we call it an algorithm of polynomial complexity, shortly denoted as “P”.

On the other hand, if it involves some other relation like the exponential or the factorial relation, then, such a function increases far too rapidly with the increases in N, and so, the complexity (or the time to complete the computational task) becomes impracticably large.

[Here, it would be useful trying out a concrete example. Plot the graphs of N vs f(N) where f(N) is given by: (a) N, (b) e^N, (c) log N, (d) N log N, (e) N^2, (f) 2^N, and (g) N!, for N ranging from, say, 1 to 50. You will immediately appreciate the point that the range of numerical values for complexity can be very wide.]
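The suggested comparison can be tabulated with a few lines of code (a quick sketch; the particular N values printed matter less than the gulf between the growth rates):

```python
import math

# The growth functions suggested in the exercise above
growth_fns = {
    "log N": lambda n: math.log(n),
    "N": lambda n: float(n),
    "N log N": lambda n: n * math.log(n),
    "N^2": lambda n: float(n**2),
    "2^N": lambda n: float(2**n),
    "e^N": lambda n: math.exp(n),
    "N!": lambda n: float(math.factorial(n)),
}

# Print one row per problem size; watch the last few columns explode
for n in (1, 10, 25, 50):
    row = ", ".join(f"{name}={fn(n):.3g}" for name, fn in growth_fns.items())
    print(f"N={n}: {row}")
```

Already at N = 50, the polynomial columns are in the thousands while N! is of the order of 10^64.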

Often times, it so happens that even if the algorithm to solve a particular class of problems is known, its corresponding f(N) expression is so bad that it is easy to see that the computer will never complete solving the problem using that particular algorithm. The predicted time for the algorithm to complete execution even on the fastest supercomputer can easily exceed billions of years of computer time.

So, in such cases, it’s a bit contradictory. We know the solution—but we don’t. To be more precise: We do know the algorithm, and yet, we also know that the only algorithm that is available cannot complete its run before the Sun collapses (which would happen after billions of years). (Interestingly, the algorithm can be proved to be correct—and yet, it cannot practically complete the run on a real computer.)
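A back-of-the-envelope sketch of this point (the one-billion-steps-per-second figure is an assumed, round number for a fast processor, not a measurement):

```python
import math

OPS_PER_SECOND = 1e9          # assumed machine speed: 10^9 elementary steps/second
SECONDS_PER_YEAR = 3.15e7     # approximate number of seconds in a year

def years_to_run(num_steps):
    """Estimated wall-clock years to execute num_steps elementary steps."""
    return num_steps / OPS_PER_SECOND / SECONDS_PER_YEAR

# A factorial-time algorithm on a mere 25-item instance:
print(years_to_run(math.factorial(25)))
```

For 25! steps, the estimate comes out to several hundred million years, on an instance of a laughably small size.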

A few more points (though I know this blog-entry has by now already gotten into the “novel” mode.)

Sometimes, there are certain helpful mathematical relations embedded within the elements of these matrices—helpful, as judged from the viewpoint of getting solutions faster. These are the relations that appear out of the particular *numerical values* that the matrix elements happen to have—not just their relative *placements* in the matrices. For instance, if the matrix [A] is in the tridiagonal form, then it is possible to solve the equation [A]{x} = {b} in a time of the order of just N, not N^3. This algorithm is called TDMA (which is short for Tri-Diagonal Matrix Algorithm). (OK, it won’t be quite N; there would also be a constant factor appearing in front of N, but the point here is that the constant factor would be, presumably, relatively small, and hence, practically, not very relevant in estimating the order of the computational complexity.) So, if the matrix model is amenable to the TDM algorithm, it can execute very very very fast. (The computational time can come down from years to a few hours. Literally.)
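For the curious, the TDM algorithm itself fits in a dozen lines. Here is a minimal sketch (it does no pivoting, so it assumes the matrix is well-behaved, e.g., diagonally dominant, as the [A] matrices from 1D FDM/FVM discretizations typically are):

```python
def tdma(a, b, c, d):
    """Solve a tridiagonal system in O(N) time (the Thomas algorithm).
    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side."""
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):                     # forward sweep: one pass, O(N)
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):            # back substitution: one pass, O(N)
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# The classic [-1, 2, -1] stencil: this system's exact solution is (1, 1, 1)
x = tdma([0, -1, -1], [2, 2, 2], [-1, -1, 0], [1, 0, 1])
```

Note the two single passes over the data: that is where the O(N), as against O(N^3), cost comes from.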

Given this observation, naturally, people have spent some time in finding out if matrices can be “pre-condition”ed so that they can be put in a form that makes them amenable to a faster solution. (There is also another reason for preconditioning: they also try to precondition the numbers if doing so can ensure stability in the subsequent solution procedure.) The idea here is that the preconditioning operation itself would be computationally much lower-cost but that it would put the matrix in such a shape that a much faster running solution algorithm can then be applied to the preconditioned matrix as the second step. Thus, though preconditioning is an *additional* operation, overall, it can still save time.

Another important observation: Practically, it is often possible (or preferable) to settle for an approximate but fast method of solution, if the approximate answer is “close enough.” There is an important category of algorithms that do just this: they supply a series of solutions, each step of which gives an approximate solution, with each step in the series being better refined than the last. For instance, the multigrid and multipole algorithms exhibit precisely this nature. What determines if an approximate solution is “close enough”? Answer: The purpose behind carrying out the computation.

—–

Most of the above discussion was from the viewpoint of the computational engineer, and not of the computer scientist. Let me add a little bit from the second perspective before closing this piece.

The P and NP classes, as we saw above, are decided in reference to the question: can you give a polynomial expression for the running time of the given algorithm or not? If you can, then the algorithm is in the P class; if not, then, roughly speaking, the problem would belong to the NP class. This is not at all exact, but it will do for our purposes here. The origin of the NP class actually lies in the theory of automata and languages, not in numerical analysis. The two classes are not theoretically well-related to each other except from the computational complexity angle alone. … Indeed, whether they are related or not is a million-dollar question, literally… The trouble is, no published proof tells whether or why a P-class algorithm can or cannot be formulated for that class of problems which is known as the NP class. (For more information on the million-dollar prize declared for this open problem, visit The Clay Mathematics Institute, http://www.claymath.org.)

Actually, the absence of a polynomial expression (so far) is just one of the interesting characteristics of NP. The other is: ease of solution verification. What the latter means is the following.

Let us take the traveling salesman problem as a prime example of the NP class of problems. The problem may be stated thus: Suppose a salesman has to visit each of, say, ten cities; suppose those cities are, say, Mumbai, Delhi, Calcutta, Chennai, Bangalore, Ahmedabad, Hyderabad, Pune, Nagpur, and Kanpur. He may visit the given cities in any order, for example, Mumbai first, then Chennai, then Ahmedabad, then Delhi … etc. Or, he may very well choose some other order for scheduling his visits, say, Delhi first, then Bangalore, then Mumbai… etc. And, of course, there is a fare to be paid for travel between any given pair of cities. The fare will, in general, differ from pair to pair. The question is: If the salesman has to cover all the cities, visiting each city exactly once, then what is the best possible order in which he should visit these cities so that his total traveling expenses are the lowest possible? (You may also pose this problem in terms of the total distance traveled. The objective then would be to minimize the total km for a tour that covers all the cities, visiting each city only once.)

In this example, N, the size of the problem, is 10.

An “easy” (or absolutely sure-shot) way to find the solution would be to make a table consisting of two columns. In the first column, we could make a list of each possible route (that goes through all the cities but visits each city only once). In the second column, we could enter the total fare/distance which would correspond to this particular choice of the route. Once such a table, enumerating all possible routes, is constructed, it would be an easy matter to pick up the route that has the least cost associated with it.
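That two-column table is exactly what a brute-force program builds. A minimal sketch (the four-city fare table below is made up purely for illustration; `itertools.permutations` mechanically generates the first column):

```python
from itertools import permutations

# Hypothetical fare table for 4 cities; fares[i][j] = cost of the leg i -> j
fares = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

def route_cost(route):
    """Total fare for visiting the cities in the given order, each exactly once."""
    return sum(fares[a][b] for a, b in zip(route, route[1:]))

# The two-column "table": every possible route, paired with its total cost
table = [(route, route_cost(route)) for route in permutations(range(len(fares)))]

# Picking the cheapest row is then trivial
best_route, best_cost = min(table, key=lambda row: row[1])
```

For 4 cities the table has only 4! = 24 rows; it is the size of this table, not the final `min`, that blows up with N.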

OK. So, what’s the point if the solution is so simple?

The point is this. Before you can decide which particular route would be the cheapest one, you first have to enumerate all the possible routes. And it is this step—enumeration of the alternatives and finding out the cost for each possible option—which can easily become a computationally expensive operation.

For just 10 cities (N = 10), it is relatively easy to enumerate all possible routes, but that’s only because 10! is such a small number (comparatively speaking). But the factorial function grows very rapidly with N. (If you are a student of science or engineering, right away look up the Stirling approximation for N!) For instance, 5! is just 120, but 10! already is 3,628,800, and 20! is a mind-boggling 2,432,902,008,176,640,000 (i.e. approx. 2.4 billion billions). … When N approaches a value of thousands or tens of thousands (and in practice, we do run into far larger models), then enumeration of each possible alternative itself begins to go out of the reach of even the most powerful super-computer in the world (or all of them put together).
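Here is the Stirling approximation check in code (a quick sketch): for N = 20 the approximation is already within about half a percent of the exact value, which is why it is so useful for estimating how fast N! blows up.

```python
import math

def stirling(n):
    """Stirling's approximation: N! is approximately sqrt(2*pi*N) * (N/e)^N."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

exact = math.factorial(20)        # 2,432,902,008,176,640,000
approx = stirling(20)
relative_error = abs(approx - exact) / exact   # comes out around 0.4%
```

Because the (N/e)^N factor dominates, the approximation also makes the super-exponential growth of N! obvious at a glance.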

“OK, but why bother?” you may say. “Is it really practical? Does it really matter whether a salesman’s budget goes a little beyond what is absolutely optimal? After all, in the real world, salesmen do travel, and they do sell, and companies do make profits… Why do nit-picking about optimality?”

The answer to that question is: for an actual traveling salesman, minimizing the tour budget may not at all be an issue. Yet, this category of problems is important… If you build a mathematical model of certain practically important problems, like choosing the best possible sequence of forwarding stations for routing a telephone call (or Internet data traffic), then that mathematical model looks exactly like the traveling salesman problem. And billions of dollars (literally) stand to be saved if you can find a method that will tell the optimum route in a fast manner (fast, as in real-time).

OK. So, the traveling salesman problem is practically important.

And, it falls into the NP category.

One funny thing with these NP problems is that if you already have a candidate solution, it is very easy to verify that it works. Verification is very easy. But arriving at the solution itself is not. The latter takes too long, or is too pricey. After all, all that you have to do in verification is to add up the fares along the given route and check that the total indeed falls below the stated cost target: an operation that takes just a single pass over the route. (Note that verifying that a given route is the absolute cheapest of all is a different, and much harder, matter; the easy verification refers to checking a given route against a given cost target.) So, verification is easy—provided the solution to be verified itself is known in the first place.
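In code, the verification step is almost trivial: a single pass over the claimed route (the fare table here is hypothetical, made up purely for illustration):

```python
# Hypothetical fare table for 4 cities; fares[i][j] = cost of the leg i -> j
fares = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

def verify(route, claimed_bound):
    """Check that the route visits every city exactly once, and that its
    total fare falls at or below claimed_bound. A single O(N) pass."""
    if sorted(route) != list(range(len(fares))):
        return False                                 # not a valid tour at all
    total = sum(fares[a][b] for a, b in zip(route, route[1:]))
    return total <= claimed_bound

print(verify((2, 0, 1, 3), 50))   # True: this route costs exactly 50
print(verify((0, 1, 2, 3), 50))   # False: this one costs 75
```

Contrast the single summation here with the factorial-sized enumeration needed to *find* a route meeting the bound: that contrast is the whole asymmetry in a nutshell.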

Another funny thing with these NP problems is the ease with which they can come up in applications. … Sometimes it looks as if the whole practical world is full of just the NP type of problems, and that the P category is rather an exception than the norm. … Incidentally, this precisely is one reason why much creativity is needed even in such seemingly mundane tasks as relational database design. Anybody could establish relations among tables, and proceed to normalize them using certain set rules, but deciding how to abstract what kind of table from what real-world objects… Now, that part is difficult… It’s pretty much an art… And it will continue to be so, precisely because NP is so easy to run into… As another example, consider the problem of optimally populating the school time-table—assigning teachers and students to different class-rooms in different time-slots. This seems to be a very simple problem, but it (I suppose) can be proved to be in NP. … The practical world, indeed, is full of NP situations.

All this was a background, really speaking, to get to know a bit about the P and NP classes of algorithms.

Next time, I will give a particularly attractive example of a, vaguely speaking, NP kind of situation. I have been planning to do so for quite some time by now. And the next time, the description won’t involve even this much of mathematics, I promise! (Though it becomes very tough to write precisely that way. I began by wanting to write a verbal but non-mathematical description; found that I could not; so, I ended up making this piece a “loose-talk.” If you have some suggestions to make it less loose, esp. concerning the description of the NP class, then sure do drop a line… Top rankers of IIT Bombay (AIR 4 and all) are especially welcome to attempt writing (LOL!))