Hmmm… Slightly more than 3 weeks since I posted anything here. A couple of things happened in the meanwhile.
1. Wrapping up of writing QM scripts:
First, I wrapped up my simulations of QM. I had reached a stage (just in my mind, neither on paper nor on laptop) where the next thing to implement would have been the simplest simulations using my new approach. … Ummm… I am jumping ahead of myself.
OK, to go back a bit. The way things happened, I had just about begun pursuing Data Science when this QM thingie (conference) suddenly came up. So, I had to abandon Data Science as is, and turn my attention full-time to QM. I wrote the abstract, sent it to the conference, and started jotting down some of the early points for the eventual paper. Frequent consultations with text-books were a part of it, and so was searching for any relevant research papers. Then, I also began doing simulations of the simplest textbook cases, just to see if I could find any simpler route from the standard / mainstream QM to my re-telling of the facts covered by it.
Then, as things turned out, my abstract for the conference paper got rejected. However, now that I had gotten into a tempo of writing and running the simulations, I decided to complete at least those standard UG textbook cases before wrapping up this entire activity and going back to Data Science. My last post was written when I was in the middle of this activity.
While thus pursuing the standard cases of textbook QM (see my last post), I also browsed a lot, thought a lot, and eventually found that simulations involving my approach shouldn’t take as long as a year, not even several months (as I had mentioned in my last post). What happened here was that during the aforementioned activity, I ended up figuring out a far simpler way that should still illustrate certain key ideas from my new approach.
So, the situation, say in the first week of December, was the following: (i) Because the proposed paper had been rejected, there was no urgency for me to continue working on the QM front. (ii) I had anyway found a simpler way to simulate my new approach, and the revised estimates were that even while working part-time, I should be able to finish the whole thing (the simulations and the paper) over just a few months’ period, say next year. (iii) At the same time, studies of Data Science had anyway been kept on the back-burner.
That’s how (and why) I came to wrap up all my activity on the QM front, first thing.
I then took a little break. I then turned back to Data Science.
2. Back to Data Science:
As far as learning Data Science goes, I knew from my past experience that books bearing titles such as: “Learn Artificial Intelligence in 3 Days,” or “Mastering Machine Learning in 24 Hours,” if available, would have been very deeply satisfying, even gratifying.
However, to my dismay, I found that no such titles exist. … Or, maybe, such books are there, but someone at Google is deliberately suppressing the links to them. Whatever the case, forget becoming a Guru in 24 hours (or even in 3 days), I found that no one was promising me that I could master even just one ML library (say TensorFlow, or at least scikit-learn) over even a longer period, say about a week's time or so.
Sure there were certain other books—you know, books which had blurbs and reader-reviews which were remarkably similar to what goes with those mastering-within-24-hours sort of books. However, these books had less appealing titles. I browsed through a few of these, and found that there simply was no way out; I would have to begin with Michael Nielsen’s book [^].
Which I did.
Come to think of it, the first time I had begun with Nielsen's book was way back, in 2016. At that time, I had not gone beyond the first couple of sections of the first chapter or so. I certainly hadn't even gone through the first code snippet that Nielsen gives, let alone run it, or tried any variations on it.
This time around, though, I decided to stick it out with this book. I had to. … What was the end result?
Well, unlike my usual self, I didn't take any jumps while going through this particular book. I began reading it in the given sequence, and then found that I could even continue with the same (i.e., reading in sequence)! I also made some furious underlines, margin-notes, end-notes, and all that. (That's right. I was not reading this book online; I had first taken a printout.) I also sketched a few data structures in the margins, notably for the code around the "w" matrices. (I tend to suspect everyone else's data structures except mine!) I pursued this activity covering about everything in the book, except for the last chapter. It was at this point that my patience finally broke down. I went back to my usual self and began jumping back and forth over the topics.
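(For the record, the data structure in question, as I remember it from Nielsen's `network.py`, is just a Python list of NumPy arrays, one weight matrix per layer-to-layer connection. Treat the following as my sketch from memory, not his exact code; the layer sizes are the MNIST ones he uses:)

```python
import numpy as np

# Layer sizes: 784 input pixels, 30 hidden neurons, 10 output classes.
sizes = [784, 30, 10]

# Biases: one column vector per non-input layer.
biases = [np.random.randn(y, 1) for y in sizes[1:]]

# Weights: weights[l] has shape (next_layer_size, this_layer_size),
# so that the feedforward step is simply a = sigmoid(w @ a + b).
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

for w in weights:
    print(w.shape)
```

The (rows, columns) = (next layer, current layer) convention is the thing worth sketching in the margin: it is what makes `w @ a` come out as a column vector of the next layer's activations.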
As a result, I can’t say that I have finished the book. But yes, I think I’ve got a fair idea of what’s in it.
3. What books to read after Nielsen’s?
Of course, Nielsen’s book wasn’t the only thing that I pursued over the past couple of weeks. I also very rapidly browsed through some other books, checked out the tutorial sites on libraries like scikit-learn, TensorFlow, etc. I came to figure out two things:
As the first thing, I found that I was unnecessarily getting tense when I saw young people casually toss around some fearsome words like “recurrent learning,” “convolutional networks,” “sentiment analysis,” etc., all with such ease and confidence. Not just on the ‘net but also in real life. … I came to see them do that when I attended a function for the final-rounds presentations at Intel’s national-level competition (which was held at IISER Pune, a couple of months ago or so). Since I had seen those quoted words (like “recurrent learning”) only while browsing through text-books or Wiki articles, I had actually come to feel a bit nervous at that event. Ditto, when I went through the Quora answers. Young people everywhere in the world seemed to have put in a lot of hard work in studying Data Science. “When am I going to catch up with them, if ever?” I had thought.
It was only now, after going through the documentation and tutorials for these code libraries (like scikit-learn), that I came to realize that the most likely scenario here was that most of these kids were simply talking after trying out a few ready-made tutorials or so. … Why, one of the prize-winning (or at least short-listed) presentations at that Intel competition was about particle-swarm optimization, and during their talk, the students had even shown a neat visualization of how this algorithm works when there are many local minima. That presentation had impressed me a lot. … Now I gathered that it was just a ready-made animated GIF lifted from KDNuggets or some other, similar, site… (Well, as it turns out, it must have been from the Wiki! [^])
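(And indeed, a bare-bones particle-swarm optimizer of the kind those animations illustrate takes only a few lines of NumPy. The following is purely my own illustrative sketch, not the students' code; the Rastrigin test function and all the coefficient values are standard textbook choices:)

```python
import numpy as np

rng = np.random.default_rng(0)

# Rastrigin function: a classic multimodal test case with many
# local minima; the global minimum is 0 at the origin.
def rastrigin(x):
    return 10 * x.shape[-1] + np.sum(x**2 - 10 * np.cos(2 * np.pi * x), axis=-1)

n_particles, dim, n_iters = 30, 2, 200
w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, and social coefficients

x = rng.uniform(-5.12, 5.12, (n_particles, dim))  # positions
v = np.zeros_like(x)                              # velocities
pbest = x.copy()                                  # each particle's best position so far
pbest_val = rastrigin(x)
gbest = pbest[np.argmin(pbest_val)]               # the swarm's best position so far

for _ in range(n_iters):
    r1, r2 = rng.random((2, n_particles, dim))
    # Standard velocity update: inertia + pull toward personal best
    # + pull toward the swarm's global best.
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x = x + v
    val = rastrigin(x)
    improved = val < pbest_val
    pbest[improved], pbest_val[improved] = x[improved], val[improved]
    gbest = pbest[np.argmin(pbest_val)]

print(gbest, rastrigin(gbest))
```

The visualizations one sees simply plot the `x` positions at each iteration over a contour map of the function; the swarm visibly collects around ever-better minima as the personal-best and global-best pulls take over.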
As the second thing, I realized that for those topics which Nielsen doesn’t cover, good introductory books are hard to find. (That was a bit of an understatement. My real feel here is that, we are lucky that Nielsen’s book is at all available in the first place!)
…If you have any tips on a good book after Nielsen’s then please drop me an email or a comment; thanks in advance.
4. A tentative plan:
Anyway, as of now, a good plan seems to be: (i) first, to complete the first pass through Nielsen’s book (which should take just about a couple of days or so), and then, to begin pursuing all of the following, more or less completely simultaneously: (ii) locating and going through the best introductory books / tutorials on other topics in ML (like PCA, k-means, etc.); (iii) running tutorials of ML libraries (like scikit-learn and TensorFlow); (iv) typing out LaTeX notes for Nielsen’s book (which would be useful eventually for such things as hyper-parameter tuning), and running modified (i.e., simplified) versions of his code (which means, the second pass through his book); and finally (v) beginning to cultivate some pet project from Data Science for moonlighting over a long period of time (just the way I have maintained a long-running interest in micro-level water-resources engineering).
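(Just to give an idea of what items (ii) and (iii) involve in practice, here is the sort of minimal scikit-learn run, chaining PCA and k-means on toy data, that such a tutorial pass typically starts with. The dataset and all the parameter values are only illustrative choices of mine:)

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Toy data: three well-separated Gaussian blobs in 5 dimensions.
X, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=42)

# PCA: project the 5-D data down to its 2 leading principal components.
X2 = PCA(n_components=2).fit_transform(X)

# k-means: recover the three clusters from the projected data.
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X2)

print(X2.shape, np.bincount(km.labels_))
```

Nothing deep here, of course; the point of such a run is only to get the mechanics (the `fit` / `transform` / `labels_` conventions) into one's fingers before tackling real data.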
As to the topic for the pet project, here are the contenders as of today. I have not finalized anything just as yet (and am likely not to do so for quite some time), but the following seem to be attractive: (a) Predicting rainfall in India (though getting granular enough data is going to be a challenge), (b) Predicting earth-quakes (locations and/or intensities), (c) Identifying the Indian classical “raaga” of popular songs, etc. … I also have some other ideas but these are more in the nature of professional interests (especially, for application in engineering industries). … Once again, if you feel there is some neat idea that could be adopted for the pet project, then sure point it out to me. …
…Anyway, that’s about it! Time to sign off. Will come back next year—or if some code / notes get written before that, then even earlier, but no definite promises.
So, until then, happy Christmas, and happy new year!…
A song I like:
(Marathi) “mee maaze mohita…”
Lyrics: Sant Dnyaaneshwar
Music and Singer: Kishori Amonkar
[One editing pass is still due; it should be effected within a day or two. Done on 2018.12.18 13:41 hrs IST.]