Yeah! Just that!


Update on 2020.02.17 16:02 IST:

The above is a snap I took yesterday at the Bhau Institute [^]’s event: “Pune Startup Fest” [^].

The reason I found myself laughing out loud was this: Yesterday, some of the distinguished panelists made one thing very clear: The valuation for the same product is greater in the S.F. Bay Area than in Pune, because the eco-system there is much more mature, with the investors there having seen many more exits—whether successful or otherwise.


When I was in the USA (which was in the 1990s), they would always say that not everyone has to rush to the USA, especially to the S.F. Bay Area, because technology works the same way everywhere, and hence, people should rather be going back to India. The “they” of course included the Indians already established there.

In short, their never-stated argument was this much: You can make as much money by working from India as from the SF Bay Area. (Examples of the “big three” of the Indian IT industry would often be cited, esp. that of Narayana Murthy.) So, “why flock in here”?

Looks like, even if they took some 2–3 decades to do so, finally, something better seems to have dawned on them. They seem to have gotten to the truth, which is: Market valuations for the same product are much greater in the SF Bay Area than elsewhere!

So, this all was in the background, in the context.

Then, I was musing about their rate of learning last night, and that’s when I wrote this post! Hence the title.

But of course, not everything about, or in, the event was laughable.

I particularly liked Vatsal Kanakiya’s enthusiasm (the second guy from the right in the above photo; his LinkedIn profile is here [^]). I appreciated his ability to keep on highlighting what they (their firm) are doing, despite the somewhat cocky (if not outright dismissive) way in which his points were being received, at least initially. Students attending the event might have found his enthusiasm more in line with theirs, especially after he not only mentioned Guy Kawasaki’s 10-20-30 rule [^], but also cited a statistic from their own office to support it: 1892 proposals last month (if I got that figure right). … Even if he was very young, it was this point which finally made it impossible, for many in that hall, to be too dismissive of him. (BTW, he is from Mumbai, not Pune. (Yes, COEP is in Pune.))


A song I like:

(Hindi) ये मेरे अंधेरे उजाले ना होते (“ye mere andhere ujaale naa hote”)
Music: Salil Chowdhury
Singers: Talat Mahmood, Lata Mangeshkar
Lyrics: Rajinder Kishen

[Buildings made from the granite stone [I studied geology in my SE i.e. second year of engineering] have a way of reminding you of a few songs. Drama! Contrast!! Life!!! Money!!!! Success!!!!! Competition Success Review!!!!!! Governments!!!!!!! *Business*men!!!!!!!!]



Equations in the matrix form for implementing simple artificial neural networks

(Marathi) हुश्श्… [translit.: “hushsh…”, equivalent word prevalent among the English-speaking peoples: “phewww…”]

I’ve completed the first cut in writing a document of the same title as that of this post. I wrote it in LaTeX. (Too many equations!)

I’ve just uploaded the PDF file at my GitHub account, here [^]. Remember, it’s still only in the alpha stage. (A beta release will follow after a few days. The final release may take place after a couple of weeks or so.)

Below the fold, I copy-paste the abstract and the preface of this document.

“Equations in the matrix form for implementing simple artificial neural networks”


This document presents the basic equations in reference to which artificial neural networks are designed and implemented. The scope is restricted to
the simpler feed-forward networks, including those having hidden layers. Convolutional and recurrent networks are out of the scope.

Equations are often initially noted using an index-based notation for the typical element. However, all the equations are eventually cast in the direct
matrix form, using a consistent set of notation. Some of the minor aspects of notation were invented to make the presentation as simple and direct as
possible.

The presentation here regards a layer as the basic unit. The term “layer” is understood in the same sense in which APIs of modern libraries like
TensorFlow-Keras 2.x take it. The presentation here is detailed enough that neural networks with hidden layers could be implemented, starting from
scratch.


Raison d’être:

I wrote this document mainly for myself, to straighten out the different notations and formulae used in different sources and contexts.

In particular, I wanted to have a document that better matches the design themes used in today’s libraries (like TensorFlow-Keras 2.x) than the description in the text-books.

For instance, in many sources, the input layer is presented as consisting of both a fully connected layer and its corresponding activation layer. However, for flexibility, libraries like TF-Keras 2.x treat them as separate layers.
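To make the separation concrete, here is a minimal NumPy sketch. It is only an illustration and does not reproduce the notation of the PDF: the class names `Dense` and `Sigmoid`, the `forward` method, the helper `forward_pass`, and the layer sizes are all assumptions made for this example.

```python
import numpy as np


class Dense:
    """Fully connected layer: z = x @ W + b (no activation applied here)."""

    def __init__(self, n_in, n_out, rng):
        # Assumed layout: rows index the inputs, columns index the outputs.
        self.W = rng.normal(scale=0.1, size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, x):
        return x @ self.W + self.b


class Sigmoid:
    """Activation layer: applies the logistic function element-wise."""

    def forward(self, z):
        return 1.0 / (1.0 + np.exp(-z))


def forward_pass(layers, x):
    """Feed x through the layers, in order."""
    for layer in layers:
        x = layer.forward(x)
    return x


rng = np.random.default_rng(0)
# Each fully connected layer is followed by its own, separate activation layer.
layers = [Dense(3, 4, rng), Sigmoid(), Dense(4, 2, rng), Sigmoid()]
batch = rng.normal(size=(5, 3))  # 5 samples, 3 features each
out = forward_pass(layers, batch)
print(out.shape)
```

Because the activation is its own layer, swapping in a different activation, or dropping it altogether, means changing one entry in the list and nothing else.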

Also, some sources uniformly treat the input of any layer as \vec{X}, and output of any layer as activation, \vec{a} , but such usage overloads the term “activation”. Confusions also creep in because different conventions exist: treating the bias by expanding the input vector with 1 and the weights matrix with w_0 ; the “to–from” vs “from–to” convention for the weights matrix, etc.
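The conventions mentioned above all compute the same numbers; only the bookkeeping differs. The following NumPy sketch checks this on a single layer. The variable names (`W_ft`, `W_tf`, `x_aug`, `W_aug`) and the sizes are assumptions made for this illustration, not the notation used in the PDF.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_out = 3, 2

x = rng.normal(size=n_in)              # one input vector
W_ft = rng.normal(size=(n_in, n_out))  # "from-to": W_ft[i, j] links input i to output j
b = rng.normal(size=n_out)             # one bias per output

# Convention A: explicit bias, "from-to" weights, input multiplies from the left.
z_a = x @ W_ft + b

# Convention B: "to-from" weights are just the transpose; input multiplies from the right.
W_tf = W_ft.T                          # W_tf[j, i] links input i to output j
z_b = W_tf @ x + b

# Convention C: absorb the bias by prepending 1 to the input
# and prepending the bias as an extra row (the "w_0" weights) to the matrix.
x_aug = np.concatenate(([1.0], x))     # shape (n_in + 1,)
W_aug = np.vstack((b, W_ft))           # shape (n_in + 1, n_out)
z_c = x_aug @ W_aug

print(np.allclose(z_a, z_b), np.allclose(z_a, z_c))
```

Mixing up these conventions typically shows up as a shape mismatch or as a silently transposed weights matrix, which is why fixing one convention up front helps.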

I wanted to have a consistent notation that dealt with all such issues with a uniform, matrix-based notation that came as close to the numpy ndarray interface as possible.

Level of coverage:

The scope here is restricted to the simplest ANNs, including the simplest DL networks. Convolutional neural networks and recurrent neural networks are out of the scope.

Yet, this document wouldn’t make for a good tutorial for a complete beginner; it is likely to confuse more than it explains. So, if you are completely new to ANNs, it is advisable to first go through sources like Nielsen’s online book [^] to learn the theory of ANNs. Mazur’s fully worked out example of the back-propagation algorithm [^] should also prove very helpful before returning to this document.

If you already know ANNs, and don’t want to see equations in the fully expanded forms (or plainly dislike the notation used here), then a good reference, roughly at the same level as this document, is the set of write-ups/notes by Mallya [^].


Any feedback, especially that regarding errors, typos, inconsistencies in notation, suggestions for improvements, etc., will be thankfully received.

How to cite this document:

TBD at the time of the final release version.

Further personal notings:

I began writing this document on 24 January 2020. By 30 January 2020, I had some 11 pages done up, which I released via the last post.

Unfortunately, it was too tentative, with a lot of errors, misleading or inconsistent notation, etc. So, I deleted it within a day. No point in having premature documents floating around in cyberspace.

I had mentioned, right in the last post here on this blog (on 30 January 2020), that the post itself also would be gone. I will keep it for a while, and then, maybe after a week or two, delete it.

Anyway, by the time I finished the alpha version today, the document had grown from the initial 11 pages to some 38 pages!

Typing out all the braces, square brackets, parentheses, subscripts for indices, subscripts for sizes of vectors and matrices… It all was tedious. … Somehow, I managed to finish it. (Will think twice before undertaking a similar project, but am already tempted to write a document each on CNNs and RNNs, too!)

Anyway, let me take a break for a while.

If interested in ANNs, please go through the document and let me have your feedback. Thanks in advance, take care, and bye for now.

A song I like:

[Just listen to Lata here! … Not that others don’t get up to the best possible levels, but still, Lata here is, to put it simply, heavenly! [BTW, the song is from 1953.]]

(Hindi) जाने न नजर पहचाने जिगर (“jaane naa najar pahechane jigar”)
Singers: Lata and Mukesh
Music: Shankar-Jaikishen
Lyrics: Hasrat Jaipuri


Equations using the matrix notation for a simple artificial neural network

Update on 2020.01.30 16:58 IST:

I have taken the document offline. Yes, it was too incomplete, tentative, and in fact also had errors. The biggest and most obvious error was about the error vector. 🙂

No, no one pointed out any of the errors or flaws to me.

Yes, I will post an expanded and revised version later, hopefully in the first week of February. (I started work on this document on 24th Jan., but also was looking into other issues.) When I am done, I will delete this entire post, and make a new entry to announce the availability of the corrected, expanded and revised document.

The original post appears below.

Go, read [^].

Let me know about the typos. Also, errors. [Though I don’t expect you to do that. [I will eat my estimate of your moral character, on this count, at least for the time being.]]

I can, and might, take it out of the published domain, any time I want. [Yes, I am irresponsible, careless, unreliable, etc. [Also, “imposing” type. Without “team-spirit”. One who looks down on his colleagues as being way, way, inferior to me.]]

I will also improve on it—I mean the document. In fact, I even intend to expand it, with some brief notes to be added on various activation- and loss-functions.

Eventually, I may even publish it at GitHub. Or, at arXiv. [If they let me do that. But then, another consideration: there are physicists there!]


A song I like:

(Hindi) “कभी तो मिलेगी” (kabhee to milegi)
Singer: Lata
Music: Roshan
Lyrics: Majrooh Sultanpuri

[Credits happily listed in a mostly random order. [The issue was only with the last two; it was clear which one had to appear first.]]