Data Science links—1

Oakay… My bookmarks library has grown too big. Time to move at least a few of them to a blog-post. Here they are. … The last one is not on Data Science, but it happens to be the most important one of them all!



On Bayes’ theorem:

Oscar Bonilla. “Visualizing Bayes’ theorem” [^].

Jayesh Thukarul. “Bayes’ Theorem explained” [^].

Victor Powell. “Conditional probability” [^].


Explanations with visualizations:

Victor Powell. “Explained Visually.” [^]

Christopher Olah. Many topics [^]. For instance, see “Calculus on computational graphs: backpropagation” [^].


Fooling the neural network:

Julia Evans. “How to trick a neural network into thinking a panda is a vulture” [^].

Andrej Karpathy. “Breaking linear classifiers on ImageNet” [^].

A. Nguyen, J. Yosinski, and J. Clune. “Deep neural networks are easily fooled: High confidence predictions for unrecognizable images” [^]

Melanie Mitchell. “Artificial Intelligence hits the barrier of meaning” [^]


The Most Important link!

Ijad Madisch. “Why I hire scientists, and why you should, too” [^]


A song I like:

(Western, pop) “Billie Jean”
Artist: Michael Jackson

[Back in the ’80s, this song used to get played in the restaurants from the Pune camp area, and also in the cinema halls like West-End, Rahul, Alka, etc. The camp area was so beautiful, back then—also uncrowded, and quiet.

This song would also come floating on the air while we sat in the evening at the Quark cafe, situated in the middle of all the IITM hostels (next to the skating rink). Some guy or other would be playing it in a nearby hostel room on one of those stereo systems that came with 1 or 2 feet tall “hi-fi” speaker-boxes. Each box typically had three stacked speakers. A combination of a separately sitting sub-woofer with a few small other boxes or a soundbar, so ubiquitous today, had not been invented yet… Back then, Quark was a completely open-air cafe: a small patch of ground surrounded by small trees, and a tiny hexagonal hut, built in RCC, for serving snacks. There were not even benches at Quark. People would sit on small concrete blocks (brought from the civil department, where they had come for testing). Deer would be roaming close by. A daring one or two would venture forward and eat pizza out of your (fully) extended hand!…

…Anyway, coming back to the song itself, I had completely forgotten it, but got reminded when @curiouswavefn mentioned it in one of his tweets recently. … When I read the tweet, I couldn’t make out that it was this song (apart from Bach’s variations) that he was referring to. I just idly checked out both of them, and then, while listening to it, I suddenly recognized this song. … You see, unlike so many other guys from the e-schools of our times, I wouldn’t listen to a lot of Western pop songs in those days (and still don’t). Beatles, ABBA and a few other groups/singers, maybe; also the Western instrumentals (a lot) and the Western classical music (some, but definitely). But somehow, I was never too much into the Western pop songs. … Another thing. The way these Western singers sing, it used to be very, very hard for me to figure out the lyrics back then, and the situation continues mostly the same way even today! So, recognizing a song by its name was simply out of the question….

… Anyway, do check out the links (even if some of them appear to be out of your reach on the first reading), and enjoy the song. … Take care, and bye for now…]

 


Flames not so old…

The same picture, but two American interpretations, both partly misleading (to varying degrees):

NASA releases a photo [^] on Facebook, on 24 August at 14:24, with this note:

The visualization above highlights NASA Earth satellite data showing aerosols on August 23, 2018. On that day, huge plumes of smoke drifted over North America and Africa, three different tropical cyclones churned in the Pacific Ocean, and large clouds of dust blew over deserts in Africa and Asia. The storms are visible within giant swirls of sea salt aerosol (blue), which winds loft into the air as part of sea spray. Black carbon particles (red) are among the particles emitted by fires; vehicle and factory emissions are another common source. Particles the model classified as dust are shown in purple. The visualization includes a layer of night light data collected by the day-night band of the Visible Infrared Imaging Radiometer Suite (VIIRS) on Suomi NPP that shows the locations of towns and cities.

[Emphasis in bold added by me.]

For your convenience, I reproduce the picture here:

Aerosol data by NASA. Red means: Carbon emissions. Blue means: Sea Salt. Purple means: Dust particles.

Nicole Sharp writes about it [^] on her blog FYFD, on Aug 29, 2018 at 10:00 am, with this description:

Aerosols, micron-sized particles suspended in the atmosphere, impact our weather and air quality. This visualization shows several varieties of aerosol as measured August 23rd, 2018 by satellite. The blue streaks are sea salt suspended in the air; the brightest highlights show three tropical cyclones in the Pacific. Purple marks dust. Strong winds across the Sahara Desert send large plumes of dust wafting eastward. Finally, the red areas show black carbon emissions. Raging wildfires across western North America are releasing large amounts of carbon, but vehicle and factory emissions are also significant sources. (Image credit: NASA; via Katherine G.)

[Again, emphasis in bold is mine.]

As of today, Sharp’s post has collected some 281 notes, and almost all of them have “liked” it.

I liked it too, except for the last half of the last sentence, viz., the idea that vehicle and factory emissions are significant sources (cf. NASA’s characterization).


My comment:

NASA commits an error of omission. Dr. Sharp compounds it with an error of commission. Let’s see how.

NASA does find it important to mention that the man-made sources of carbon are “common.” However, the statement is ambiguous, perhaps deliberately so. It curiously omits to mention that the quantity contributed by such “common” sources is so small that there is no choice but to regard it as “not critical.” We may not be in a position to call the “common” part an error of commission. But not explaining that the man-made sources play a negligible (even vanishingly small) role in Global Warming is surely an error of omission on NASA’s part.

Dr. Sharp compounds it with an error of commission. She calls man-made sources “significant.”

If I were to have an SE/TE student, I would assign them a simple Python script: plot a histogram and/or compute the densities of the red pixels, and juxtapose these with the areas of high urban population/factory density.
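A minimal sketch of what such a script might look like, assuming NumPy is available and the picture has already been loaded as an RGB array (for instance via Pillow's `Image.open(path).convert("RGB")`). The function names, the colour thresholds, and the urban-density grid are all illustrative assumptions; none of them is calibrated to NASA's actual colour palette or to any real population data.

```python
import numpy as np

def red_mask(rgb, min_red=150, margin=60):
    """Flag pixels whose red channel clearly dominates green and blue.

    The thresholds are illustrative guesses, not values taken from NASA.
    """
    rgb = np.asarray(rgb, dtype=np.int16)  # avoid uint8 wrap-around when subtracting
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (r >= min_red) & (r - g >= margin) & (r - b >= margin)

def density_grid(mask, cell=32):
    """Fraction of flagged pixels in each cell x cell block (ragged edges are cropped)."""
    h, w = mask.shape
    rows, cols = h // cell, w // cell
    blocks = mask[:rows * cell, :cols * cell].reshape(rows, cell, cols, cell)
    return blocks.mean(axis=(1, 3))

def correlation(red_density, urban_density):
    """Pearson correlation between two density grids of the same shape."""
    return float(np.corrcoef(red_density.ravel(), urban_density.ravel())[0, 1])
```

The student would then build a second grid of the same shape from urban population/factory data (a hypothetical input here) and pass both grids to `correlation`. A value near zero would suggest that the red regions do not follow the urban areas; a value near one would suggest that they do.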


This post may change in future:

BTW, I am only too well aware of the ugly political wars being waged by a lot of people in this area (of Global Warming). Since I do appreciate Dr. Sharp’s blog, I would be willing to delete all references to her writing from this post.

However, I am going to keep NASA’s description and the photo intact. It serves as a good example of how a good visualization can help in properly apprehending big data.

In case I delete references to Sharp’s blog, I will simply add another passage on my own, bringing out how man-made emissions are not the real cause for concern.

But in any case, I would refuse to be drawn into those ugly political wars surrounding the issue of Global Warming. I have neither the interest nor the bandwidth to get into it, and further, I find (though can’t off-hand quote) that several good modelers/scientists have come to offer very good, detailed, and comprehensive perspectives that justify my position (mentioned in the preceding paragraph). [Off-hand, I very vaguely remember an academic, a lady, perhaps from the state of Georgia in the US?]


The value of pictures:

One final point.

But, regardless of it all (related to Global Warming and its politics), this picture does serve to highlight a very important point: the undeniable strength of a good visualization.

Yes, I do find that, in a proper context, a picture is worth a thousand words. The obvious validity of this conclusion is not affected by Aristotle’s erroneous epistemology, in particular, his wrong assertion that man thinks in terms of “images.” No, he does not.

So, sure, a picture is not an argument, as Peikoff argued in the late 90s (without using pictures, I believe). If Peikoff’s statement is taken in its context, you would agree with it, too.

But for a great variety of useful contexts, as the one above, I do think that a picture is worth a thousand words. Without such being the case, a post like this wouldn’t have been possible.


A Song I Like:
(Hindi) “dil sajan jalataa hai…”
Singer: Asha Bhosale
Music: R. D. Burman [actually, Bertha Egnos [^]]
Lyrics: Anand Bakshi


Copying it right:

“itwofs” very helpfully informs us [^] that this song was:

“Inspired in the true sense, by the track, ‘Korbosha (Down by the river)’ from the South African stage musical, Ipi Ntombi (1974).”

However, unfortunately, he does not give the name of the original composer. It is: Bertha Egnos (apparently, a white woman from South Africa [^]).

“itwofs” further opines that:

Its the mere few initial bars that seem to have sparked Pancham create the totally awesome track [snip]. The actual tunes are completely different and as original as Pancham can get.

I disagree.

Listen to Korbosha and to this song, once again. You will surely find that it is far more than “mere few initial bars.” On the contrary, except for a minor twist here or there (and that too only in some parts of the “antaraa”/stanza), Burman’s song is almost completely lifted from Egnos’s, as far as the tune goes. And the tune is one of the most basic, and crucial, elements of a song, perhaps the most crucial one.

However, what Burman does here is to “customize” this song to “suit the Indian road conditions tastes.” This task also can be demanding; doing it right takes a very skillful and sensitive composer, and R. D. certainly shows his talents in this regard, too, here. Further, Asha not only makes it “totally, like, totally” Indian, she also adds a personal chutzpah. The combination of Egnos, RD and Asha is awesome.

If the Indian reader’s “pride” got hurt: For a reverse situation of “phoreenn” people customizing our songs, go see how well Paul Mauriat does it.

One final word: The video here is not recommended. It looks (and is!) too gaudy. So, even if you download a YouTube video, I recommend that you search for good Open Source tools and use them to extract just the audio track from this video. … If you are not well conversant with music software, then Audacity would confuse you. However, as far as just converting MP4 to MP3 is concerned, VLC works just as well; use the menu: Media \ Convert/Save. This menu command works independently of the song playing in the “main” VLC window.
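For those who prefer a script to a menu, here is a hedged sketch of the same MP4-to-MP3 conversion. Note that it swaps in a different Open Source tool, ffmpeg, in place of VLC, and assumes that the `ffmpeg` executable is installed and on the PATH. The function names and the quality setting are illustrative choices, not anything prescribed above.

```python
import subprocess
from pathlib import Path

def build_extract_command(video_path, audio_path=None):
    """Build an ffmpeg command that drops the video stream and encodes the audio as MP3."""
    video = Path(video_path)
    # Default output: same name as the video, with an .mp3 extension.
    audio = Path(audio_path) if audio_path else video.with_suffix(".mp3")
    return [
        "ffmpeg",
        "-i", str(video),           # input file
        "-vn",                      # ignore the video stream
        "-codec:a", "libmp3lame",   # encode the audio with the LAME MP3 encoder
        "-q:a", "2",                # variable-bitrate quality level (an arbitrary choice)
        str(audio),
    ]

def extract_audio(video_path, audio_path=None):
    """Run the conversion; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_extract_command(video_path, audio_path), check=True)
```

Calling `extract_audio("song.mp4")` would write `song.mp3` next to the video; the file name is only a placeholder.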


Bye for now… Some editing could be done later on.

An Idea on Visualization of Cultural Contexts

I have had an idea on visualization of cultural contexts, for quite some time now (for years, actually).

The idea is something like this: Using software, plot geographical maps, say, of nations, and show their evolution in space and time as a dynamically evolving series of pictures (perhaps suitably faded in/out). … In short, animations or movies depicting nations… The idea can be extended to many cultural contexts as well…

For a simple implementation, think of a world map in a 2D window and a slider for the time variable. The slider can be moved up or down manually, or it could progress on its own once put in the auto-mode. Maps of all kingdoms or nations existing at a given time appear on the world map. (With software, you can easily zoom in/out and handle a lot of data simultaneously.) As the time-slider moves, geographical extents of the nations also move, creating a movie of sorts in the process. The movie visually shows how the kingdoms or states originated, expanded, contracted, or got absorbed in other kingdoms, empires, etc.

In short, something like the evolution of soap bubbles (dynamically collapsing, growing, changing shapes, etc.), but on a geographic and historic scale.
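The core of the time-slider idea can be sketched as a small data structure: each kingdom or nation carries a list of dated boundary snapshots, and moving the slider amounts to asking every one of them for its boundary in the chosen year. The sketch below is only an illustration under assumptions: the class names, the polygon representation, and the `end_year` convention are all hypothetical, and the actual drawing of the map is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    year: int        # year from which this boundary is valid
    polygon: tuple   # ((lon, lat), ...) outline; placeholder geometry

@dataclass
class Polity:
    name: str
    snapshots: list  # Snapshot objects, sorted by year
    end_year: int    # first year in which the polity no longer exists

    def boundary_at(self, year):
        """Most recent boundary valid in `year`, or None if the polity does not exist then."""
        if not self.snapshots or year < self.snapshots[0].year or year >= self.end_year:
            return None
        current = self.snapshots[0]
        for snap in self.snapshots:
            if snap.year <= year:
                current = snap
        return current.polygon

def frame(polities, year):
    """Everything the map should draw when the slider sits at `year`."""
    boundaries = {p.name: p.boundary_at(year) for p in polities}
    return {name: poly for name, poly in boundaries.items() if poly is not None}
```

Stepping `year` through a range and rendering each `frame` in turn would give the movie; in the auto-mode the same loop would simply run on a timer.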

In case the description appears too impersonal, then think of providing links from cities or locations on the maps to important historical personalities—their pictures, movies, speeches, ideas, etc.

What I said above for political maps can also be implemented for maps of regions of influence of cultures and ideas. … How different cultures spread from their points of origin outwards… For instance, think of the Greek culture, the Roman culture, the Indian culture, etc.

You can also have an evolving history of the spreading of languages. Also, of technical know-how, gadgets and devices in daily use, mathematical ideas, calendar systems, art, literature, games and sports, coins, units of measure in daily use, myths, religions, philosophic ideas, political systems… Almost anything big and small characterizing people, societies and cultures. … Why, even manners of dress and of culinary practices!!

Movements of people (individuals or groups) could also be superimposed on these evolving maps. (Good print illustrations, say in coffee-table books, do show such things, say using thick arrows. The point here is, all such things can now be dynamic.)

You can have a globe-based (i.e. 3D) visualization for the geographical base too. (Thereby removing all argumentation related to distortions in maps etc.)

The whole thing could be provided as a Web-based service. If so, it could link good informative Web sites together.

The evolution of political landscapes (just as an example) could be provided as standalone animated GIFs (or Flash movies, etc.) too.

—–

Another feature: On such a map, it should be possible to choose a particular location (say, an ancient city like Rome, or Varanasi, or Cairo), and, upon right-clicking on that location, the software will pop up a window giving a detailed skyscraper-like visualization of the “layers of history” of that particular place, rather like geological layers in appearance, each layer being depicted using a different color. Each layer would correspond to a certain time period at that geographical (spatial) point.

This will help one realize the “depth of culture” of the place, so to speak. (That is, depth, as measured by time alone, which is not a very good way to measure the “depth” of an idea… Things related to conceptual linkages like extent of abstraction, scope of referents subsumed… these are the real issues in gauging the depth of an idea..)
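As a rough sketch, the “layers of history” stack for one location could be computed as follows, with the oldest period at the bottom and each layer's thickness proportional to its duration. The function name, the `(label, start_year, end_year)` input format, and the `total_height` parameter are assumptions made for illustration. In keeping with the caveat above, this measures depth by time alone.

```python
def history_layers(periods, total_height=100.0):
    """Turn (label, start_year, end_year) tuples into stacked layers, oldest at the bottom."""
    ordered = sorted(periods, key=lambda p: p[1])          # sort by start year
    span = sum(end - start for _, start, end in ordered)   # total years covered
    if span <= 0:
        raise ValueError("periods must cover a positive number of years")
    layers = []
    bottom = 0.0
    for label, start, end in ordered:
        height = total_height * (end - start) / span       # thickness proportional to duration
        layers.append({"label": label, "bottom": bottom, "height": height})
        bottom += height
    return layers
```

Each returned layer could then be drawn as a colored band from `bottom` to `bottom + height`. Weighting a layer by its influence rather than by its duration would need some additional measure, which this sketch does not attempt.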

Not all ideas are equal and not all cultural influences are equally strong. Some exert a stronger influence than others. Some periods are brief but leave a far more lasting legacy than others. (Just think of the century-odd period of Socrates-Plato-Aristotle.) Some influences persist over small areas but for a very long period. (Consider the worship traditionally done in a North Indian village, of Ravana, not Rama.)

All such things could be brought out by directly depicting not just the layers of history but also the processes of diffusion (of languages, religions, arts, food, cultural influences, ideas, etc.) over time and space.

The software will prove to be far more instructive than mere timelines.

—–

I kept the idea to myself because I thought someone or the other surely must have thought of it. (Esp. after I saw the timeline and other multimedia features in Encyclopaedia Britannica, in its 2000 CD edition.) However, apparently no one has.

I don’t subscribe to the altruist-collectivist philosophies that often ride on movements like “open source,” “wiki-this” and “wiki-that,” “anti-patents-movements,” etc. I think ideas like these (altruist-collectivist) are destructive of all culture.

However, I also am not always interested in taking out a patent on every patentable idea that occurs to me. That’s how I am sharing this.

—–

Written (in an email) on July 23, 2008. Revised and published here on August 10, 2008. [And yes, I sure will post my thoughts on water availability in India real soon, as promised a couple of posts earlier or so…]