Robert Kosara writes a brilliant post every year on the advances in data visualization over the prior year. This year’s post focuses on the visualization of uncertainty (a hot topic with the US elections), sketching & personal data, and trends in storytelling.
I’m amazed at how quickly the field of data visualization is moving. This gets surprisingly little attention amidst the constant barrage of ML milestones.
I’ve been working with data since I was in high school, but I had never been excited about that fact until 2006. In the early days of video podcasts, I was a subscriber to TED Talks and watched Hans Rosling give his famous talk. It was like nothing I’d ever seen. From the article:
That TED video has been watched over 11 million times. Eleven. million. times. A video of a geeky old professor talking about public health numbers!
I remember being blown away—Rosling had an amazing talent for both building brilliant visualizations and telling engaging stories with them. If you’ve never seen the video, take a moment to watch it. The article is an excellent summary of Rosling’s impact.
Flipboard, a popular news reading application, just released a “related stories” feature. From the article:
Although there are many sophisticated automatic clustering algorithms, story clustering is a non-trivial problem. Because each text document can contain any word from our vocabulary, most text document representations are extremely high-dimensional. In high-dimensional spaces, even basic clustering or similarity measures fail or are very slow.
This post on their engineering blog goes deep into the details of their implementation. Extremely useful.
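To make the high-dimensionality point in the quote concrete, here is a toy sketch. It is not Flipboard's implementation: the headlines are made up, and the `bag_of_words` and `cosine_similarity` helpers are hypothetical names chosen for illustration. Each document becomes a sparse word-count vector with one dimension per vocabulary word, so even three short headlines already span more than a dozen dimensions.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each word."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    # Only words present in both documents contribute to the dot product.
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up headlines, purely for illustration.
docs = [
    "senate passes budget bill after long debate",
    "budget bill passes senate vote",
    "new telescope captures distant galaxy",
]

vectors = [bag_of_words(d) for d in docs]

# Every distinct word is its own dimension in this representation.
vocabulary = set().union(*vectors)

print("dimensions:", len(vocabulary))
print("doc 0 vs doc 1:", cosine_similarity(vectors[0], vectors[1]))
print("doc 0 vs doc 2:", cosine_similarity(vectors[0], vectors[2]))
```

The two budget headlines score well above the unrelated one, but the number of dimensions grows with every new word. With a real news vocabulary, that growth is what makes naive pairwise comparison slow and pushes systems toward the kinds of techniques the post describes.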
Are you coming from a graduate school background and looking to get into data science? This article was written for you. Many of the skill sets used in PhD programs are incredibly relevant in data science jobs, but adjusting to a different context can be challenging.
This piece by Andrew Gelman starts off by roasting a researcher who seems to commit just about every possible statistical and scientific sin. But it gets better:
I continue writing about this story because of the insight it gives into the inner workings of the famous self-correcting nature of science. The process of self correction is much more involved than people seem to realize. Sometimes people demand retractions, but as I’ve written before, I don’t see retraction as a serious solution for reform of poor research and publication practices, or as a way of cleaning the public record. The numbers just don’t add up: there are just too many hopelessly flawed papers, and retraction is done so rarely.
I am deeply interested in the social process of determining what the truth is. In both society and science today, we seem to have fundamental trouble agreeing on exactly how this should work.