Data Science Roundup #54: DIY Hedge Fund! Plus, estimating delivery times, learning loops, and more!

By The Data Science Roundup • Issue #54
Two awesome walkthroughs, a data science leader’s thoughts on productionizing data science within his org, learning loops, GPUs & autonomous vehicles, and 26 research papers on deep learning applications. Enjoy 😂 😂 😂

This week's best data science articles
Ever wanted to get your hands dirty with stock market data? This post, and its followup, go deep into a tutorial on using Python to analyze stock market data, including grabbing data from Yahoo! Finance using pandas, visualization, moving averages, developing a moving-average crossover strategy, backtesting, and benchmarking. Time to build a model and enter it into Numerai.
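If the moving-average crossover idea is new to you, here is a minimal sketch of the signal logic. This is not the tutorial's code: the post works in pandas with Yahoo! Finance data, while this version uses plain Python so it runs anywhere. The prices, the window sizes, and the function names `moving_average` and `crossover_signals` are all made up for illustration.

```python
def moving_average(prices, window):
    """Trailing simple moving average; None until `window` prices exist."""
    averages = []
    for i in range(len(prices)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(prices[i + 1 - window:i + 1]) / window)
    return averages


def crossover_signals(prices, fast=2, slow=3):
    """Return (index, 'buy' | 'sell') pairs where the fast average crosses the slow one."""
    fast_ma = moving_average(prices, fast)
    slow_ma = moving_average(prices, slow)
    signals = []
    prev_above = None  # whether the fast average was above the slow one on the previous day
    for i in range(len(prices)):
        if fast_ma[i] is None or slow_ma[i] is None:
            continue
        above = fast_ma[i] > slow_ma[i]
        if prev_above is not None and above != prev_above:
            signals.append((i, "buy" if above else "sell"))
        prev_above = above
    return signals


# Hypothetical closing prices, purely illustrative (not real market data).
prices = [10, 11, 12, 11, 9, 8, 9, 11, 13]
signals = crossover_signals(prices)
print(signals)
```

A real backtest would also need position sizing, transaction costs, and a benchmark, which is roughly where the tutorial goes next.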
The team at Postmates recently built a model to predict delivery times. This article walks through the whole process, from problem definition to the impressive result. Perhaps the most interesting aspect of this process: they settled on using a linear regression model rather than a more involved approach. Sometimes simple is better.
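To make "simple" concrete, here is a rough sketch of a one-feature linear regression fit by ordinary least squares. It is not the Postmates model: the single feature (trip distance), the numbers, and the names `fit_line` and `predict_minutes` are hypothetical, chosen only to show how little machinery the approach needs.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_minutes(distance_km, intercept, slope):
    """Predicted delivery time for a given trip distance."""
    return intercept + slope * distance_km


# Hypothetical training data: trip distance (km) and delivery time (minutes).
distances = [1, 2, 3, 4]
minutes = [15, 20, 25, 30]

intercept, slope = fit_line(distances, minutes)
estimate = predict_minutes(5, intercept, slope)
print(intercept, slope, estimate)
```

A production model would use more features and held-out validation, but the fitting step itself stays about this small.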
Zalando, one of Europe’s biggest fashion retailers, has spent a lot of time thinking about the way its data science team works with its engineering team. Their recommended approach: data scientists build microservices for their models that engineers then hit. Also, Zalando now calls their data scientists “research engineers”. I LOVE this.
Production data science @ Zalando
In a “learning loop”, everyone in the network programmatically benefits from the experience of everyone else in the network. Waze is a learning loop, producing better recommendations with each participant in the network. Building learning loops into products is at the core of product management in a machine learning world, and this post is the primer you need to read.
Hardware is taking center stage in deep learning applications, as evidenced by Nvidia’s recent foray into autonomous driving. In my opinion, the most impressive aspect of this feat: the CNN was trained with an extremely small dataset (20 example trips). In related Nvidia news: the company just announced its screaming fast system-on-a-chip built specifically for autonomous driving. Wow.
This post shares 26 different research papers, all published within the past few months, and all of which demonstrate a novel and compelling application of deep learning. Examples:
  • Advanced Melanoma Screening and Detection
  • Neural Networks for Brain Cancer Detection
  • Machine Learning for Ultrasound Images, Pre-Natal Care
…and 23 more. This is a great overview post if you want to wrap your brain around what a massive impact deep learning is going to have in the coming years.
Data viz of the week
This map is extremely interactive--click through for more.
Thanks to our sponsors!
Fishtown Analytics is a boutique analytics consultancy serving high-growth, venture-funded startups. Have analytics questions? Let’s chat.
Developers shouldn’t have to write ETL scripts. Consolidate your data in minutes. No API maintenance, scripting, cron jobs, or JSON wrangling required.
The Data Science Roundup
The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.