Orchestration with Dagster. End-to-End Data Scientists. Snowflake's S-1. Experimentation @ Stitch Fix. [DSR #233]

By Tristan Handy • Issue #233 • View online
❤️ Want to support this project? Forward this email to three friends!
🚀 Forwarded this from a friend? Sign up to the Data Science Roundup here.

This week's best data science articles
Eugene Yan
Unpopular view: Data scientists should be more end-to-end.

While this is frowned upon (too generalist!), I've seen it lead to more context, faster iteration, greater innovation—more value, faster.

More details and Stitch Fix & Netflix's experience 👇

https://t.co/aOBjuBSsSz
The above-linked post is fantastic. Here’s the author’s intro, strongly recommended if this resonates with you:
Recently, I came across a Reddit thread on the different roles in data science and machine learning, such as data scientist, decision scientist, product data scientist, data engineer, machine learning engineer, machine learning tooling engineer, AI architect, etc.
I found this worrying. It’s difficult to be effective when the data science process (problem framing, data engineering, ML, deployment/maintenance) is split across different people. It leads to coordination overhead, diffusion of responsibility, and lack of a big picture view.
IMHO, I believe data scientists can be more effective by being end-to-end. Here, I’ll discuss the benefits and counter-arguments, how to become end-to-end, and the experiences of Stitch Fix and Netflix.
Highly valued software startup Snowflake files for IPO
New Tools I'm Watching
I’m always monitoring the data tooling landscape, and I figured I’d start sharing some of the more interesting products I come across.
Noria: “Noria is a new streaming data-flow system designed to act as a fast storage backend for read-heavy web applications(…). It acts like a database, but precomputes and caches relational query results so that reads are blazingly fast.”
Soda: Another product in the quickly-developing data quality space. I don’t have a ton of insight into Soda specifically, but the space as a whole is very interesting to me.
Thanks to our sponsors!
dbt: Your Entire Analytics Engineering Workflow

The internet's most useful data science articles. Curated with ❤️ by Tristan Handy.
