Data Engineering is Not Software Engineering

Cargo-culting DevOps and Agile principles over to data pipeline development, in which incremental change and frequent deployment are encouraged, simply ignores the inertia of data.

A Pipeline Is Either Completed or Worthless

1. A partial dataset does not have proportional utility

2. Time to develop a pipeline is not correlated with dataset size

3. Time and economic cost to build a dataset correlates with its size

>>Food for thought<<

1. inertia of data

2. Engineers would like to “do it right the first time” and minimize the number of deployments to production. 

https://betterprogramming.pub/data-engineering-is-not-software-engineering-af81eb8d3949

https://www.jimmywu.me/why-data-engineering-is-not-software-engineering/?fbclid=IwAR2bFdVJN0fgpO_dzALy9HNfU-GRR83IvwjBjjornhGhtfMY9U6lZ9eIhRM