Data Strategie

Data pipeline blew up at 2am and I have no clue where it started. How do you actually monitor this shit?

Reddit r/dataengineering

Summary

Got paged because the revenue dashboard showed garbage numbers. Turns out some upstream source stopped sending fresh data, but by the time my dbt models failed the whole chain was toast. Spent 3 hours SSHing into everything, guessing which table was bad. No lineage, no alerts on sources, just logs everywhere. Wish I'd locked down source monitors like that platform team did with base images, backlog woulda dropped. But for pipelines, how do people catch ingestion crap before it hits transforms, cent...
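
For anyone skimming, here is a minimal sketch of the kind of source freshness check the question is about: compare each source's latest load time against an allowed age and flag anything stale before transforms run. The source names, thresholds, and the `stale_sources` helper are all hypothetical, and in practice the timestamps would come from a query like `max(loaded_at)` per source table.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-source freshness limits; names and thresholds are made up.
MAX_AGE = {
    "orders_raw": timedelta(hours=1),
    "payments_raw": timedelta(hours=6),
}


def stale_sources(last_loaded, now=None, max_age=MAX_AGE):
    """Return (source, age) pairs for sources older than their allowed age.

    last_loaded maps a source name to the timezone-aware timestamp of its
    newest row. A source with no timestamp at all is reported with age None.
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for source, limit in max_age.items():
        loaded_at = last_loaded.get(source)
        if loaded_at is None:
            # Never loaded (or lookup failed): treat as stale.
            stale.append((source, None))
        elif now - loaded_at > limit:
            stale.append((source, now - loaded_at))
    return stale


if __name__ == "__main__":
    # Illustrative timestamps only; not real pipeline data.
    now = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
    last_loaded = {
        "orders_raw": now - timedelta(hours=5),
        "payments_raw": now - timedelta(hours=2),
    }
    for source, age in stale_sources(last_loaded, now=now):
        print(f"STALE: {source} (age: {age})")
```

Running something like this on a schedule ahead of the transform job, and alerting on a non-empty result, would point at the stale source directly instead of leaving it to surface as a downstream model failure. dbt's built-in source freshness checks cover similar ground if the sources are already declared in the project.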
