Data Strategy

Data pipeline fails at 2 a.m., how do you monitor this?

Reddit r/dataengineering

Summary

Got paged because the revenue dashboard was showing garbage numbers. Turns out an upstream source had stopped sending fresh data, but by the time my dbt models failed the whole chain was toast. Spent 3 hours SSHing into everything, guessing which table was bad. No lineage, no alerts on sources, just logs everywhere. Wish I'd locked down source monitors the way that platform team did with base images; the backlog would have dropped. But for pipelines, how do people catch ingestion crap before it hits transforms, cent...
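
For what it's worth, the "alerts on sources" part of the question can be sketched as a freshness check that runs before any transform. This is only a minimal illustration, not how any particular team does it: the function name `find_stale_sources`, the table names `raw.orders` and `raw.customers`, and the one-hour thresholds are all made up for the example, and the timestamps would in practice come from something like a `MAX(loaded_at)` query per source table.

```python
from datetime import datetime, timedelta, timezone


def find_stale_sources(last_loaded, max_age, now=None):
    """Return {source: age} for every source whose newest load is older than allowed.

    last_loaded: {source name: timezone-aware datetime of the newest row} (hypothetical input)
    max_age:     {source name: timedelta}; sources without an entry default to 24 hours
    """
    now = now or datetime.now(timezone.utc)
    stale = {}
    for source, loaded_at in last_loaded.items():
        age = now - loaded_at
        if age > max_age.get(source, timedelta(hours=24)):
            stale[source] = age
    return stale


# Hypothetical snapshot taken at 02:00 UTC: orders stopped arriving 7 hours ago.
now = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
last_loaded = {
    "raw.orders": now - timedelta(hours=7),
    "raw.customers": now - timedelta(minutes=20),
}
max_age = {
    "raw.orders": timedelta(hours=1),
    "raw.customers": timedelta(hours=1),
}

for source, age in find_stale_sources(last_loaded, max_age, now=now).items():
    print(f"STALE: {source} last loaded {age} ago")
```

Running a check like this as the first step of the job, and alerting or aborting when it returns anything, would point at the bad table before the dbt models run. dbt also has a built-in `dbt source freshness` command that covers the same idea for sources declared in the project.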
