Summary
Tired of always using the Titanic or house price prediction datasets to demo your use cases? I've just released a Python package that generates realistic, messy data. The data can include missing values, duplicate records, anomalies, invalid categories, and more. You can even set up a cron job to generate fresh data every day and mimic a real data pipeline. It also ships with a Claude SKILL so your agents know how to work with the library a...
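
To make those kinds of messiness concrete, here is a minimal sketch using only the Python standard library. It does not use the package's API, which is not shown in this summary. The function name `make_messy`, the rate parameters, the `"???"` placeholder, and the `amount`/`category` fields are all hypothetical, chosen only for illustration.

```python
import random


def make_messy(rows, seed=0, missing_rate=0.1, duplicate_rate=0.05,
               anomaly_rate=0.02, invalid_rate=0.05):
    """Return a messy copy of `rows`, a list of dicts with 'amount' and 'category'.

    Hypothetical helper for illustration; not the released package's API.
    """
    rng = random.Random(seed)  # seeded so the same input gives the same output
    messy = []
    for row in rows:
        row = dict(row)  # copy so the clean input is left untouched
        if rng.random() < missing_rate:
            row["amount"] = None                  # missing value
        elif rng.random() < anomaly_rate:
            row["amount"] = row["amount"] * 1000  # anomaly: implausibly large value
        if rng.random() < invalid_rate:
            row["category"] = "???"               # invalid category
        messy.append(row)
        if rng.random() < duplicate_rate:
            messy.append(dict(row))               # duplicate record
    return messy


# A small clean dataset to corrupt.
categories = ["food", "rent", "travel"]
clean = [
    {"id": i, "amount": 10.0 + i, "category": categories[i % 3]}
    for i in range(1000)
]

messy = make_messy(clean, seed=42)
print(len(clean), "clean rows ->", len(messy), "messy rows")
```

Because the generator is seeded, a script like this could be run on a schedule with a different seed each day (for example, one derived from the date) to get a new but reproducible batch.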