Summary
I really thought I was onto something big: add a couple of simple domain rules to the loss function and watch fraud detection performance skyrocket on heavily imbalanced data. The first run looked amazing… until I fixed a sneaky threshold bug and ran the whole thing across five different random seeds. Suddenly the “huge win” mostly evaporated. What I ended up with instead was honestly more useful: a reminder that on rare-event problems like fraud, the way we measure success (thresholds, seeds, me...