Most ML pipelines leak. Most teams don't know.
Published papers retracted or undermined by data leakage. Across 17 scientific fields. Not typos. Structural errors that existing tools made easy to commit.
Kapoor, S., & Narayanan, A. (2023). Leakage and the reproducibility crisis in machine-learning-based science. Patterns.
Models trained on leaked data report inflated metrics. Teams ship products based on numbers that don't replicate. Decisions get made on evidence that was never real. The tools could have warned you. They didn't. That was the problem.
Paper
A Grammar of Machine Learning Workflows
Seven functions decompose supervised learning into a typed workflow that rejects data leakage at call time. Three pre-registered predictions, two confirmed, one falsified — the falsification is published too.
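To make "rejects data leakage at call time" concrete, here is a minimal sketch of the idea in Python. The names (`split`, `fit`, `evaluate`, `Train`, `Test`) are illustrative assumptions, not the paper's actual seven functions: each data partition carries its role in its type, so feeding held-out data into training fails immediately rather than silently inflating metrics.

```python
# Hypothetical sketch of a typed workflow; names are illustrative,
# not the grammar defined in the paper.
from dataclasses import dataclass

@dataclass(frozen=True)
class Train:
    rows: tuple  # (x, y) pairs designated for fitting

@dataclass(frozen=True)
class Test:
    rows: tuple  # (x, y) pairs designated for evaluation only

def split(rows, n_train):
    """Partition once; each side carries its role in its type."""
    return Train(tuple(rows[:n_train])), Test(tuple(rows[n_train:]))

def fit(data):
    """Accepts only Train — passing Test data is a call-time error."""
    if not isinstance(data, Train):
        raise TypeError("fit() accepts Train data only: leakage rejected")
    mean = sum(y for _, y in data.rows) / len(data.rows)
    return lambda x: mean  # trivial mean-predictor stands in for a model

def evaluate(model, data):
    """Accepts only Test, keeping held-out data out of fitting."""
    if not isinstance(data, Test):
        raise TypeError("evaluate() accepts Test data only")
    return sum(abs(model(x) - y) for x, y in data.rows) / len(data.rows)

rows = [(0, 1.0), (1, 2.0), (2, 3.0), (3, 4.0)]
train, test = split(rows, n_train=2)
model = fit(train)
mae = evaluate(model, test)
# fit(test) raises TypeError instead of silently leaking
```

The point of the design is that the leak is a type error, not a runtime surprise discovered after deployment: the workflow cannot even be expressed with the test partition on the training side.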
Read the paper
Foundations
Biased Machines in the Realm of Politics
Doctoral dissertation. ML methodology applied to political science, where the leakage problem was first observed in practice.
Roth, S. (2022). KOPS Archive
What's next
Decision science
Once you have an honest prediction, how do you make an honest decision? When does a model output justify action?
Most deployed ML systems stop at prediction. Arguably the hard part is the step after: thresholds, costs, fairness constraints, human override. I'm working on it.
Open Questions
What I don't know yet
Honest research starts with what's unknown. These are the questions driving my current work.
What stops someone from calling evaluate() fifty times, picking the best number, and then calling assess()? The structure prevents accidental leakage. Can it prevent intentional gaming?
How I work
EPAGOGY METHOD
επαγωγή — Aristotle's term for reasoning from particular observations to general principles. Prior Analytics, II.23.