επαγωγή · Independent Research · Stuttgart
A grammar of
machine learning.
The tools could have warned you. They didn't.
This one does. Implemented in Rust, shipping in Python, R & Julia.
pip install ml
import ml
data = ml.datasets.titanic()
s = ml.split(data, "survived", seed=42)
model = ml.fit(s.train, "survived", seed=42)
ml.evaluate(model, s.valid) # → AUC: 0.847
ml.assess(model, test=s.test)  # → AUC: 0.831 (once)

remotes::install_github("epagogy/ml")
library(ml)
data <- ml_datasets_titanic()
s <- ml_split(data, "survived", seed = 42)
model <- ml_fit(s$train, "survived", seed = 42)
ml_evaluate(model, s$valid) # → AUC: 0.847
ml_assess(model, test = s$test)  # → AUC: 0.831 (once)

] add https://github.com/epagogy/ml
using ML
data = ML.datasets.titanic()
s = ML.split(data, "survived", seed=42)
model = ML.fit(s.train, "survived", seed=42)
ML.evaluate(model, s.valid) # → AUC: 0.847
ML.assess(model, test=s.test)  # → AUC: 0.831 (once)

One workflow. Seven functions.
Each function does one thing. Their order is enforced.
Use them wrong, and the API tells you.
split()
Train / validation / test. Temporal-aware.
prepare()
Fit on train, apply to all. Per fold.
fit()
Train a model. Any algorithm.
predict()
Generate outputs from a fitted model.
evaluate()
Measure on validation. Repeatable.
assess()
Measure on test. Once. The honest number.
explain()
Feature importance, partial dependence.
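The rule behind prepare() is worth spelling out: preprocessing statistics are learned from the training split alone and then reused, unchanged, on every other split. A minimal sketch of that pattern in plain Python (illustrative only; `fit_scaler` and `apply_scaler` are hypothetical names, not the library's API):

```python
# Illustrative sketch of the leakage-safe preprocessing pattern:
# learn parameters on train, apply them verbatim everywhere else.
from statistics import mean, stdev

def fit_scaler(train_column):
    """Learn scaling parameters from the training data only."""
    return mean(train_column), stdev(train_column)

def apply_scaler(column, mu, sigma):
    """Apply the training-derived parameters to any split."""
    return [(x - mu) / sigma for x in column]

train = [10.0, 12.0, 14.0, 16.0]
valid = [11.0, 15.0]

mu, sigma = fit_scaler(train)             # fitted on train only
train_z = apply_scaler(train, mu, sigma)
valid_z = apply_scaler(valid, mu, sigma)  # no refitting on validation
```

The validation split never influences `mu` or `sigma`; that is the whole guarantee.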
Four guarantees
Test data stays locked until you explicitly commit to it.
Preprocessing fits on train, never on validation or test.
Final evaluation runs once. No cherry-picking.
Wrong inputs are caught immediately.
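Guarantees like these can be enforced mechanically rather than by discipline. A minimal sketch of the idea behind the first and third (hypothetical code, not the library's implementation): a wrapper that keeps the test split locked until an explicit commit, then allows exactly one assessment.

```python
# Illustrative sketch: a holdout wrapper that enforces "locked until
# committed" and "assessed exactly once".
class LockedHoldout:
    def __init__(self, test_data):
        self._test = test_data
        self._committed = False
        self._assessed = False

    def commit(self):
        """Explicitly opt in to touching the test split."""
        self._committed = True

    def assess(self, metric):
        if not self._committed:
            raise RuntimeError("test split is locked: call commit() first")
        if self._assessed:
            raise RuntimeError("final evaluation already ran once")
        self._assessed = True
        return metric(self._test)

holdout = LockedHoldout([1, 0, 1, 1])

try:
    holdout.assess(sum)          # locked: raises
except RuntimeError:
    pass

holdout.commit()
score = holdout.assess(sum)      # the one honest number
try:
    holdout.assess(sum)          # second call: raises
except RuntimeError:
    pass
```

Misuse fails loudly at the call site instead of silently inflating the reported score.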
When to stop using ml: when your framework of choice enforces all four constraints natively.
From research, not from marketing.
Built on a decade of independent research into data leakage, causal inference, and ML methodology. Published, falsifiable, and open to critique.
Releases only.
Major versions and research updates. No noise.