επαγωγή · Independent Research · Stuttgart

A grammar of
machine learning.

The tools could have warned you. They didn't.

This one does. Implemented in Rust, shipping in Python, R & Julia.

Python
pip install mlw

import ml
data = ml.datasets.titanic()
s = ml.split(data, "survived", seed=42)
model = ml.fit(s.train, "survived", seed=42)
ml.evaluate(model, s.valid)    # → AUC: 0.847
ml.assess(model, test=s.test)  # → AUC: 0.831  (once)
R
remotes::install_github("epagogy/ml")

library(ml)
data <- ml_datasets_titanic()
s <- ml_split(data, "survived", seed = 42)
model <- ml_fit(s$train, "survived", seed = 42)
ml_evaluate(model, s$valid)    # → AUC: 0.847
ml_assess(model, test = s$test)  # → AUC: 0.831  (once)
Julia
] add https://github.com/epagogy/ml

using ML
data = ML.datasets.titanic()
s = ML.split(data, "survived", seed=42)
model = ML.fit(s.train, "survived", seed=42)
ML.evaluate(model, s.valid)    # → AUC: 0.847
ML.assess(model, test=s.test)  # → AUC: 0.831  (once)

One workflow. Seven functions.

Each function does one thing. Their order is enforced.
Use them wrong, and the API tells you.
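How the library enforces ordering internally isn't shown here; as an illustration only, the idea can be sketched as a small state machine in plain Python, where each step names its prerequisite and a call out of order raises immediately (the Workflow class, step names, and error message below are hypothetical, not ml's actual implementation):

```python
# Hypothetical sketch of order enforcement -- NOT the ml library's code,
# just one simple mechanism an API can use to refuse out-of-order calls.
class Workflow:
    # Each step may only run after its listed prerequisite has completed.
    ORDER = {"split": None, "fit": "split", "evaluate": "fit", "assess": "evaluate"}

    def __init__(self):
        self.done = set()

    def run(self, step):
        prereq = self.ORDER[step]
        if prereq is not None and prereq not in self.done:
            # The API "tells you": fail fast, at the call site.
            raise RuntimeError(f"{step}() called before {prereq}()")
        self.done.add(step)
        return f"{step} ok"

wf = Workflow()
wf.run("split")
wf.run("fit")
wf.run("evaluate")   # allowed: split -> fit -> evaluate
```

Calling `wf.run("evaluate")` on a fresh `Workflow` before `fit` raises a `RuntimeError` rather than silently returning a number.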

split()
Train / validation / test. Time-aware.
prepare()
Fit on train, apply to all. Per fold.
fit()
Train a model. Any algorithm.
predict()
Generate outputs from a fitted model.
evaluate()
Measure on validation. Repeatable.
assess()
Measure on test. Once. The honest number.
explain()
Feature importance, partial dependence.
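The discipline behind prepare() — fit preprocessing on the training split alone, then apply those same parameters everywhere — is the step most hand-rolled pipelines get wrong. A minimal standardization sketch of that rule in plain Python (illustrative only, not the library's code):

```python
# Illustrative only: the "fit on train, apply to all" rule behind prepare().
# Scaling statistics come from the training split alone, then are reused
# verbatim on validation -- never refit on held-out data.
from statistics import mean, stdev

def fit_scaler(train):
    # Learn parameters from training data only.
    return mean(train), stdev(train)

def apply_scaler(params, values):
    mu, sigma = params
    return [(v - mu) / sigma for v in values]

train = [1.0, 2.0, 3.0, 4.0]
valid = [2.5, 3.5]

params = fit_scaler(train)             # fitted on train...
train_z = apply_scaler(params, train)
valid_z = apply_scaler(params, valid)  # ...applied, not refit, on valid
```

Refitting the scaler on `valid` would leak validation statistics into the model's inputs; reusing `params` keeps the held-out data held out.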

Four guarantees

Test data stays locked until you explicitly commit to it.
Preprocessing fits on train, never on validation or test.
Final evaluation runs once. No cherry-picking.
Wrong inputs are caught immediately.
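The first and third guarantees share one mechanism: the test set is consumed by the act of measuring it. A hypothetical sketch of such a one-shot lock in plain Python (the TestSet class and its error message are assumptions for illustration, not ml's source):

```python
# Hypothetical sketch of the "final evaluation runs once" guarantee --
# not the ml library's code, just the simplest mechanism enforcing it.
class TestSet:
    def __init__(self, data):
        self._data = data
        self._spent = False

    def assess(self, metric):
        if self._spent:
            raise RuntimeError("assess() already ran; the test set is spent")
        self._spent = True          # lock before reporting the honest number
        return metric(self._data)

holdout = TestSet([0.8, 0.9, 0.7])
score = holdout.assess(lambda d: sum(d) / len(d))   # first call: allowed
```

A second `holdout.assess(...)` raises instead of letting you re-measure until the number looks good.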

When to stop using ml: when your framework of choice enforces all four constraints natively.

From research, not from marketing.

Built on a decade of independent research into data leakage, causal inference, and ML methodology. Published, falsifiable, and open to critique.

Releases only.

Major versions and research updates. No noise.

No spam. Unsubscribe any time.