Inference Contract¶
Version: 1.0 · Status: Stable · Last Updated: 2026-03-28
This document defines the canonical inference pipeline order for taskclf and maps each stage to its implementation across the three runtime paths.
1. Canonical Pipeline Order¶
Every inference path must execute these stages in this order:
features → encode categoricals → impute missing → predict probabilities
→ calibrate → reject → smooth/aggregate → taxonomy map → UI label
| # | Stage | Input | Output |
|---|---|---|---|
| 1 | Build features | Raw events | FeatureRow / feature DataFrame |
| 2 | Encode categoricals | String columns | Integer-encoded columns |
| 3 | Impute missing values | Encoded matrix (with gaps) | Dense numeric matrix |
| 4 | Predict probabilities | Dense matrix | (N, n_classes) probability matrix |
| 5 | Calibrate | Raw probabilities | Calibrated probabilities |
| 6 | Reject | Calibrated confidence | Rejection flags + Mixed/Unknown labels |
| 7 | Smooth / aggregate | Per-bucket labels | Rolling-majority-smoothed labels |
| 8 | Taxonomy map | Core labels + calibrated probs | Mapped labels + mapped probs |
| 9 | UI label | Mapped or smoothed label | Final user-facing string |
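Stages 4–6 of the table above can be illustrated with a minimal rejection sketch. The function name, class names, and threshold below are hypothetical, not the taskclf API:

```python
import numpy as np

# A minimal sketch of stages 4-6 (predict -> calibrate -> reject) on a
# (N, n_classes) probability matrix. All names here are illustrative.
def reject_low_confidence(proba, class_names, reject_threshold=0.6):
    confidence = proba.max(axis=1)
    winners = [class_names[i] for i in proba.argmax(axis=1)]
    labels = [
        name if conf >= reject_threshold else "Mixed/Unknown"
        for name, conf in zip(winners, confidence)
    ]
    return labels, confidence

proba = np.array([[0.8, 0.1, 0.1],    # confident: argmax label kept
                  [0.4, 0.35, 0.25]])  # below threshold: rejected
labels, confidence = reject_low_confidence(proba, ["coding", "meeting", "web"])
# labels → ["coding", "Mixed/Unknown"]
```

Calibration is shown here as the identity for brevity; in the real pipeline it sits between the raw probabilities and the threshold check.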
2. Runtime Paths¶
2.1 Batch¶
Entry point: run_batch_inference() in src/taskclf/infer/batch.py.
| Stage | Implementation |
|---|---|
| Build features | Caller supplies a pre-built features_df. |
| Encode categoricals | predict_proba() → encode_categoricals() from train/lgbm.py. |
| Impute missing | predict_proba() → fillna(0). |
| Predict probabilities | predict_proba() → model.predict(x). |
| Calibrate | CalibratorStore.calibrate_batch() (per-user) or single Calibrator.calibrate(). |
| Reject | max(proba) < reject_threshold → label becomes Mixed/Unknown. |
| Smooth / aggregate | rolling_majority() from infer/smooth.py, then segmentize() + merge_short_segments(). |
| Taxonomy map | TaxonomyResolver.resolve_batch() from infer/taxonomy.py. |
| UI label | BatchInferenceResult.mapped_labels (or smoothed_labels if no taxonomy). |
Helper functions predict_proba() and predict_labels() cover only
stages 2–4 (encode, impute, predict). They do not calibrate, reject,
smooth, or map. They are used for evaluation and calibrator fitting,
not end-user inference.
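The smoothing stage these helpers skip can be sketched as a rolling majority vote. This is a hedged approximation; the real rolling_majority() in infer/smooth.py may differ in signature and tie-breaking:

```python
from collections import Counter, deque

# Hedged sketch of rolling-majority smoothing (stage 7) over per-bucket labels.
def rolling_majority(labels, window=3):
    buf = deque(maxlen=window)
    smoothed = []
    for label in labels:
        buf.append(label)
        # Counter preserves insertion order, so ties resolve to the label
        # seen earliest in the current window.
        smoothed.append(Counter(buf).most_common(1)[0][0])
    return smoothed

rolling_majority(["code", "code", "web", "code", "web", "web"], window=3)
```

A single stray "web" bucket in a run of "code" buckets is voted away, which is the point of the stage.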
2.2 Online¶
Entry point: OnlinePredictor.predict_bucket() in src/taskclf/infer/online.py.
| Stage | Implementation |
|---|---|
| Build features | run_online_loop() calls build_features_from_aw_events(). |
| Encode categoricals | _encode_value() per column (categorical → LabelEncoder, unknown → -1). |
| Impute missing | _encode_value() returns float("nan") for missing numerics. |
| Predict probabilities | model.predict(x) on the single-row array. |
| Calibrate | CalibratorStore.get_calibrator(row.user_id) or fallback Calibrator.calibrate(). |
| Reject | confidence < reject_threshold → label becomes Mixed/Unknown. |
| Smooth / aggregate | rolling_majority() over an internal deque buffer. |
| Taxonomy map | TaxonomyResolver.resolve() per bucket. |
| UI label | WindowPrediction.mapped_label_name (or smoothed_label if no taxonomy). |
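The online encode/impute behaviour (stages 2–3 in the table above) can be sketched per value. The function name and its mapping argument are assumptions for illustration, not the real _encode_value() signature:

```python
import math

# Hypothetical per-value encoder mirroring the online path: categoricals go
# through a fitted mapping (unknown -> -1); missing numerics stay NaN.
def encode_value(value, mapping=None):
    if mapping is not None:          # categorical column
        return mapping.get(value, -1)
    if value is None:                # numeric column with no observation
        return float("nan")          # LightGBM consumes NaN natively
    return float(value)

app_ids = {"firefox": 0, "code": 1}
encode_value("emacs", app_ids)  # → -1 (unseen category)
math.isnan(encode_value(None))  # → True (missing numeric stays NaN)
```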
2.3 Tray¶
Entry point: _LabelSuggester.suggest() in src/taskclf/ui/runtime.py.
Delegates entirely to the online path:
- Fetches AW events for the requested time window.
- Builds features via build_features_from_aw_events().
- Calls OnlinePredictor.predict_bucket() on the last row.
- Returns (prediction.core_label_name, prediction.confidence).
The tray path inherits all online-path stages. Its return value
currently exposes core_label_name, not mapped_label_name.
3. Known Deviations¶
These deviations from the canonical order are documented here and addressed in Phase 1.
3.1 Imputation mismatch (batch vs online)¶
- Batch: predict_proba() uses fillna(0), so missing numerics become zero.
- Online: _encode_value() returns float("nan"), so missing numerics stay NaN.
- Training: prepare_xy() in train/lgbm.py uses fillna(0).
Batch matches training; online does not. LightGBM handles NaN natively (missing values are routed to whichever side of each split was learned during training), so the model tolerates the mismatch, but the same input can yield different predictions depending on the path.
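The divergence is easy to reproduce with a stand-in feature row. This uses numpy only, with nan_to_num playing the role of fillna(0); it is not taskclf code:

```python
import numpy as np

# One encoded feature row with a missing numeric in position 0.
x = np.array([np.nan, 2.5])

batch_x = np.nan_to_num(x, nan=0.0)  # batch/training convention: fillna(0)
online_x = x                         # online convention: NaN preserved

batch_x[0]             # → 0.0
np.isnan(online_x[0])  # → True: same event, two different feature matrices
```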
3.2 predict_proba / predict_labels skip calibration¶
predict_proba() and predict_labels() execute only encode → impute →
predict. They do not calibrate. This is intentional for evaluation and
calibrator fitting, but callers must not treat their output as
production-grade calibrated predictions.
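To make the distinction concrete, here is a generic temperature-scaling sketch of what a calibrator does to raw probabilities. This is illustrative only; taskclf's Calibrator may use a different method entirely:

```python
import numpy as np

# Temperature scaling with T > 1 softens overconfident raw probabilities.
def temperature_scale(proba, T=2.0):
    logits = np.log(np.clip(proba, 1e-12, 1.0))
    z = logits / T
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

raw = np.array([[0.9, 0.1]])
calibrated = temperature_scale(raw)  # top confidence shrinks from 0.90 to 0.75
```

A caller that thresholds uncalibrated output of predict_proba() at, say, 0.8 would therefore make a different reject decision than the production pipeline.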
3.3 Taxonomy inputs are pre-smooth¶
In both batch and online, taxonomy mapping is sequenced after smoothing
in the code flow. However, TaxonomyResolver.resolve() /
resolve_batch() receive the per-bucket argmax index and calibrated
probabilities — not the smoothed label. Taxonomy therefore maps from
the raw (pre-smooth) prediction, not the smoothed one.
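A toy example of the consequence (class indices and taxonomy names are made up):

```python
# Taxonomy maps from the raw per-bucket argmax, so the mapped label can
# disagree with the smoothed label for the same bucket.
taxonomy = {0: "Deep Work", 1: "Communication"}

raw_argmax = 1       # this bucket's pre-smooth prediction
smoothed_class = 0   # rolling majority flipped the bucket to class 0

mapped_label = taxonomy[raw_argmax]  # "Communication", not taxonomy[smoothed_class]
```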
3.4 Tray suggest returns core_label_name¶
_LabelSuggester.suggest() returns core_label_name and confidence.
It does not return mapped_label_name, so taxonomy configuration is
invisible to the tray UI return value.