
train.calibrate

Per-user probability calibration: eligibility checks and calibrator fitting.

Overview

Training-side logic for the personalization pipeline. After a model is trained, this module fits probability calibrators on validation data so that predicted confidences better reflect true accuracy:

model + labeled_df → predict → fit global calibrator
                             → check each user's eligibility
                             → fit per-user calibrators (eligible users only)
                             → CalibratorStore + eligibility reports

The resulting CalibratorStore is used at inference time to adjust raw model probabilities before the reject decision.

Models

PersonalizationEligibility

Frozen Pydantic model reporting whether a user qualifies for per-user calibration.

Field            Type  Description
user_id          str   User identifier
labeled_windows  int   Number of labeled windows for this user
labeled_days     int   Number of distinct calendar days with labels
distinct_labels  int   Number of distinct core labels observed
is_eligible      bool  Whether all thresholds are met

Eligibility thresholds

A user must meet all three thresholds to receive a per-user calibrator. Defaults are defined in core.defaults:

Threshold    Default                            Description
min_windows  DEFAULT_MIN_LABELED_WINDOWS (200)  Minimum labeled window count
min_days     DEFAULT_MIN_LABELED_DAYS (3)       Minimum distinct calendar days
min_labels   DEFAULT_MIN_DISTINCT_LABELS (3)    Minimum distinct core labels

Ineligible users fall back to the global calibrator at inference time.

Functions

check_personalization_eligible

check_personalization_eligible(
    df: pd.DataFrame,
    user_id: str,
    *,
    min_windows: int = DEFAULT_MIN_LABELED_WINDOWS,
    min_days: int = DEFAULT_MIN_LABELED_DAYS,
    min_labels: int = DEFAULT_MIN_DISTINCT_LABELS,
) -> PersonalizationEligibility

Checks whether user_id has enough labeled data. Returns a PersonalizationEligibility report. If the user is not present in df, returns a report with all counts at 0 and is_eligible=False.
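The counting behind this check can be reproduced with plain pandas. A toy sketch (column names follow the docs above; thresholds are the documented defaults):

```python
import pandas as pd

# Toy labeled data for one user: 4 windows across 3 days, 3 distinct labels.
df = pd.DataFrame({
    "user_id": ["u1"] * 4,
    "bucket_start_ts": pd.to_datetime(
        ["2024-01-01 09:00", "2024-01-01 09:05",
         "2024-01-02 09:00", "2024-01-03 09:00"]
    ),
    "label": ["code", "code", "meet", "mail"],
})

user_df = df[df["user_id"] == "u1"]
n_windows = len(user_df)                               # 4
n_days = user_df["bucket_start_ts"].dt.date.nunique()  # 3
n_labels = user_df["label"].nunique()                  # 3

# All three thresholds must hold; here the window count (4 < 200) fails.
eligible = n_windows >= 200 and n_days >= 3 and n_labels >= 3
print(eligible)  # False
```

This user satisfies the day and label thresholds but falls back to the global calibrator because of the window count.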

fit_temperature_calibrator

fit_temperature_calibrator(
    y_true_indices: np.ndarray,
    y_proba: np.ndarray,
) -> TemperatureCalibrator

Finds the temperature scalar that minimizes negative log-likelihood on validation data. Uses a two-pass grid search:

  1. Coarse: 0.1 to 5.0, step 0.1
  2. Fine: best ± 0.1, step 0.01

Returns a TemperatureCalibrator with the optimal temperature.
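The transform that the grid search evaluates (and that a fitted calibrator applies) is a rescaling of the log-probabilities followed by a softmax. A minimal numpy sketch, with T > 1 softening the distribution:

```python
import numpy as np

eps = 1e-12
proba = np.array([[0.7, 0.2, 0.1]])  # raw model probabilities for one window
T = 2.0                              # a temperature > 1 reduces confidence

logits = np.log(np.clip(proba, eps, None))
scaled = logits / T
scaled -= scaled.max(axis=-1, keepdims=True)  # stabilize before exponentiating
calibrated = np.exp(scaled) / np.exp(scaled).sum(axis=-1, keepdims=True)
```

The result still sums to 1 and keeps the same top class; only the confidence spread changes, which is why a single scalar suffices.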

fit_isotonic_calibrator

fit_isotonic_calibrator(
    y_true_indices: np.ndarray,
    y_proba: np.ndarray,
    n_classes: int,
) -> IsotonicCalibrator

Fits per-class sklearn.isotonic.IsotonicRegression with y_min=0.0, y_max=1.0, out_of_bounds="clip". Returns an IsotonicCalibrator wrapping the fitted regressors.
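A standalone sketch of one per-class fit, using sklearn directly on synthetic scores (the wrapper class and real model outputs are omitted; the data here is simulated to be overconfident):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
raw = rng.uniform(size=500)  # raw predicted probabilities for one class
# Synthetic outcomes: the true hit rate is only ~80% of the raw score.
hits = (rng.uniform(size=500) < raw * 0.8).astype(np.float64)

# Same settings as the documented fit: outputs clipped to [0, 1].
reg = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
reg.fit(raw, hits)

calibrated = reg.predict([0.25, 0.5, 0.9])  # monotone, bounded in [0, 1]
```

Each class gets its own regressor, so one miscalibrated class can be corrected without disturbing the others.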

fit_calibrator_store

fit_calibrator_store(
    model: lgb.Booster,
    labeled_df: pd.DataFrame,
    *,
    cat_encoders: dict[str, LabelEncoder] | None = None,
    method: Literal["temperature", "isotonic"] = DEFAULT_CALIBRATION_METHOD,
    min_windows: int = DEFAULT_MIN_LABELED_WINDOWS,
    min_days: int = DEFAULT_MIN_LABELED_DAYS,
    min_labels: int = DEFAULT_MIN_DISTINCT_LABELS,
    model_bundle_id: str | None = None,
    model_schema_hash: str | None = None,
) -> tuple[CalibratorStore, list[PersonalizationEligibility]]

Orchestrates the full calibration flow:

  1. Predicts on labeled_df to get raw probabilities
  2. Fits a global calibrator on all validation data
  3. Checks each user's eligibility
  4. Fits per-user calibrators for qualifying users

Returns a (CalibratorStore, eligibility_reports) tuple. The default calibration method is "temperature" (from core.defaults).

Method comparison

Aspect       | Temperature                                      | Isotonic
Parameters   | Single scalar T                                  | Per-class non-parametric regression
Size         | Lightweight (one float)                          | Larger (one IsotonicRegression per class)
Flexibility  | Uniform scaling across all classes               | Independent adjustment per class
Best for     | Well-calibrated models needing minor adjustment  | Models with class-specific miscalibration
Risk         | Cannot fix per-class bias                        | Can overfit with small validation sets
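One practical consequence of this comparison: temperature scaling is a monotone transform of the logits, so it can never change which class is predicted for any window, while per-class isotonic regression carries no such guarantee. A quick numpy check of the temperature side:

```python
import numpy as np

proba = np.array([[0.5, 0.3, 0.2],
                  [0.1, 0.6, 0.3]])
logits = np.log(proba)

# Sharpening (T < 1) and softening (T > 1) both preserve the argmax.
for T in (0.5, 2.0):
    scaled = np.exp(logits / T)
    scaled /= scaled.sum(axis=-1, keepdims=True)
    assert (scaled.argmax(axis=-1) == proba.argmax(axis=-1)).all()
```

So temperature scaling only affects confidence (and thus the reject decision), never the predicted label itself.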

Usage

from pathlib import Path

from taskclf.train.calibrate import fit_calibrator_store
from taskclf.infer.calibration import save_calibrator_store

store, reports = fit_calibrator_store(
    model, val_df,
    cat_encoders=cat_encoders,
    method="temperature",
)

for r in reports:
    print(f"{r.user_id}: eligible={r.is_eligible}")

save_calibrator_store(store, Path("artifacts/calibrator_store"))

See the personalization guide for end-to-end setup and the infer.calibration page for runtime calibrator usage.

taskclf.train.calibrate

Per-user probability calibration: eligibility checks and calibrator fitting.

Provides the training-side logic for the personalization pipeline:

  • check_personalization_eligible — gate that ensures a user has enough labeled data before fitting a per-user calibrator.
  • fit_temperature_calibrator — optimizes a temperature scalar that minimizes NLL on held-out probabilities.
  • fit_isotonic_calibrator — fits per-class isotonic regression.
  • fit_calibrator_store — orchestrates the full flow: predict on validation data, fit a global calibrator, check each user's eligibility, and fit per-user calibrators for qualifying users.

PersonalizationEligibility

Bases: BaseModel

Result of checking whether a user qualifies for per-user calibration.

Source code in src/taskclf/train/calibrate.py
class PersonalizationEligibility(BaseModel, frozen=True):
    """Result of checking whether a user qualifies for per-user calibration."""

    user_id: str
    labeled_windows: int
    labeled_days: int
    distinct_labels: int
    is_eligible: bool

check_personalization_eligible(df, user_id, *, min_windows=DEFAULT_MIN_LABELED_WINDOWS, min_days=DEFAULT_MIN_LABELED_DAYS, min_labels=DEFAULT_MIN_DISTINCT_LABELS)

Check whether user_id has enough labeled data for per-user calibration.

The eligibility thresholds follow docs/guide/acceptance.md Section 8.

Parameters:

  df (DataFrame): Labeled DataFrame with user_id, bucket_start_ts, and label columns. Required.
  user_id (str): The user to check. Required.
  min_windows (int): Minimum labeled window count. Default: DEFAULT_MIN_LABELED_WINDOWS.
  min_days (int): Minimum number of distinct calendar days. Default: DEFAULT_MIN_LABELED_DAYS.
  min_labels (int): Minimum number of distinct core labels observed. Default: DEFAULT_MIN_DISTINCT_LABELS.

Returns:

  PersonalizationEligibility: A PersonalizationEligibility report.

Source code in src/taskclf/train/calibrate.py
def check_personalization_eligible(
    df: pd.DataFrame,
    user_id: str,
    *,
    min_windows: int = DEFAULT_MIN_LABELED_WINDOWS,
    min_days: int = DEFAULT_MIN_LABELED_DAYS,
    min_labels: int = DEFAULT_MIN_DISTINCT_LABELS,
) -> PersonalizationEligibility:
    """Check whether *user_id* has enough labeled data for per-user calibration.

    The eligibility thresholds follow ``docs/guide/acceptance.md`` Section 8.

    Args:
        df: Labeled DataFrame with ``user_id``, ``bucket_start_ts``, and
            ``label`` columns.
        user_id: The user to check.
        min_windows: Minimum labeled window count.
        min_days: Minimum number of distinct calendar days.
        min_labels: Minimum number of distinct core labels observed.

    Returns:
        A :class:`PersonalizationEligibility` report.
    """
    user_df = df[df["user_id"] == user_id]
    n_windows = len(user_df)

    if n_windows == 0:
        return PersonalizationEligibility(
            user_id=user_id,
            labeled_windows=0,
            labeled_days=0,
            distinct_labels=0,
            is_eligible=False,
        )

    n_days = user_df["bucket_start_ts"].dt.date.nunique()
    n_labels = user_df["label"].nunique()
    eligible = (
        n_windows >= min_windows and n_days >= min_days and n_labels >= min_labels
    )

    return PersonalizationEligibility(
        user_id=user_id,
        labeled_windows=n_windows,
        labeled_days=n_days,
        distinct_labels=n_labels,
        is_eligible=eligible,
    )

fit_temperature_calibrator(y_true_indices, y_proba)

Find the temperature that minimizes NLL on validation data.

Uses a two-pass grid search: coarse (0.1–5.0 step 0.1), then fine (±0.1 around best at step 0.01).

Parameters:

  y_true_indices (ndarray): Integer-encoded true labels, shape (n,). Required.
  y_proba (ndarray): Raw model probabilities, shape (n, n_classes). Required.

Returns:

  TemperatureCalibrator: A fitted TemperatureCalibrator.

Source code in src/taskclf/train/calibrate.py
def fit_temperature_calibrator(
    y_true_indices: np.ndarray,
    y_proba: np.ndarray,
) -> TemperatureCalibrator:
    """Find the temperature that minimizes NLL on validation data.

    Uses a two-pass grid search: coarse (0.1–5.0 step 0.1), then fine
    (±0.1 around best at step 0.01).

    Args:
        y_true_indices: Integer-encoded true labels, shape ``(n,)``.
        y_proba: Raw model probabilities, shape ``(n, n_classes)``.

    Returns:
        A fitted :class:`TemperatureCalibrator`.
    """
    eps = 1e-12
    logits = np.log(np.clip(y_proba, eps, None))

    def _apply_temp(t: float) -> float:
        scaled = logits / t
        shifted = scaled - scaled.max(axis=-1, keepdims=True)
        exp_vals = np.exp(shifted)
        probs = exp_vals / exp_vals.sum(axis=-1, keepdims=True)
        return _nll(probs, y_true_indices)

    # Coarse pass
    best_t = 1.0
    best_nll = _apply_temp(1.0)
    for t in np.arange(0.1, 5.05, 0.1):
        nll = _apply_temp(float(t))
        if nll < best_nll:
            best_nll = nll
            best_t = float(t)

    # Fine pass around best
    lo = max(0.01, best_t - 0.1)
    hi = best_t + 0.1
    for t in np.arange(lo, hi + 0.005, 0.01):
        nll = _apply_temp(float(t))
        if nll < best_nll:
            best_nll = nll
            best_t = float(t)

    return TemperatureCalibrator(temperature=round(best_t, 4))

fit_isotonic_calibrator(y_true_indices, y_proba, n_classes)

Fit per-class isotonic regression on validation data.

For each class c, fits IsotonicRegression on (y_proba[:, c], (y_true == c).astype(float)).

Parameters:

  y_true_indices (ndarray): Integer-encoded true labels, shape (n,). Required.
  y_proba (ndarray): Raw model probabilities, shape (n, n_classes). Required.
  n_classes (int): Number of classes. Required.

Returns:

  IsotonicCalibrator: A fitted IsotonicCalibrator.

Source code in src/taskclf/train/calibrate.py
def fit_isotonic_calibrator(
    y_true_indices: np.ndarray,
    y_proba: np.ndarray,
    n_classes: int,
) -> IsotonicCalibrator:
    """Fit per-class isotonic regression on validation data.

    For each class *c*, fits ``IsotonicRegression`` on
    ``(y_proba[:, c], (y_true == c).astype(float))``.

    Args:
        y_true_indices: Integer-encoded true labels, shape ``(n,)``.
        y_proba: Raw model probabilities, shape ``(n, n_classes)``.
        n_classes: Number of classes.

    Returns:
        A fitted :class:`IsotonicCalibrator`.
    """
    regressors: list[IsotonicRegression] = []
    for c in range(n_classes):
        binary_target = (y_true_indices == c).astype(np.float64)
        reg = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        reg.fit(y_proba[:, c], binary_target)
        regressors.append(reg)
    return IsotonicCalibrator(regressors)

fit_calibrator_store(model, labeled_df, *, cat_encoders=None, method=DEFAULT_CALIBRATION_METHOD, min_windows=DEFAULT_MIN_LABELED_WINDOWS, min_days=DEFAULT_MIN_LABELED_DAYS, min_labels=DEFAULT_MIN_DISTINCT_LABELS, model_bundle_id=None, model_schema_hash=None)

Fit a global calibrator and per-user calibrators for eligible users.

  1. Predicts on the full labeled_df to get raw probabilities.
  2. Fits a global calibrator on all validation data.
  3. Checks each user's eligibility; for eligible users fits a per-user calibrator.

Parameters:

  model (Booster): Trained LightGBM booster (frozen — not retrained). Required.
  labeled_df (DataFrame): Labeled validation DataFrame with user_id, bucket_start_ts, label, and FEATURE_COLUMNS. Required.
  cat_encoders (dict[str, LabelEncoder] | None): Pre-fitted categorical encoders from training. Default: None.
  method (Literal["temperature", "isotonic"]): Calibration method — "temperature" or "isotonic". Default: DEFAULT_CALIBRATION_METHOD.
  min_windows (int): Minimum labeled windows for per-user eligibility. Default: DEFAULT_MIN_LABELED_WINDOWS.
  min_days (int): Minimum distinct days for per-user eligibility. Default: DEFAULT_MIN_LABELED_DAYS.
  min_labels (int): Minimum distinct labels for per-user eligibility. Default: DEFAULT_MIN_DISTINCT_LABELS.
  model_bundle_id (str | None): Run directory name of the model bundle. Recorded in the store for traceability. Default: None.
  model_schema_hash (str | None): Schema hash of the model bundle. Recorded in the store so the inference policy can validate compatibility. Default: None.

Returns:

  tuple[CalibratorStore, list[PersonalizationEligibility]]: (store, eligibility_reports) — a CalibratorStore and a list of PersonalizationEligibility for every unique user in labeled_df.

Source code in src/taskclf/train/calibrate.py
def fit_calibrator_store(
    model: lgb.Booster,
    labeled_df: pd.DataFrame,
    *,
    cat_encoders: dict[str, LabelEncoder] | None = None,
    method: Literal["temperature", "isotonic"] = DEFAULT_CALIBRATION_METHOD,  # type: ignore[assignment]
    min_windows: int = DEFAULT_MIN_LABELED_WINDOWS,
    min_days: int = DEFAULT_MIN_LABELED_DAYS,
    min_labels: int = DEFAULT_MIN_DISTINCT_LABELS,
    model_bundle_id: str | None = None,
    model_schema_hash: str | None = None,
) -> tuple[CalibratorStore, list[PersonalizationEligibility]]:
    """Fit a global calibrator and per-user calibrators for eligible users.

    1. Predicts on the full *labeled_df* to get raw probabilities.
    2. Fits a global calibrator on all validation data.
    3. Checks each user's eligibility; for eligible users fits a
       per-user calibrator.

    Args:
        model: Trained LightGBM booster (frozen — not retrained).
        labeled_df: Labeled validation DataFrame with ``user_id``,
            ``bucket_start_ts``, ``label``, and ``FEATURE_COLUMNS``.
        cat_encoders: Pre-fitted categorical encoders from training.
        method: Calibration method — ``"temperature"`` or ``"isotonic"``.
        min_windows: Minimum labeled windows for per-user eligibility.
        min_days: Minimum distinct days for per-user eligibility.
        min_labels: Minimum distinct labels for per-user eligibility.
        model_bundle_id: Run directory name of the model bundle.
            Recorded in the store for traceability.
        model_schema_hash: Schema hash of the model bundle.  Recorded
            in the store so the inference policy can validate
            compatibility.

    Returns:
        ``(store, eligibility_reports)`` — a :class:`CalibratorStore`
        and a list of :class:`PersonalizationEligibility` for every
        unique user in *labeled_df*.
    """
    le = LabelEncoder()
    le.fit(sorted(LABEL_SET_V1))
    n_classes = len(le.classes_)

    y_proba = predict_proba(model, labeled_df, cat_encoders)
    y_true = le.transform(labeled_df["label"].values)

    global_cal: TemperatureCalibrator | IsotonicCalibrator
    if method == "isotonic":
        global_cal = fit_isotonic_calibrator(y_true, y_proba, n_classes)
    else:
        global_cal = fit_temperature_calibrator(y_true, y_proba)

    # Per-user eligibility and calibration
    user_ids = sorted(labeled_df["user_id"].unique())
    eligibility_reports: list[PersonalizationEligibility] = []
    user_calibrators: dict[str, Calibrator] = {}

    for uid in user_ids:
        elig = check_personalization_eligible(
            labeled_df,
            uid,
            min_windows=min_windows,
            min_days=min_days,
            min_labels=min_labels,
        )
        eligibility_reports.append(elig)

        if not elig.is_eligible:
            logger.info(
                "User %s ineligible (windows=%d, days=%d, labels=%d)",
                uid,
                elig.labeled_windows,
                elig.labeled_days,
                elig.distinct_labels,
            )
            continue

        mask = labeled_df["user_id"].values == uid
        user_proba = y_proba[mask]
        user_true = y_true[mask]

        user_cal: TemperatureCalibrator | IsotonicCalibrator
        if method == "isotonic":
            user_cal = fit_isotonic_calibrator(user_true, user_proba, n_classes)
        else:
            user_cal = fit_temperature_calibrator(user_true, user_proba)

        user_calibrators[uid] = user_cal
        logger.info("Fitted %s calibrator for user %s", method, uid)

    from datetime import UTC, datetime

    store = CalibratorStore(
        global_calibrator=global_cal,
        user_calibrators=user_calibrators,
        method=method,
        model_bundle_id=model_bundle_id,
        model_schema_hash=model_schema_hash,
        created_at=datetime.now(UTC).isoformat(),
    )
    return store, eligibility_reports