train.calibrate¶
Per-user probability calibration: eligibility checks and calibrator fitting.
Overview¶
Training-side logic for the personalization pipeline. After a model is trained, this module fits probability calibrators on validation data so that predicted confidences better reflect true accuracy:
model + labeled_df → predict → fit global calibrator
→ check each user's eligibility
→ fit per-user calibrators (eligible users only)
→ CalibratorStore + eligibility reports
The resulting CalibratorStore is used at
inference time to adjust raw model probabilities before the reject
decision.
Models¶
PersonalizationEligibility¶
Frozen Pydantic model reporting whether a user qualifies for per-user calibration.
| Field | Type | Description |
|---|---|---|
| user_id | str | User identifier |
| labeled_windows | int | Number of labeled windows for this user |
| labeled_days | int | Number of distinct calendar days with labels |
| distinct_labels | int | Number of distinct core labels observed |
| is_eligible | bool | Whether all thresholds are met |
Eligibility thresholds¶
A user must meet all three thresholds to receive a per-user calibrator.
Defaults are defined in core.defaults:
| Threshold | Default | Description |
|---|---|---|
| min_windows | DEFAULT_MIN_LABELED_WINDOWS (200) | Minimum labeled window count |
| min_days | DEFAULT_MIN_LABELED_DAYS (3) | Minimum distinct calendar days |
| min_labels | DEFAULT_MIN_DISTINCT_LABELS (3) | Minimum distinct core labels |
Ineligible users fall back to the global calibrator at inference time.
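The three-threshold gate can be illustrated with a minimal pandas sketch. This is not the module's implementation: the column names `user_id`, `bucket_start_ts`, and `label`, the helper `eligibility_counts`, and the toy label values are all assumptions made for the example. Only the default thresholds (200, 3, 3) come from the table above.

```python
import pandas as pd

# Threshold defaults quoted from the table above.
MIN_WINDOWS, MIN_DAYS, MIN_LABELS = 200, 3, 3


def eligibility_counts(df: pd.DataFrame, user_id: str) -> dict:
    """Count windows, days, and labels for one user (column names are assumed)."""
    rows = df[df["user_id"] == user_id]
    windows = len(rows)
    # Distinct calendar days: drop the time-of-day part, then count unique dates.
    days = rows["bucket_start_ts"].dt.normalize().nunique()
    labels = rows["label"].nunique()
    return {
        "user_id": user_id,
        "labeled_windows": int(windows),
        "labeled_days": int(days),
        "distinct_labels": int(labels),
        # All three thresholds must hold at once.
        "is_eligible": bool(
            windows >= MIN_WINDOWS and days >= MIN_DAYS and labels >= MIN_LABELS
        ),
    }


# Toy data: one user with 240 hourly windows over 10 days and 3 hypothetical labels.
ts = pd.date_range("2024-01-01", periods=240, freq=pd.Timedelta(hours=1))
toy = pd.DataFrame(
    {
        "user_id": "u1",
        "bucket_start_ts": ts,
        "label": [["Build", "Write", "Meet"][i % 3] for i in range(240)],
    }
)
report = eligibility_counts(toy, "u1")
missing = eligibility_counts(toy, "nobody")  # absent user: all counts are 0
```

In this toy run `u1` clears every threshold, while the absent user gets zero counts and is not eligible, which matches the fallback behavior described above.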
Functions¶
check_personalization_eligible¶
check_personalization_eligible(
    df: pd.DataFrame,
    user_id: str,
    *,
    min_windows: int = DEFAULT_MIN_LABELED_WINDOWS,
    min_days: int = DEFAULT_MIN_LABELED_DAYS,
    min_labels: int = DEFAULT_MIN_DISTINCT_LABELS,
) -> PersonalizationEligibility
Checks whether user_id has enough labeled data. Returns a
PersonalizationEligibility report. If the user is not present in
df, returns a report with all counts at 0 and is_eligible=False.
fit_temperature_calibrator¶
fit_temperature_calibrator(
    y_true_indices: np.ndarray,
    y_proba: np.ndarray,
) -> TemperatureCalibrator
Finds the temperature scalar that minimizes negative log-likelihood on validation data. Uses a two-pass grid search:
- Coarse: 0.1 to 5.0, step 0.1
- Fine: best ± 0.1, step 0.01
Returns a TemperatureCalibrator with the optimal temperature.
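The two-pass search can be sketched in NumPy as follows. This page does not say how a temperature is applied to probabilities, so the sketch assumes the common formulation softmax(log(p) / T); the helper names `apply_temperature`, `nll`, and `grid_search_temperature` are hypothetical and are not the module's API.

```python
import numpy as np


def apply_temperature(y_proba: np.ndarray, temperature: float) -> np.ndarray:
    """Rescale probabilities as softmax(log(p) / T); an assumed formulation."""
    logits = np.log(np.clip(y_proba, 1e-12, 1.0)) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def nll(y_true_indices: np.ndarray, y_proba: np.ndarray) -> float:
    """Mean negative log-likelihood of the true class."""
    picked = y_proba[np.arange(len(y_true_indices)), y_true_indices]
    return float(-np.mean(np.log(np.clip(picked, 1e-12, 1.0))))


def grid_search_temperature(y_true_indices: np.ndarray, y_proba: np.ndarray) -> float:
    """Two-pass grid search: coarse 0.1..5.0 step 0.1, then best +/- 0.1 step 0.01."""

    def best_of(grid: np.ndarray) -> float:
        losses = [nll(y_true_indices, apply_temperature(y_proba, t)) for t in grid]
        return float(grid[int(np.argmin(losses))])

    coarse = np.round(np.arange(1, 51) * 0.1, 2)
    t_coarse = best_of(coarse)
    fine = np.round(t_coarse + np.arange(-10, 11) * 0.01, 2)
    fine = fine[fine > 0]  # temperature must stay positive
    return best_of(fine)


# Toy check: the model says 0.9 for class 0 on every row but is right only 70% of the time.
y_true = np.array([0] * 70 + [1] * 30)
raw = np.tile([0.9, 0.1], (100, 1))
best_t = grid_search_temperature(y_true, raw)
calibrated = apply_temperature(raw, best_t)
```

On this overconfident toy input the search settles on a temperature above 1, which pulls the 0.9 confidence down toward the observed 70% accuracy.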
fit_isotonic_calibrator¶
fit_isotonic_calibrator(
    y_true_indices: np.ndarray,
    y_proba: np.ndarray,
    n_classes: int,
) -> IsotonicCalibrator
Fits per-class sklearn.isotonic.IsotonicRegression with
y_min=0.0, y_max=1.0, out_of_bounds="clip". Returns an
IsotonicCalibrator wrapping the fitted regressors.
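A minimal sketch of the per-class fit, using the same `IsotonicRegression` settings listed above. The helper names are hypothetical, and the apply step (mapping each column through its regressor and renormalizing rows) is an assumption about how such regressors might be used, not a description of `IsotonicCalibrator`.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression


def fit_per_class_isotonic(y_true_indices, y_proba, n_classes):
    """Fit one one-vs-rest isotonic regressor per class column."""
    regressors = []
    for c in range(n_classes):
        reg = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        # Inputs: raw probability of class c; targets: 1.0 where the true class is c.
        reg.fit(y_proba[:, c], (y_true_indices == c).astype(float))
        regressors.append(reg)
    return regressors


def apply_per_class_isotonic(regressors, y_proba):
    """Map each column through its regressor, then renormalize rows (assumed step)."""
    cols = [reg.predict(y_proba[:, c]) for c, reg in enumerate(regressors)]
    out = np.column_stack(cols)
    row_sums = out.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    # Keep the raw probabilities for any row whose calibrated columns are all zero.
    return np.where(row_sums > 0, out / safe_sums, y_proba)


# Toy data: 300 rows, 3 classes, with probabilities nudged toward the true class.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 3, size=300)
raw = rng.dirichlet(np.ones(3), size=300)
raw[np.arange(300), y_true] += 1.0
raw /= raw.sum(axis=1, keepdims=True)

regs = fit_per_class_isotonic(y_true, raw, n_classes=3)
calibrated = apply_per_class_isotonic(regs, raw)
```

Because each class gets its own monotone mapping, the fit can correct class-specific bias, at the cost of needing enough validation rows per class.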
fit_calibrator_store¶
fit_calibrator_store(
    model: lgb.Booster,
    labeled_df: pd.DataFrame,
    *,
    cat_encoders: dict[str, LabelEncoder] | None = None,
    method: Literal["temperature", "isotonic"] = DEFAULT_CALIBRATION_METHOD,
    min_windows: int = DEFAULT_MIN_LABELED_WINDOWS,
    min_days: int = DEFAULT_MIN_LABELED_DAYS,
    min_labels: int = DEFAULT_MIN_DISTINCT_LABELS,
    model_bundle_id: str | None = None,
    model_schema_hash: str | None = None,
) -> tuple[CalibratorStore, list[PersonalizationEligibility]]
Orchestrates the full calibration flow:
- Predicts on labeled_df to get raw probabilities
- Fits a global calibrator on all validation data
- Checks each user's eligibility
- Fits per-user calibrators for qualifying users
Returns a (CalibratorStore, eligibility_reports) tuple. The
default calibration method is "temperature"
(from core.defaults).
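The control flow of the orchestration, and the global fallback it enables, can be sketched with a toy stand-in. `ToyStore` and `fit_store` are hypothetical names and do not reflect the real `CalibratorStore` interface; "fitting" here is just averaging so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyStore:
    """Hypothetical stand-in for CalibratorStore: one global entry plus per-user entries."""

    global_calibrator: object
    user_calibrators: dict = field(default_factory=dict)

    def get(self, user_id: str):
        # Ineligible or unseen users fall back to the global calibrator.
        return self.user_calibrators.get(user_id, self.global_calibrator)


def fit_store(rows_by_user: dict, fit: Callable, is_eligible: Callable):
    """Fit globally on all rows, then per user where the eligibility gate passes."""
    all_rows = [row for rows in rows_by_user.values() for row in rows]
    store = ToyStore(global_calibrator=fit(all_rows))
    reports = []
    for user_id, rows in rows_by_user.items():
        eligible = is_eligible(rows)
        reports.append((user_id, eligible))  # one report per user, eligible or not
        if eligible:
            store.user_calibrators[user_id] = fit(rows)
    return store, reports


# Toy run: the "calibrator" is a mean, and the gate is a simple row-count check.
data = {"alice": [0.9, 0.8, 0.7, 0.6], "bob": [0.5]}
store, reports = fit_store(
    data,
    fit=lambda rows: sum(rows) / len(rows),
    is_eligible=lambda rows: len(rows) >= 3,
)
```

Here `alice` gets her own entry, while `bob` and any unseen user resolve to the global one. Every user still appears in the reports list, so the caller can see why a user was skipped.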
Method comparison¶
| | Temperature | Isotonic |
|---|---|---|
| Parameters | Single scalar T | Per-class non-parametric regression |
| Size | Lightweight (one float) | Larger (one IsotonicRegression per class) |
| Flexibility | Uniform scaling across all classes | Independent adjustment per class |
| Best for | Well-calibrated models needing minor adjustment | Models with class-specific miscalibration |
| Risk | Cannot fix per-class bias | Can overfit with small validation sets |
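The "uniform scaling" row can be made concrete with a small NumPy example. It assumes temperature is applied as softmax(log(p) / T), which this page does not confirm; `scale` is a hypothetical helper.

```python
import numpy as np


def scale(p: np.ndarray, t: float) -> np.ndarray:
    """Apply an assumed temperature transform, softmax(log(p) / t), to one row."""
    z = np.log(p) / t
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


p = np.array([0.6, 0.3, 0.1])
soft = scale(p, 2.0)   # T > 1 flattens the distribution
sharp = scale(p, 0.5)  # T < 1 sharpens it
```

Under this formulation the class ranking never changes for any positive T; only the confidence moves. That is why a single temperature cannot repair a bias that affects one class differently from the others.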
Usage¶
from pathlib import Path

from taskclf.train.calibrate import fit_calibrator_store
from taskclf.infer.calibration import save_calibrator_store

# model, val_df, and cat_encoders come from the training step
store, reports = fit_calibrator_store(
    model, val_df,
    cat_encoders=cat_encoders,
    method="temperature",
)
for r in reports:
    print(f"{r.user_id}: eligible={r.is_eligible}")
save_calibrator_store(store, Path("artifacts/calibrator_store"))
See the personalization guide for
end-to-end setup and the
infer.calibration page for runtime
calibrator usage.
taskclf.train.calibrate¶
Per-user probability calibration: eligibility checks and calibrator fitting.
Provides the training-side logic for the personalization pipeline:
- check_personalization_eligible: gate that ensures a user has enough labeled data before fitting a per-user calibrator.
- fit_temperature_calibrator: optimizes a temperature scalar that minimizes NLL on held-out probabilities.
- fit_isotonic_calibrator: fits per-class isotonic regression.
- fit_calibrator_store: orchestrates the full flow (predict on validation data, fit a global calibrator, check each user's eligibility, and fit per-user calibrators for qualifying users).
PersonalizationEligibility¶
Bases: BaseModel
Result of checking whether a user qualifies for per-user calibration.
Source code in src/taskclf/train/calibrate.py
check_personalization_eligible(df, user_id, *, min_windows=DEFAULT_MIN_LABELED_WINDOWS, min_days=DEFAULT_MIN_LABELED_DAYS, min_labels=DEFAULT_MIN_DISTINCT_LABELS)¶
Check whether user_id has enough labeled data for per-user calibration.
The eligibility thresholds follow docs/guide/acceptance.md Section 8.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Labeled DataFrame with | required |
| user_id | str | The user to check. | required |
| min_windows | int | Minimum labeled window count. | DEFAULT_MIN_LABELED_WINDOWS |
| min_days | int | Minimum number of distinct calendar days. | DEFAULT_MIN_LABELED_DAYS |
| min_labels | int | Minimum number of distinct core labels observed. | DEFAULT_MIN_DISTINCT_LABELS |
Returns:
| Type | Description |
|---|---|
| PersonalizationEligibility | A PersonalizationEligibility report. |
fit_temperature_calibrator(y_true_indices, y_proba)¶
Find the temperature that minimizes NLL on validation data.
Uses a two-pass grid search: coarse (0.1–5.0 step 0.1), then fine (±0.1 around best at step 0.01).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| y_true_indices | ndarray | Integer-encoded true labels, shape | required |
| y_proba | ndarray | Raw model probabilities, shape | required |
Returns:
| Type | Description |
|---|---|
| TemperatureCalibrator | A fitted TemperatureCalibrator. |
fit_isotonic_calibrator(y_true_indices, y_proba, n_classes)¶
Fit per-class isotonic regression on validation data.
For each class c, fits IsotonicRegression on
(y_proba[:, c], (y_true == c).astype(float)).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| y_true_indices | ndarray | Integer-encoded true labels, shape | required |
| y_proba | ndarray | Raw model probabilities, shape | required |
| n_classes | int | Number of classes. | required |
Returns:
| Type | Description |
|---|---|
| IsotonicCalibrator | A fitted IsotonicCalibrator. |
fit_calibrator_store(model, labeled_df, *, cat_encoders=None, method=DEFAULT_CALIBRATION_METHOD, min_windows=DEFAULT_MIN_LABELED_WINDOWS, min_days=DEFAULT_MIN_LABELED_DAYS, min_labels=DEFAULT_MIN_DISTINCT_LABELS, model_bundle_id=None, model_schema_hash=None)¶
Fit a global calibrator and per-user calibrators for eligible users.
- Predicts on the full labeled_df to get raw probabilities.
- Fits a global calibrator on all validation data.
- Checks each user's eligibility; for eligible users fits a per-user calibrator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Booster | Trained LightGBM booster (frozen, not retrained). | required |
| labeled_df | DataFrame | Labeled validation DataFrame with | required |
| cat_encoders | dict[str, LabelEncoder] \| None | Pre-fitted categorical encoders from training. | None |
| method | Literal['temperature', 'isotonic'] | Calibration method: "temperature" or "isotonic". | DEFAULT_CALIBRATION_METHOD |
| min_windows | int | Minimum labeled windows for per-user eligibility. | DEFAULT_MIN_LABELED_WINDOWS |
| min_days | int | Minimum distinct days for per-user eligibility. | DEFAULT_MIN_LABELED_DAYS |
| min_labels | int | Minimum distinct labels for per-user eligibility. | DEFAULT_MIN_DISTINCT_LABELS |
| model_bundle_id | str \| None | Run directory name of the model bundle. Recorded in the store for traceability. | None |
| model_schema_hash | str \| None | Schema hash of the model bundle. Recorded in the store so the inference policy can validate compatibility. | None |
Returns:
| Type | Description |
|---|---|
| tuple[CalibratorStore, list[PersonalizationEligibility]] | A CalibratorStore and a list of PersonalizationEligibility reports, one for each unique user in labeled_df. |