
infer.resolve

Model resolution for inference: map CLI arguments to a concrete model bundle directory, with automatic fallback and hot-reload support.

Overview

The resolve module bridges the gap between the CLI --model-dir option and the model registry. At inference startup the caller may or may not specify an explicit model directory. resolve_model_dir applies a deterministic precedence chain to guarantee either a valid bundle path or a descriptive error:

explicit --model-dir ──► validate path exists ──► return
        │ (None)
  active.json exists? ──► yes ──► use active pointer
        │ (no / stale)
  find_best_model() ──► best eligible bundle ──► self-heal active.json ──► return
        │ (none eligible)
  raise ModelResolutionError (with exclusion reasons)

For long-running online inference loops, ActiveModelReloader watches active.json for mtime changes and transparently reloads the model bundle without restarting the process. ActiveModelReloader is a dataclass; its public constructor parameters are models_dir and check_interval_s. Note that both resolve_model_dir and ActiveModelReloader are deprecated in favour of resolve_inference_config and InferencePolicyReloader, which also reload the calibrator store and reject threshold when the policy changes.

Resolution precedence

| Priority | Condition | Behaviour |
| --- | --- | --- |
| 1 | model_dir argument provided | Validate the path exists; return it directly |
| 2 | models/active.json present and valid | Use the active pointer from the registry |
| 3 | No active pointer but eligible bundles exist | Select best by macro_f1; self-heal active.json |
| 4 | No eligible bundles | Raise ModelResolutionError with per-bundle exclusion reasons |

ModelResolutionError

Custom exception raised when no model can be resolved.

| Attribute | Type | Description |
| --- | --- | --- |
| message | str | Human-readable error with actionable guidance |
| report | SelectionReport or None | Full selection report including excluded records with per-bundle reasons (attached when resolution went through the registry) |

resolve_model_dir

def resolve_model_dir(
    model_dir: str | None,
    models_dir: Path,
    policy: SelectionPolicy | None = None,
) -> Path
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model_dir | str or None | required | Explicit --model-dir value from CLI; None triggers automatic resolution |
| models_dir | Path | required | Base directory containing promoted model bundles |
| policy | SelectionPolicy or None | None | Selection policy override; when None, the registry uses policy v1 |

Returns: Path to the resolved model bundle directory.

Raises: ModelResolutionError when no model can be resolved. The error message includes the list of excluded bundles and their exclusion reasons when available.

ActiveModelReloader

Lightweight mtime-based watcher designed for the online inference loop. Polls active.json at a configurable interval and, when a change is detected, loads the new model bundle via load_model_bundle. On failure the current model is kept and a warning is logged.

Constructor

ActiveModelReloader(
    models_dir: Path,
    check_interval_s: float = 60.0,
)
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| models_dir | Path | required | Directory containing active.json |
| check_interval_s | float | 60.0 | Minimum seconds between mtime checks; prevents excessive stat calls |

check_reload

def check_reload(self) -> tuple[Booster, ModelMetadata, dict[str, LabelEncoder] | None] | None

Returns (model, metadata, cat_encoders) when a reload succeeds. Returns None in three cases:

  1. The check interval has not elapsed since the last check.
  2. The active.json mtime is unchanged.
  3. The mtime changed but the reload failed (a warning is logged and the caller should keep the current model).

After a successful reload the internal mtime is updated, so a second immediate call returns None.

Usage

Resolve at startup

from pathlib import Path
from taskclf.infer.resolve import resolve_model_dir, ModelResolutionError

try:
    bundle_path = resolve_model_dir(
        model_dir=None,           # let the registry decide
        models_dir=Path("models/"),
    )
except ModelResolutionError as exc:
    print(exc)
    if exc.report and exc.report.excluded:
        for rec in exc.report.excluded:
            print(f"  {rec.model_id}: {rec.reason}")
    raise SystemExit(1)

Hot-reload in an online loop

from pathlib import Path
from taskclf.infer.resolve import ActiveModelReloader

reloader = ActiveModelReloader(Path("models/"), check_interval_s=30.0)

# inside the polling loop
result = reloader.check_reload()
if result is not None:
    model, metadata, cat_encoders = result
    # swap to the new model for subsequent predictions
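Wrapped in a full loop, the pattern above amounts to swapping all three artifacts only after a successful reload. A self-contained sketch with a stubbed reloader (serve and the stub are illustrative, not part of taskclf):

```python
def serve(reloader, model, metadata, cat_encoders, batches):
    """Illustrative serving loop: swap artifacts only when a reload succeeds."""
    outputs = []
    for batch in batches:
        result = reloader.check_reload()
        if result is not None:
            model, metadata, cat_encoders = result  # swap all three together
        outputs.append(model(batch))                # predict with current model
    return outputs
```

Because check_reload returns None on failure, the loop keeps serving with the previous model and never swaps to a half-loaded bundle.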

See also

  • model_registry — bundle scanning, ranking, and active pointer management
  • core.model_io — load_model_bundle and ModelMetadata
  • infer.online — online inference loop that uses resolve_model_dir and ActiveModelReloader

taskclf.infer.resolve

Model resolution for inference: resolve --model-dir and hot-reload.

Bridges CLI arguments to the model registry, providing:

  • resolve_model_dir — resolve an optional --model-dir to a concrete pathlib.Path using the active pointer / best-model selection fallback. Deprecated — prefer resolve_inference_config.
  • resolve_inference_config — resolve the full inference configuration (model + calibrator + threshold) from a taskclf.core.inference_policy.InferencePolicy.
  • ActiveModelReloader — lightweight mtime-based watcher that detects active.json changes and reloads the model bundle for long-running online inference loops. Deprecated — prefer InferencePolicyReloader.
  • InferencePolicyReloader — watches inference_policy.json (falling back to active.json) and reloads the full inference config on change.

ModelResolutionError dataclass

Bases: Exception

Raised when no model can be resolved for inference.

Source code in src/taskclf/infer/resolve.py
@dataclass(eq=False)
class ModelResolutionError(Exception):
    """Raised when no model can be resolved for inference."""

    message: str
    report: SelectionReport | None = None

    def __post_init__(self) -> None:
        super().__init__(self.message)

ActiveModelReloader dataclass

Watch active.json and reload the model bundle on change.

Deprecated: use InferencePolicyReloader instead. This class only reloads the model bundle; it does not update the calibrator store or reject threshold when the policy changes.

Designed for the online inference loop: polls the file's mtime at a configurable interval and, when a change is detected, loads the new bundle. The caller only swaps to the new model after a successful load — on failure the current model is kept.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| models_dir | Path | Directory containing active.json. | required |
| check_interval_s | float | Minimum seconds between mtime checks. | 60.0 |
Source code in src/taskclf/infer/resolve.py
@dataclass(eq=False)
class ActiveModelReloader:
    """Watch ``active.json`` and reload the model bundle on change.

    .. deprecated::
        Use :class:`InferencePolicyReloader` instead.  This class only
        reloads the model bundle; it does not update the calibrator
        store or reject threshold when the policy changes.

    Designed for the online inference loop: polls the file's mtime at a
    configurable interval and, when a change is detected, loads the new
    bundle.  The caller only swaps to the new model after a successful
    load — on failure the current model is kept.

    Args:
        models_dir: Directory containing ``active.json``.
        check_interval_s: Minimum seconds between mtime checks.
    """

    models_dir: Path
    check_interval_s: float = 60.0
    _active_path: Path = field(init=False)
    _last_mtime: float | None = field(init=False)
    _last_check: float = field(init=False)

    def __post_init__(self) -> None:
        self._active_path = self.models_dir / _ACTIVE_FILE
        self._last_mtime = self._current_mtime()
        self._last_check = time.monotonic()

    def _current_mtime(self) -> float | None:
        try:
            return self._active_path.stat().st_mtime
        except OSError:
            logger.debug("Could not stat %s", self._active_path, exc_info=True)
            return None

    def check_reload(
        self,
    ) -> tuple[lgb.Booster, ModelMetadata, dict[str, Any]] | None:
        """Check whether ``active.json`` changed and reload if so.

        Returns the new ``(model, metadata, cat_encoders)`` tuple when a
        reload succeeds, or ``None`` when no reload is needed or the
        reload fails (a warning is logged on failure).
        """
        now = time.monotonic()
        if now - self._last_check < self.check_interval_s:
            return None
        self._last_check = now

        mtime = self._current_mtime()
        if mtime == self._last_mtime:
            return None

        logger.info(
            "active.json changed (mtime %s -> %s), reloading", self._last_mtime, mtime
        )

        try:
            resolved = resolve_model_dir(None, self.models_dir)
            model, metadata, cat_encoders = load_model_bundle(resolved)
        except Exception:
            logger.warning(
                "Failed to reload model after active.json change; keeping current model",
                exc_info=True,
            )
            return None

        self._last_mtime = mtime
        logger.info(
            "Reloaded model from %s (schema=%s)", resolved, metadata.schema_hash
        )
        return model, metadata, cat_encoders

check_reload()

Check whether active.json changed and reload if so.

Returns the new (model, metadata, cat_encoders) tuple when a reload succeeds, or None when no reload is needed or the reload fails (a warning is logged on failure).

Source code in src/taskclf/infer/resolve.py
def check_reload(
    self,
) -> tuple[lgb.Booster, ModelMetadata, dict[str, Any]] | None:
    """Check whether ``active.json`` changed and reload if so.

    Returns the new ``(model, metadata, cat_encoders)`` tuple when a
    reload succeeds, or ``None`` when no reload is needed or the
    reload fails (a warning is logged on failure).
    """
    now = time.monotonic()
    if now - self._last_check < self.check_interval_s:
        return None
    self._last_check = now

    mtime = self._current_mtime()
    if mtime == self._last_mtime:
        return None

    logger.info(
        "active.json changed (mtime %s -> %s), reloading", self._last_mtime, mtime
    )

    try:
        resolved = resolve_model_dir(None, self.models_dir)
        model, metadata, cat_encoders = load_model_bundle(resolved)
    except Exception:
        logger.warning(
            "Failed to reload model after active.json change; keeping current model",
            exc_info=True,
        )
        return None

    self._last_mtime = mtime
    logger.info(
        "Reloaded model from %s (schema=%s)", resolved, metadata.schema_hash
    )
    return model, metadata, cat_encoders

ResolvedInferenceConfig dataclass

Fully resolved inference configuration ready for use.

Produced by resolve_inference_config. Contains all loaded artifacts so callers do not need to perform additional I/O.

Source code in src/taskclf/infer/resolve.py
@dataclass(frozen=True)
class ResolvedInferenceConfig:
    """Fully resolved inference configuration ready for use.

    Produced by :func:`resolve_inference_config`.  Contains all
    loaded artifacts so callers do not need to perform additional I/O.
    """

    model: lgb.Booster
    metadata: ModelMetadata
    cat_encoders: dict[str, LabelEncoder]
    reject_threshold: float
    calibrator: Calibrator
    calibrator_store: CalibratorStore | None
    policy: InferencePolicy | None
    per_user_reject_thresholds: dict[str, float] | None = None

InferencePolicyReloader dataclass

Watch inference_policy.json and reload the full config on change.

Falls back to watching active.json when no policy file exists. Designed for the online inference loop: polls file mtimes at a configurable interval and returns a ResolvedInferenceConfig when a reload is needed.
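The fallback choice of watched file can be sketched as a small helper (a standalone sketch, not the class's actual method):

```python
from pathlib import Path

def watched_path(models_dir: Path) -> Path:
    """Prefer inference_policy.json; fall back to active.json."""
    policy = models_dir / "inference_policy.json"
    active = models_dir / "active.json"
    return policy if policy.is_file() else active
```

Creating a policy file therefore switches what the reloader watches on the next check, with no restart needed.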

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| models_dir | Path | Directory containing policy / active pointer files. | required |
| check_interval_s | float | Minimum seconds between mtime checks. | 60.0 |
Source code in src/taskclf/infer/resolve.py
@dataclass(eq=False)
class InferencePolicyReloader:
    """Watch ``inference_policy.json`` and reload the full config on change.

    Falls back to watching ``active.json`` when no policy file exists.
    Designed for the online inference loop: polls file mtimes at a
    configurable interval and returns a :class:`ResolvedInferenceConfig`
    when a reload is needed.

    Args:
        models_dir: Directory containing policy / active pointer files.
        check_interval_s: Minimum seconds between mtime checks.
    """

    models_dir: Path
    check_interval_s: float = 60.0
    _policy_path: Path = field(init=False)
    _active_path: Path = field(init=False)
    _last_mtime: float | None = field(init=False)
    _last_check: float = field(init=False)

    def __post_init__(self) -> None:
        self._policy_path = self.models_dir / DEFAULT_INFERENCE_POLICY_FILE
        self._active_path = self.models_dir / _ACTIVE_FILE
        self._last_mtime = self._current_mtime()
        self._last_check = time.monotonic()

    def _watched_path(self) -> Path:
        return self._policy_path if self._policy_path.is_file() else self._active_path

    def _current_mtime(self) -> float | None:
        for path in (self._policy_path, self._active_path):
            try:
                return path.stat().st_mtime
            except OSError:
                continue
        return None

    def check_reload(self) -> ResolvedInferenceConfig | None:
        """Check whether the policy/active file changed and reload if so.

        Returns a :class:`ResolvedInferenceConfig` when a reload
        succeeds, or ``None`` when no reload is needed or the reload
        fails.
        """
        now = time.monotonic()
        if now - self._last_check < self.check_interval_s:
            return None
        self._last_check = now

        mtime = self._current_mtime()
        if mtime == self._last_mtime:
            return None

        watched = self._watched_path()
        logger.info(
            "%s changed (mtime %s -> %s), reloading",
            watched.name,
            self._last_mtime,
            mtime,
        )

        try:
            config = resolve_inference_config(self.models_dir)
        except Exception:
            logger.warning(
                "Failed to reload after %s change; keeping current config",
                watched.name,
                exc_info=True,
            )
            return None

        self._last_mtime = mtime
        return config

check_reload()

Check whether the policy/active file changed and reload if so.

Returns a ResolvedInferenceConfig when a reload succeeds, or None when no reload is needed or the reload fails.

Source code in src/taskclf/infer/resolve.py
def check_reload(self) -> ResolvedInferenceConfig | None:
    """Check whether the policy/active file changed and reload if so.

    Returns a :class:`ResolvedInferenceConfig` when a reload
    succeeds, or ``None`` when no reload is needed or the reload
    fails.
    """
    now = time.monotonic()
    if now - self._last_check < self.check_interval_s:
        return None
    self._last_check = now

    mtime = self._current_mtime()
    if mtime == self._last_mtime:
        return None

    watched = self._watched_path()
    logger.info(
        "%s changed (mtime %s -> %s), reloading",
        watched.name,
        self._last_mtime,
        mtime,
    )

    try:
        config = resolve_inference_config(self.models_dir)
    except Exception:
        logger.warning(
            "Failed to reload after %s change; keeping current config",
            watched.name,
            exc_info=True,
        )
        return None

    self._last_mtime = mtime
    return config

resolve_model_dir(model_dir, models_dir, policy=None)

Resolve the model directory for inference.

Deprecated: use resolve_inference_config instead. This function only resolves the model bundle path; it does not load the calibrator store or reject threshold from the inference policy.

Resolution precedence:

  1. If model_dir is provided, validate that it exists and return it.
  2. Otherwise, delegate to taskclf.model_registry.resolve_active_model, which reads active.json or falls back to best-model selection.
  3. If no eligible model is found, raise ModelResolutionError with a descriptive message including exclusion reasons.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_dir | str or None | Explicit --model-dir value from CLI, or None. | required |
| models_dir | Path | Base directory containing promoted model bundles. | required |
| policy | SelectionPolicy or None | Selection policy override (defaults to policy v1). | None |

Returns:

| Type | Description |
| --- | --- |
| Path | Path to the resolved model bundle directory. |

Raises:

| Type | Description |
| --- | --- |
| ModelResolutionError | When no model can be resolved. |

Source code in src/taskclf/infer/resolve.py
def resolve_model_dir(
    model_dir: str | None,
    models_dir: Path,
    policy: SelectionPolicy | None = None,
) -> Path:
    """Resolve the model directory for inference.

    .. deprecated::
        Use :func:`resolve_inference_config` instead.  This function
        only resolves the model bundle path; it does not load the
        calibrator store or reject threshold from the inference policy.

    Resolution precedence:

    1. If *model_dir* is provided, validate that it exists and return it.
    2. Otherwise, delegate to :func:`~taskclf.model_registry.resolve_active_model`
       which reads ``active.json`` or falls back to best-model selection.
    3. If no eligible model is found, raise :class:`ModelResolutionError`
       with a descriptive message including exclusion reasons.

    Args:
        model_dir: Explicit ``--model-dir`` value from CLI, or ``None``.
        models_dir: Base directory containing promoted model bundles.
        policy: Selection policy override (defaults to policy v1).

    Returns:
        Path to the resolved model bundle directory.

    Raises:
        ModelResolutionError: When no model can be resolved.
    """
    if model_dir is not None:
        path = Path(model_dir)
        if not path.is_dir():
            raise ModelResolutionError(
                f"Explicit --model-dir does not exist: {model_dir}"
            )
        return path

    if not models_dir.is_dir():
        raise ModelResolutionError(
            f"Models directory does not exist: {models_dir}. "
            "Provide --model-dir explicitly or train a model first."
        )

    bundle, report = resolve_active_model(models_dir, policy)

    if bundle is not None:
        logger.info("Resolved model: %s", bundle.path)
        return bundle.path

    lines = [
        f"No eligible model found in {models_dir}.",
        "Provide --model-dir explicitly or train a compatible model.",
    ]
    if report is not None and report.excluded:
        lines.append("Excluded bundles:")
        for rec in report.excluded:
            lines.append(f"  - {rec.model_id}: {rec.reason}")
    raise ModelResolutionError("\n".join(lines), report=report)

resolve_inference_config(models_dir, *, model_dir_override=None, reject_threshold_override=None, calibrator_store_override=None, calibrator_path_override=None)

Resolve the full inference configuration from policy or fallback.

Resolution precedence:

  1. Explicit model_dir_override — bypasses policy; uses override flags for threshold and calibrator.
  2. models/inference_policy.json — loads model, calibrator store, and threshold from the policy. Explicit overrides still take precedence for individual fields.
  3. models/active.json + code defaults — deprecated legacy fallback.
  4. Best-model selection + code defaults — no-config fallback.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| models_dir | Path | The models/ directory. | required |
| model_dir_override | str or None | Explicit --model-dir value (takes highest precedence). | None |
| reject_threshold_override | float or None | Explicit threshold that overrides the policy value. | None |
| calibrator_store_override | Path or None | Explicit calibrator store path that overrides the policy value. | None |
| calibrator_path_override | Path or None | Explicit single-calibrator JSON path (lowest calibrator precedence). | None |

Returns:

| Type | Description |
| --- | --- |
| ResolvedInferenceConfig | A fully resolved ResolvedInferenceConfig. |

Raises:

| Type | Description |
| --- | --- |
| ModelResolutionError | When no model can be resolved. |

Source code in src/taskclf/infer/resolve.py
def resolve_inference_config(
    models_dir: Path,
    *,
    model_dir_override: str | None = None,
    reject_threshold_override: float | None = None,
    calibrator_store_override: Path | None = None,
    calibrator_path_override: Path | None = None,
) -> ResolvedInferenceConfig:
    """Resolve the full inference configuration from policy or fallback.

    Resolution precedence:

    1. Explicit *model_dir_override* — bypasses policy; uses override
       flags for threshold and calibrator.
    2. ``models/inference_policy.json`` — loads model, calibrator store,
       and threshold from the policy.  Explicit overrides still take
       precedence for individual fields.
    3. ``models/active.json`` + code defaults — deprecated legacy
       fallback.
    4. Best-model selection + code defaults — no-config fallback.

    Args:
        models_dir: The ``models/`` directory.
        model_dir_override: Explicit ``--model-dir`` value (takes
            highest precedence).
        reject_threshold_override: Explicit threshold that overrides
            the policy value.
        calibrator_store_override: Explicit calibrator store path that
            overrides the policy value.
        calibrator_path_override: Explicit single-calibrator JSON path
            (lowest calibrator precedence).

    Returns:
        A fully resolved :class:`ResolvedInferenceConfig`.

    Raises:
        ModelResolutionError: When no model can be resolved.
    """
    policy: InferencePolicy | None = None
    model_path: Path | None = None
    reject_threshold: float = DEFAULT_REJECT_THRESHOLD
    calibrator: Calibrator = IdentityCalibrator()
    cal_store: CalibratorStore | None = None

    if model_dir_override is not None:
        model_path = Path(model_dir_override)
        if not model_path.is_dir():
            raise ModelResolutionError(
                f"Explicit --model-dir does not exist: {model_dir_override}"
            )
    else:
        policy = load_inference_policy(models_dir)
        if policy is not None:
            base = models_dir.parent
            model_path = base / policy.model_dir
            if not model_path.is_dir():
                logger.warning(
                    "Policy model_dir %s does not exist; falling back",
                    policy.model_dir,
                )
                policy = None
                model_path = None

        if model_path is None:
            if policy is not None:
                logger.warning("Policy references missing model; falling back")
                policy = None
            logger.warning(
                "No inference policy found; falling back to active.json "
                "resolution.  Create a policy with 'taskclf policy create' "
                "or 'taskclf train tune-reject --write-policy'."
            )
            model_path = resolve_model_dir(None, models_dir)

    per_user_thresholds: dict[str, float] | None = None
    if policy is not None:
        reject_threshold = policy.reject_threshold
        per_user_thresholds = policy.per_user_reject_thresholds

    # Load calibrator store
    if calibrator_store_override is not None:
        cal_store = load_calibrator_store(calibrator_store_override)
    elif policy is not None and policy.calibrator_store_dir is not None:
        store_path = models_dir.parent / policy.calibrator_store_dir
        if store_path.is_dir():
            cal_store = load_calibrator_store(store_path)
        else:
            logger.warning(
                "Policy calibrator_store_dir %s does not exist; using identity",
                policy.calibrator_store_dir,
            )

    if calibrator_path_override is not None and cal_store is None:
        calibrator = load_calibrator(calibrator_path_override)

    if reject_threshold_override is not None:
        reject_threshold = reject_threshold_override

    model, metadata, cat_encoders = load_model_bundle(model_path)
    logger.info(
        "Resolved inference config: model=%s schema=%s threshold=%.4f "
        "calibrator_store=%s policy=%s",
        model_path.name,
        metadata.schema_hash,
        reject_threshold,
        "yes" if cal_store is not None else "no",
        "yes" if policy is not None else "legacy",
    )

    return ResolvedInferenceConfig(
        model=model,
        metadata=metadata,
        cat_encoders=cat_encoders,
        reject_threshold=reject_threshold,
        calibrator=calibrator,
        calibrator_store=cal_store,
        policy=policy,
        per_user_reject_thresholds=per_user_thresholds,
    )