Skip to content

model_registry

Model registry: scan, validate, rank, filter, and activate model bundles.

Provides a pure, testable API for discovering promoted model bundles, checking compatibility with the current schema and label set, ranking candidates by the selection policy, and managing the active model pointer.

BundleMetrics Fields

Field Type Description
macro_f1 float Macro-averaged F1 across all classes
weighted_f1 float Support-weighted F1 across all classes
confusion_matrix list[list[int]] NxN confusion matrix
label_names list[str] Label ordering for confusion matrix rows/cols

ModelBundle Fields

Field Type Description
model_id str Bundle directory name
path Path Absolute path to the bundle directory
valid bool False if parsing or validation failed
invalid_reason str or None Why the bundle is invalid (if applicable)
metadata ModelMetadata or None Parsed metadata.json
metrics BundleMetrics or None Parsed metrics.json
created_at datetime or None Parsed metadata.created_at timestamp

SelectionPolicy Fields

Field Type Description
version int Policy version (default 1)
min_improvement float Hysteresis threshold: candidate must exceed current active macro_f1 by this amount to trigger a switch (default 0.0 — disabled)

ExclusionRecord Fields

Field Type Description
model_id str Bundle directory name
path Path Path to the excluded bundle directory
reason str Human-readable exclusion reason (e.g. "invalid: missing metrics.json", "incompatible: schema_hash mismatch")

SelectionReport Fields

Field Type Description
best ModelBundle or None Highest-ranked eligible bundle, or None if no bundle qualifies
ranked list[ModelBundle] Eligible bundles in score-descending order (best first)
excluded list[ExclusionRecord] Every bundle that was filtered out, with reason
policy SelectionPolicy Policy used for this selection
required_schema_hash str Schema hash that bundles were required to match

ActivePointer Fields

Field Type Description
model_dir str Relative path to the bundle directory (e.g. "models/best_bundle")
selected_at str ISO8601 UTC timestamp of when the pointer was written
policy_version int Selection policy version used to choose this bundle
model_id str or None Optional stable model identifier
reason dict or None Optional structured reason including ranking metrics

ActiveHistoryEntry Fields

Field Type Description
at str ISO8601 timestamp of the transition
old ActivePointer or None Previous pointer (None on first activation)
new ActivePointer New pointer

IndexCacheBundleSummary Fields

Field Type Description
model_id str Bundle directory name
path str Path to the bundle directory
macro_f1 float or None Macro-averaged F1
weighted_f1 float or None Support-weighted F1
created_at str or None ISO8601 creation timestamp
eligible bool Whether the bundle was eligible for selection

IndexCache Fields

Field Type Description
generated_at str ISO8601 timestamp when the cache was written
schema_hash str Runtime schema hash used during the scan
policy_version int Selection policy version
ranked list[IndexCacheBundleSummary] Eligible bundles in score-descending order
excluded list[ExclusionRecord] Every bundle that was filtered out, with reason
best_model_id str or None Model ID of the highest-ranked bundle

Functions

  • list_bundles(models_dir) — scan a directory for bundle subdirectories; returns valid and invalid bundles sorted by model_id.
  • is_compatible(bundle, required_schema_hash, required_label_set) — check schema hash + label set match.
  • passes_constraints(bundle, policy) — hard constraint gate (policy v1: valid bundle with metrics).
  • score(bundle, policy) — sortable ranking tuple (macro_f1, weighted_f1, created_at).
  • find_best_model(models_dir, policy, required_schema_hash, required_label_set) — scan, filter, rank, and select the best bundle; returns a SelectionReport.
  • read_active(models_dir) — read active.json pointer; returns ActivePointer or None if missing/invalid.
  • write_active_atomic(models_dir, bundle, policy, reason) — atomically write active.json and append to active_history.jsonl; returns the new ActivePointer.
  • append_active_history(models_dir, old, new) — append a transition record to active_history.jsonl.
  • resolve_active_model(models_dir, policy, required_schema_hash, required_label_set) — resolve the active bundle: uses the pointer if valid, falls back to find_best_model and self-heals the pointer; returns (ModelBundle | None, SelectionReport | None).
  • write_index_cache(models_dir, report) — atomically write models/index.json from a SelectionReport; returns the IndexCache.
  • read_index_cache(models_dir) — read cached index.json; returns IndexCache or None if missing/invalid.
  • should_switch_active(current, candidate, policy) — hysteresis check: returns True if the candidate's macro_f1 exceeds the current active model's macro_f1 by at least policy.min_improvement.

taskclf.model_registry

Model registry: scan, validate, rank, filter, and activate model bundles.

Provides a pure, testable API for discovering promoted model bundles under models/, checking compatibility with the current schema and label set, ranking candidates by the selection policy, and managing the active model pointer.

Public surface:

  • :class:BundleMetrics — parsed metrics.json
  • :class:ModelBundle — one scanned bundle (valid or invalid)
  • :class:SelectionPolicy — ranking / constraint configuration
  • :class:ExclusionRecord — why a bundle was excluded from selection
  • :class:SelectionReport — full result of :func:find_best_model
  • :class:ActivePointer — persisted active.json pointer
  • :class:ActiveHistoryEntry — one line in active_history.jsonl
  • :class:IndexCacheBundleSummary — one bundle row in index.json
  • :class:IndexCache — cached scan/ranking snapshot
  • :func:list_bundles — scan a directory for bundles
  • :func:is_compatible — schema hash + label set gate
  • :func:passes_constraints — hard constraint gate (policy v1: no-op)
  • :func:score — sortable ranking tuple
  • :func:find_best_model — scan, filter, rank, and select the best bundle
  • :func:read_active — read active.json pointer
  • :func:write_active_atomic — atomically update active.json
  • :func:append_active_history — append to active_history.jsonl
  • :func:resolve_active_model — resolve active bundle with fallback
  • :func:write_index_cache — write index.json from a selection report
  • :func:read_index_cache — read cached index.json
  • :func:should_switch_active — hysteresis check before switching active

BundleMetrics

Bases: BaseModel

Metrics stored in a bundle's metrics.json.

See docs/guide/metrics_contract.md for the stable contract.

Source code in src/taskclf/model_registry.py
class BundleMetrics(BaseModel, frozen=True):
    """Metrics stored in a bundle's ``metrics.json``.

    See ``docs/guide/metrics_contract.md`` for the stable contract.
    """

    macro_f1: float
    weighted_f1: float
    confusion_matrix: list[list[int]]
    label_names: list[str]

ModelBundle

Bases: BaseModel

A scanned model bundle directory.

Both valid and invalid bundles are represented; check :attr:valid before using :attr:metadata or :attr:metrics.

Source code in src/taskclf/model_registry.py
class ModelBundle(BaseModel, frozen=True):
    """A scanned model bundle directory.

    Both valid and invalid bundles are represented; check :attr:`valid`
    before using :attr:`metadata` or :attr:`metrics`.
    """

    model_id: str
    path: Path
    valid: bool
    invalid_reason: str | None = None
    metadata: ModelMetadata | None = None
    metrics: BundleMetrics | None = None
    created_at: datetime | None = None

SelectionPolicy

Bases: BaseModel

Selection policy configuration.

Policy v1 ranks by macro_f1 desc, weighted_f1 desc, created_at desc and applies no additional hard constraints (acceptance gates are enforced at promotion time by retrain).

min_improvement controls hysteresis: the candidate must exceed the current active model's macro_f1 by at least this amount before the active pointer is switched. Set to 0.0 (default) to disable hysteresis.

Source code in src/taskclf/model_registry.py
class SelectionPolicy(BaseModel, frozen=True):
    """Selection policy configuration.

    Policy v1 ranks by ``macro_f1`` desc, ``weighted_f1`` desc,
    ``created_at`` desc and applies no additional hard constraints
    (acceptance gates are enforced at promotion time by retrain).

    ``min_improvement`` controls hysteresis: the candidate must exceed
    the current active model's ``macro_f1`` by at least this amount
    before the active pointer is switched.  Set to ``0.0`` (default)
    to disable hysteresis.
    """

    version: int = 1
    min_improvement: float = 0.0

ExclusionRecord

Bases: BaseModel

Why a single bundle was excluded during :func:find_best_model.

Source code in src/taskclf/model_registry.py
class ExclusionRecord(BaseModel, frozen=True):
    """Why a single bundle was excluded during :func:`find_best_model`."""

    model_id: str
    path: Path
    reason: str

SelectionReport

Bases: BaseModel

Full result of :func:find_best_model.

ranked contains eligible bundles in score-descending order. best is ranked[0] when the list is non-empty, else None. excluded lists every bundle that was filtered out, with a human-readable reason.

Source code in src/taskclf/model_registry.py
class SelectionReport(BaseModel, frozen=True):
    """Full result of :func:`find_best_model`.

    *ranked* contains eligible bundles in score-descending order.
    *best* is ``ranked[0]`` when the list is non-empty, else ``None``.
    *excluded* lists every bundle that was filtered out, with a
    human-readable reason.
    """

    best: ModelBundle | None
    ranked: list[ModelBundle]
    excluded: list[ExclusionRecord]
    policy: SelectionPolicy
    required_schema_hash: str

ActivePointer

Bases: BaseModel

Persisted pointer to the currently active model bundle.

Stored as models/active.json. See docs/guide/model_selection.md for the schema contract.

Source code in src/taskclf/model_registry.py
class ActivePointer(BaseModel, frozen=True):
    """Persisted pointer to the currently active model bundle.

    Stored as ``models/active.json``.  See
    ``docs/guide/model_selection.md`` for the schema contract.
    """

    model_dir: str
    selected_at: str
    policy_version: int
    model_id: str | None = None
    reason: dict[str, object] | None = None

ActiveHistoryEntry

Bases: BaseModel

One line in models/active_history.jsonl.

Records every change to active.json for auditability and rollback.

Source code in src/taskclf/model_registry.py
class ActiveHistoryEntry(BaseModel, frozen=True):
    """One line in ``models/active_history.jsonl``.

    Records every change to ``active.json`` for auditability and
    rollback.
    """

    at: str
    old: ActivePointer | None
    new: ActivePointer

IndexCacheBundleSummary

Bases: BaseModel

Summary of one bundle stored inside :class:IndexCache.

Source code in src/taskclf/model_registry.py
class IndexCacheBundleSummary(BaseModel, frozen=True):
    """Summary of one bundle stored inside :class:`IndexCache`."""

    model_id: str
    path: str
    macro_f1: float | None = None
    weighted_f1: float | None = None
    created_at: str | None = None
    eligible: bool = False

IndexCache

Bases: BaseModel

Cached scan/ranking snapshot written to models/index.json.

This is an informational cache — selection never reads it. Operators and taskclf train list may consume it for fast inspection without a full rescan.

Source code in src/taskclf/model_registry.py
class IndexCache(BaseModel, frozen=True):
    """Cached scan/ranking snapshot written to ``models/index.json``.

    This is an informational cache — selection never reads it.
    Operators and ``taskclf train list`` may consume it for fast
    inspection without a full rescan.
    """

    generated_at: str
    schema_hash: str
    policy_version: int
    ranked: list[IndexCacheBundleSummary]
    excluded: list[ExclusionRecord]
    best_model_id: str | None = None

list_bundles(models_dir)

Scan models_dir for model bundle subdirectories.

Each subdirectory is parsed independently; failures are captured as invalid bundles rather than aborting the scan.

Parameters:

Name Type Description Default
models_dir Path

Parent directory containing bundle subdirectories (e.g. Path("models")).

required

Returns:

Type Description
list[ModelBundle]

A list of :class:ModelBundle instances sorted by model_id

list[ModelBundle]

for deterministic ordering.

Source code in src/taskclf/model_registry.py
def list_bundles(models_dir: Path) -> list[ModelBundle]:
    """Scan *models_dir* for model bundle subdirectories.

    Each subdirectory is parsed independently; failures are captured as
    invalid bundles rather than aborting the scan.

    Args:
        models_dir: Parent directory containing bundle subdirectories
            (e.g. ``Path("models")``).

    Returns:
        A list of :class:`ModelBundle` instances sorted by ``model_id``
        for deterministic ordering.
    """
    if not models_dir.is_dir():
        return []

    bundles: list[ModelBundle] = []
    for candidate in models_dir.iterdir():
        if not candidate.is_dir():
            continue
        bundle = _parse_bundle(candidate)
        bundles.append(bundle)

    bundles.sort(key=lambda b: b.model_id)
    return bundles

is_compatible(bundle, required_schema_hash=get_feature_schema('v1').SCHEMA_HASH, required_label_set=LABEL_SET_V1)

Check whether bundle is compatible with the current runtime.

A bundle is compatible when both hold:

  1. metadata.schema_hash exactly matches required_schema_hash.
  2. sorted(metadata.label_set) exactly matches sorted(required_label_set).

Parameters:

Name Type Description Default
bundle ModelBundle

A scanned model bundle.

required
required_schema_hash str

Expected schema hash (defaults to FeatureSchemaV1.SCHEMA_HASH).

SCHEMA_HASH
required_label_set frozenset[str]

Expected label vocabulary (defaults to LABEL_SET_V1).

LABEL_SET_V1

Returns:

Type Description
bool

True if the bundle is valid and compatible.

Source code in src/taskclf/model_registry.py
def is_compatible(
    bundle: ModelBundle,
    required_schema_hash: str = get_feature_schema("v1").SCHEMA_HASH,
    required_label_set: frozenset[str] = LABEL_SET_V1,
) -> bool:
    """Check whether *bundle* is compatible with the current runtime.

    A bundle is compatible when both hold:

    1. ``metadata.schema_hash`` exactly matches *required_schema_hash*.
    2. ``sorted(metadata.label_set)`` exactly matches ``sorted(required_label_set)``.

    Args:
        bundle: A scanned model bundle.
        required_schema_hash: Expected schema hash (defaults to
            ``FeatureSchemaV1.SCHEMA_HASH``).
        required_label_set: Expected label vocabulary (defaults to
            ``LABEL_SET_V1``).

    Returns:
        ``True`` if the bundle is valid and compatible.
    """
    if not bundle.valid or bundle.metadata is None:
        return False
    if bundle.metadata.schema_hash != required_schema_hash:
        return False
    if sorted(bundle.metadata.label_set) != sorted(required_label_set):
        return False
    return True

passes_constraints(bundle, policy)

Check whether bundle passes the hard constraints of policy.

Policy v1 applies no additional constraints beyond validity: all acceptance gates are enforced at promotion time by retrain, so any promoted bundle that parsed successfully is eligible for ranking.

Parameters:

Name Type Description Default
bundle ModelBundle

A scanned model bundle (must be valid).

required
policy SelectionPolicy

Selection policy configuration.

required

Returns:

Type Description
bool

True if the bundle is valid with non-null metrics.

Source code in src/taskclf/model_registry.py
def passes_constraints(
    bundle: ModelBundle,
    policy: SelectionPolicy,  # noqa: ARG001 — reserved for future policy versions
) -> bool:
    """Check whether *bundle* passes the hard constraints of *policy*.

    Policy v1 applies no additional constraints beyond validity: all
    acceptance gates are enforced at promotion time by retrain, so any
    promoted bundle that parsed successfully is eligible for ranking.

    Args:
        bundle: A scanned model bundle (must be valid).
        policy: Selection policy configuration.

    Returns:
        ``True`` if the bundle is valid with non-null metrics.
    """
    return bundle.valid and bundle.metrics is not None

score(bundle, policy)

Compute a sortable ranking key for bundle.

The tuple sorts descending on all three components:

  1. macro_f1 (higher is better)
  2. weighted_f1 (tie-break; higher is better)
  3. created_at (tie-break; newer is better — ISO8601 strings with the same UTC offset sort lexicographically)

Parameters:

Name Type Description Default
bundle ModelBundle

A valid model bundle with non-null metrics and metadata.

required
policy SelectionPolicy

Selection policy configuration.

required

Returns:

Type Description
float

A 3-tuple (macro_f1, weighted_f1, created_at_raw) suitable

float

for sorted(..., reverse=True).

Raises:

Type Description
ValueError

If the bundle is invalid or missing metrics/metadata.

Source code in src/taskclf/model_registry.py
def score(
    bundle: ModelBundle,
    policy: SelectionPolicy,  # noqa: ARG001 — reserved for future policy versions
) -> tuple[float, float, str]:
    """Compute a sortable ranking key for *bundle*.

    The tuple sorts **descending** on all three components:

    1. ``macro_f1`` (higher is better)
    2. ``weighted_f1`` (tie-break; higher is better)
    3. ``created_at`` (tie-break; newer is better — ISO8601 strings
       with the same UTC offset sort lexicographically)

    Args:
        bundle: A valid model bundle with non-null metrics and metadata.
        policy: Selection policy configuration.

    Returns:
        A 3-tuple ``(macro_f1, weighted_f1, created_at_raw)`` suitable
        for ``sorted(..., reverse=True)``.

    Raises:
        ValueError: If the bundle is invalid or missing metrics/metadata.
    """
    if not bundle.valid or bundle.metrics is None or bundle.metadata is None:
        raise ValueError(f"Cannot score invalid bundle {bundle.model_id!r}")
    return (
        bundle.metrics.macro_f1,
        bundle.metrics.weighted_f1,
        bundle.metadata.created_at,
    )

find_best_model(models_dir, policy=None, required_schema_hash=None, required_label_set=None)

Scan, filter, rank, and select the best model bundle.

This is the main entry-point for non-mutating model selection. It composes :func:list_bundles, :func:is_compatible, :func:passes_constraints, and :func:score into a single call that returns a structured :class:SelectionReport.

Parameters:

Name Type Description Default
models_dir Path

Directory containing promoted model bundle subdirectories (e.g. Path("models")).

required
policy SelectionPolicy | None

Selection policy configuration. Defaults to SelectionPolicy() (policy v1).

None
required_schema_hash str | None

Schema hash that bundles must match. Defaults to FeatureSchemaV1.SCHEMA_HASH.

None
required_label_set frozenset[str] | None

Label vocabulary that bundles must match. Defaults to LABEL_SET_V1.

None

Returns:

Name Type Description
A SelectionReport

class:SelectionReport with the best bundle (if any),

SelectionReport

the full ranked list of eligible bundles, and exclusion

SelectionReport

records for every bundle that was filtered out.

Source code in src/taskclf/model_registry.py
def find_best_model(
    models_dir: Path,
    policy: SelectionPolicy | None = None,
    required_schema_hash: str | None = None,
    required_label_set: frozenset[str] | None = None,
) -> SelectionReport:
    """Scan, filter, rank, and select the best model bundle.

    This is the main entry-point for non-mutating model selection.
    It composes :func:`list_bundles`, :func:`is_compatible`,
    :func:`passes_constraints`, and :func:`score` into a single call
    that returns a structured :class:`SelectionReport`.

    Args:
        models_dir: Directory containing promoted model bundle
            subdirectories (e.g. ``Path("models")``).
        policy: Selection policy configuration.  Defaults to
            ``SelectionPolicy()`` (policy v1).
        required_schema_hash: Schema hash that bundles must match.
            Defaults to ``FeatureSchemaV1.SCHEMA_HASH``.
        required_label_set: Label vocabulary that bundles must match.
            Defaults to ``LABEL_SET_V1``.

    Returns:
        A :class:`SelectionReport` with the best bundle (if any),
        the full ranked list of eligible bundles, and exclusion
        records for every bundle that was filtered out.
    """
    explicit_schema_hash = required_schema_hash is not None

    if policy is None:
        policy = SelectionPolicy()
    if required_label_set is None:
        required_label_set = LABEL_SET_V1
    if required_schema_hash is not None:
        schema_hashes: tuple[str, ...] = (required_schema_hash,)
    else:
        schema_hashes = tuple(
            get_feature_schema(version).SCHEMA_HASH
            for version in iter_feature_schema_versions(LATEST_FEATURE_SCHEMA_VERSION)
        )

    last_report: SelectionReport | None = None
    for candidate_schema_hash in schema_hashes:
        bundles = list_bundles(models_dir)

        excluded: list[ExclusionRecord] = []
        eligible: list[ModelBundle] = []

        for bundle in bundles:
            if not bundle.valid:
                excluded.append(
                    ExclusionRecord(
                        model_id=bundle.model_id,
                        path=bundle.path,
                        reason=f"invalid: {bundle.invalid_reason}",
                    )
                )
                continue

            if not is_compatible(bundle, candidate_schema_hash, required_label_set):
                assert bundle.metadata is not None
                if bundle.metadata.schema_hash != candidate_schema_hash:
                    detail = "schema_hash mismatch"
                else:
                    detail = "label_set mismatch"
                excluded.append(
                    ExclusionRecord(
                        model_id=bundle.model_id,
                        path=bundle.path,
                        reason=f"incompatible: {detail}",
                    )
                )
                continue

            if not passes_constraints(bundle, policy):
                excluded.append(
                    ExclusionRecord(
                        model_id=bundle.model_id,
                        path=bundle.path,
                        reason="constraint: failed policy constraints",
                    )
                )
                continue

            eligible.append(bundle)

        ranked = sorted(
            eligible,
            key=lambda b: score(b, policy),
            reverse=True,
        )

        last_report = SelectionReport(
            best=ranked[0] if ranked else None,
            ranked=ranked,
            excluded=excluded,
            policy=policy,
            required_schema_hash=candidate_schema_hash,
        )
        if ranked or explicit_schema_hash:
            return last_report

    assert last_report is not None
    return last_report

read_active(models_dir)

Read the active model pointer from models_dir/active.json.

Returns None (without raising) when the file is missing, contains invalid JSON, or fails :class:ActivePointer validation. A warning is logged on parse/validation failures so operators can notice stale pointer files.

Source code in src/taskclf/model_registry.py
def read_active(models_dir: Path) -> ActivePointer | None:
    """Read the active model pointer from ``models_dir/active.json``.

    Returns ``None`` (without raising) when the file is missing,
    contains invalid JSON, or fails :class:`ActivePointer` validation.
    A warning is logged on parse/validation failures so operators can
    notice stale pointer files.
    """
    active_path = models_dir / _ACTIVE_FILE
    if not active_path.is_file():
        return None

    try:
        raw = json.loads(active_path.read_text())
    except (json.JSONDecodeError, OSError) as exc:
        logger.warning("active.json parse error: %s", exc)
        return None

    try:
        return ActivePointer.model_validate(raw)
    except Exception as exc:
        logger.warning("active.json validation error: %s", exc)
        return None

append_active_history(models_dir, old, new)

Append a transition record to models_dir/active_history.jsonl.

Creates the file if it does not exist. Each line is a self-contained JSON object matching :class:ActiveHistoryEntry.

Source code in src/taskclf/model_registry.py
def append_active_history(
    models_dir: Path,
    old: ActivePointer | None,
    new: ActivePointer,
) -> None:
    """Append a transition record to ``models_dir/active_history.jsonl``.

    Creates the file if it does not exist.  Each line is a
    self-contained JSON object matching :class:`ActiveHistoryEntry`.
    """
    entry = ActiveHistoryEntry(at=new.selected_at, old=old, new=new)
    history_path = models_dir / _HISTORY_FILE
    with history_path.open("a") as fh:
        fh.write(entry.model_dump_json() + "\n")

write_active_atomic(models_dir, bundle, policy, reason=None)

Atomically write models_dir/active.json for bundle.

The pointer is written to a temporary file first, then moved into place with :func:os.replace to guarantee readers never see a partial write. The previous pointer (if any) is read before the overwrite and both old and new are appended to the audit log via :func:append_active_history.

Parameters:

Name Type Description Default
models_dir Path

The models/ directory.

required
bundle ModelBundle

The bundle to activate (must be valid with metrics).

required
policy SelectionPolicy

The selection policy used to choose this bundle.

required
reason str | None

Optional human-readable reason string.

None

Returns:

Type Description
ActivePointer

The newly written :class:ActivePointer.

Source code in src/taskclf/model_registry.py
def write_active_atomic(
    models_dir: Path,
    bundle: ModelBundle,
    policy: SelectionPolicy,
    reason: str | None = None,
) -> ActivePointer:
    """Atomically write ``models_dir/active.json`` for *bundle*.

    The pointer is written to a temporary file first, then moved into
    place with :func:`os.replace` to guarantee readers never see a
    partial write.  The previous pointer (if any) is read before the
    overwrite and both old and new are appended to the audit log via
    :func:`append_active_history`.

    Args:
        models_dir: The ``models/`` directory.
        bundle: The bundle to activate (must be valid with metrics).
        policy: The selection policy used to choose this bundle.
        reason: Optional human-readable reason string.

    Returns:
        The newly written :class:`ActivePointer`.
    """
    old = read_active(models_dir)

    reason_dict: dict[str, object] | None = None
    if bundle.metrics is not None:
        reason_dict = {
            "metric": "macro_f1",
            "macro_f1": bundle.metrics.macro_f1,
            "weighted_f1": bundle.metrics.weighted_f1,
        }
        if reason is not None:
            reason_dict["note"] = reason

    model_dir = str(bundle.path.relative_to(models_dir.parent))

    pointer = ActivePointer(
        model_dir=model_dir,
        model_id=bundle.model_id,
        selected_at=datetime.now(UTC).isoformat(),
        policy_version=policy.version,
        reason=reason_dict,
    )

    tmp_path = models_dir / _ACTIVE_TMP
    final_path = models_dir / _ACTIVE_FILE

    tmp_path.write_text(pointer.model_dump_json(indent=2) + "\n")
    os.replace(tmp_path, final_path)

    append_active_history(models_dir, old, pointer)

    return pointer

resolve_active_model(models_dir, policy=None, required_schema_hash=None, required_label_set=None)

Resolve the active model bundle, falling back to selection.

Resolution order:

  1. Read active.json. If valid and the pointed-to bundle exists, is parseable, and is compatible — return it immediately (no full scan).
  2. Otherwise fall back to :func:find_best_model. If a best bundle is found, atomically update active.json to self-heal the pointer.

Returns:

Type Description
ModelBundle | None

A 2-tuple (bundle, report). report is None when the

SelectionReport | None

pointer was valid and no scan was needed.

Source code in src/taskclf/model_registry.py
def resolve_active_model(
    models_dir: Path,
    policy: SelectionPolicy | None = None,
    required_schema_hash: str | None = None,
    required_label_set: frozenset[str] | None = None,
) -> tuple[ModelBundle | None, SelectionReport | None]:
    """Resolve the active model bundle, falling back to selection.

    Resolution order:

    1. Read ``active.json``.  If valid and the pointed-to bundle
       exists, is parseable, and is compatible — return it immediately
       (no full scan).
    2. Otherwise fall back to :func:`find_best_model`.  If a best
       bundle is found, atomically update ``active.json`` to self-heal
       the pointer.

    Returns:
        A 2-tuple ``(bundle, report)``.  *report* is ``None`` when the
        pointer was valid and no scan was needed.
    """
    if policy is None:
        policy = SelectionPolicy()
    if required_label_set is None:
        required_label_set = LABEL_SET_V1

    pointer = read_active(models_dir)
    if pointer is not None:
        bundle_path = models_dir.parent / pointer.model_dir
        if bundle_path.is_dir():
            bundle = _parse_bundle(bundle_path)
            metadata = bundle.metadata
            if bundle.valid and metadata is not None:
                if required_schema_hash is None:
                    if _matches_declared_schema(bundle) and sorted(
                        metadata.label_set
                    ) == sorted(required_label_set):
                        return bundle, None
                elif is_compatible(bundle, required_schema_hash, required_label_set):
                    return bundle, None
            logger.warning(
                "active.json points to invalid/incompatible bundle %s; "
                "falling back to selection",
                pointer.model_dir,
            )
        else:
            logger.warning(
                "active.json points to missing directory %s; falling back to selection",
                pointer.model_dir,
            )

    report = find_best_model(
        models_dir,
        policy,
        required_schema_hash,
        required_label_set,
    )

    if report.best is not None:
        write_active_atomic(models_dir, report.best, policy, reason="auto-repair")

    return report.best, report

write_index_cache(models_dir, report)

Write models_dir/index.json from a :class:SelectionReport.

The cache is written atomically (temp + :func:os.replace). It is informational only — :func:find_best_model never reads it.

Parameters:

Name Type Description Default
models_dir Path

The models/ directory.

required
report SelectionReport

A completed selection report.

required

Returns:

Name Type Description
The IndexCache

class:IndexCache that was persisted.

Source code in src/taskclf/model_registry.py
def write_index_cache(
    models_dir: Path,
    report: SelectionReport,
) -> IndexCache:
    """Write ``models_dir/index.json`` from a :class:`SelectionReport`.

    The cache is written atomically (temp + :func:`os.replace`).  It is
    informational only — :func:`find_best_model` never reads it.

    Args:
        models_dir: The ``models/`` directory.
        report: A completed selection report.

    Returns:
        The :class:`IndexCache` that was persisted.
    """
    ranked_summaries: list[IndexCacheBundleSummary] = []
    for b in report.ranked:
        ranked_summaries.append(
            IndexCacheBundleSummary(
                model_id=b.model_id,
                path=str(b.path),
                macro_f1=b.metrics.macro_f1 if b.metrics else None,
                weighted_f1=b.metrics.weighted_f1 if b.metrics else None,
                created_at=b.metadata.created_at if b.metadata else None,
                eligible=True,
            )
        )

    cache = IndexCache(
        generated_at=datetime.now(UTC).isoformat(),
        schema_hash=report.required_schema_hash,
        policy_version=report.policy.version,
        ranked=ranked_summaries,
        excluded=report.excluded,
        best_model_id=report.best.model_id if report.best else None,
    )

    tmp_path = models_dir / _INDEX_TMP
    final_path = models_dir / _INDEX_FILE

    tmp_path.write_text(cache.model_dump_json(indent=2) + "\n")
    os.replace(tmp_path, final_path)

    return cache

read_index_cache(models_dir)

Read the cached index from models_dir/index.json.

Returns None (without raising) when the file is missing, contains invalid JSON, or fails :class:IndexCache validation.

Source code in src/taskclf/model_registry.py
def read_index_cache(models_dir: Path) -> IndexCache | None:
    """Read the cached index from ``models_dir/index.json``.

    Returns ``None`` (without raising) when the file is missing,
    contains invalid JSON, or fails :class:`IndexCache` validation.
    """
    index_path = models_dir / _INDEX_FILE
    if not index_path.is_file():
        return None

    try:
        raw = json.loads(index_path.read_text())
    except (json.JSONDecodeError, OSError) as exc:
        logger.warning("index.json parse error: %s", exc)
        return None

    try:
        return IndexCache.model_validate(raw)
    except Exception as exc:
        logger.warning("index.json validation error: %s", exc)
        return None

should_switch_active(current, candidate, policy)

Decide whether candidate should replace current as active.

When policy.min_improvement is positive, the candidate's macro_f1 must exceed the current active model's macro_f1 by at least that amount. If current is None or has no recorded macro_f1, the switch is always allowed.

Parameters:

Name Type Description Default
current ActivePointer | None

The current active pointer (may be None).

required
candidate ModelBundle

The best-ranked bundle from selection.

required
policy SelectionPolicy

Selection policy with hysteresis threshold.

required

Returns:

Type Description
bool

True if the active pointer should be updated.

Source code in src/taskclf/model_registry.py
def should_switch_active(
    current: ActivePointer | None,
    candidate: ModelBundle,
    policy: SelectionPolicy,
) -> bool:
    """Decide whether *candidate* should replace *current* as active.

    When ``policy.min_improvement`` is positive, the candidate's
    ``macro_f1`` must exceed the current active model's ``macro_f1``
    by at least that amount.  If ``current`` is ``None`` or has no
    recorded ``macro_f1``, the switch is always allowed.

    Args:
        current: The current active pointer (may be ``None``).
        candidate: The best-ranked bundle from selection.
        policy: Selection policy with hysteresis threshold.

    Returns:
        ``True`` if the active pointer should be updated.
    """
    if current is None or policy.min_improvement <= 0.0:
        return True

    if candidate.metrics is None:
        return False

    current_f1: float | None = None
    if current.reason and "macro_f1" in current.reason:
        try:
            current_f1 = float(current.reason["macro_f1"])  # type: ignore[arg-type]
        except TypeError, ValueError:
            logger.debug(
                "Could not parse macro_f1 from active pointer reason", exc_info=True
            )

    if current_f1 is None:
        return True

    return candidate.metrics.macro_f1 >= current_f1 + policy.min_improvement