model_registry¶

Model registry: scan, validate, rank, filter, and activate model bundles.

Provides a pure, testable API for discovering promoted model bundles, checking compatibility with the current schema and label set, ranking candidates by the selection policy, and managing the active model pointer.

BundleMetrics Fields¶

Field	Type	Description
`macro_f1`	float	Macro-averaged F1 across all classes
`weighted_f1`	float	Support-weighted F1 across all classes
`confusion_matrix`	list[list[int]]	NxN confusion matrix
`label_names`	list[str]	Label ordering for confusion matrix rows/cols

ModelBundle Fields¶

Field	Type	Description
`model_id`	str	Bundle directory name
`path`	Path	Absolute path to the bundle directory
`valid`	bool	`False` if parsing or validation failed
`invalid_reason`	str or None	Why the bundle is invalid (if applicable)
`metadata`	ModelMetadata or None	Parsed `metadata.json`
`metrics`	BundleMetrics or None	Parsed `metrics.json`
`created_at`	datetime or None	Parsed `metadata.created_at` timestamp

SelectionPolicy Fields¶

Field	Type	Description
`version`	int	Policy version (default `1`)
`min_improvement`	float	Hysteresis threshold: candidate must exceed current active macro_f1 by this amount to trigger a switch (default `0.0` — disabled)

ExclusionRecord Fields¶

Field	Type	Description
`model_id`	str	Bundle directory name
`path`	Path	Path to the excluded bundle directory
`reason`	str	Human-readable exclusion reason (e.g. `"invalid: missing metrics.json"`, `"incompatible: schema_hash mismatch"`)

SelectionReport Fields¶

Field	Type	Description
`best`	ModelBundle or None	Highest-ranked eligible bundle, or `None` if no bundle qualifies
`ranked`	list[ModelBundle]	Eligible bundles in score-descending order (best first)
`excluded`	list[ExclusionRecord]	Every bundle that was filtered out, with reason
`policy`	SelectionPolicy	Policy used for this selection
`required_schema_hash`	str	Schema hash that bundles were required to match

ActivePointer Fields¶

Field	Type	Description
`model_dir`	str	Relative path to the bundle directory (e.g. `"models/best_bundle"`)
`selected_at`	str	ISO8601 UTC timestamp of when the pointer was written
`policy_version`	int	Selection policy version used to choose this bundle
`model_id`	str or None	Optional stable model identifier
`reason`	dict or None	Optional structured reason including ranking metrics

ActiveHistoryEntry Fields¶

Field	Type	Description
`at`	str	ISO8601 timestamp of the transition
`old`	ActivePointer or None	Previous pointer (`None` on first activation)
`new`	ActivePointer	New pointer

IndexCacheBundleSummary Fields¶

Field	Type	Description
`model_id`	str	Bundle directory name
`path`	str	Path to the bundle directory
`macro_f1`	float or None	Macro-averaged F1
`weighted_f1`	float or None	Support-weighted F1
`created_at`	str or None	ISO8601 creation timestamp
`eligible`	bool	Whether the bundle was eligible for selection

IndexCache Fields¶

Field	Type	Description
`generated_at`	str	ISO8601 timestamp when the cache was written
`schema_hash`	str	Runtime schema hash used during the scan
`policy_version`	int	Selection policy version
`ranked`	list[IndexCacheBundleSummary]	Eligible bundles in score-descending order
`excluded`	list[ExclusionRecord]	Every bundle that was filtered out, with reason
`best_model_id`	str or None	Model ID of the highest-ranked bundle

Functions¶

list_bundles(models_dir) — scan a directory for bundle subdirectories; returns valid and invalid bundles sorted by model_id.
is_compatible(bundle, required_schema_hash, required_label_set) — check schema hash + label set match.
passes_constraints(bundle, policy) — hard constraint gate (policy v1: valid bundle with metrics).
score(bundle, policy) — sortable ranking tuple (macro_f1, weighted_f1, created_at).
find_best_model(models_dir, policy, required_schema_hash, required_label_set) — scan, filter, rank, and select the best bundle; returns a SelectionReport.
read_active(models_dir) — read active.json pointer; returns ActivePointer or None if missing/invalid.
write_active_atomic(models_dir, bundle, policy, reason) — atomically write active.json and append to active_history.jsonl; returns the new ActivePointer.
append_active_history(models_dir, old, new) — append a transition record to active_history.jsonl.
resolve_active_model(models_dir, policy, required_schema_hash, required_label_set) — resolve the active bundle: uses the pointer if valid, falls back to find_best_model and self-heals the pointer; returns (ModelBundle | None, SelectionReport | None).
write_index_cache(models_dir, report) — atomically write models/index.json from a SelectionReport; returns the IndexCache.
read_index_cache(models_dir) — read cached index.json; returns IndexCache or None if missing/invalid.
should_switch_active(current, candidate, policy) — hysteresis check: returns True if the candidate's macro_f1 exceeds the current active model's macro_f1 by at least policy.min_improvement.

`taskclf.model_registry` ¶

Model registry: scan, validate, rank, filter, and activate model bundles.

Provides a pure, testable API for discovering promoted model bundles under models/, checking compatibility with the current schema and label set, ranking candidates by the selection policy, and managing the active model pointer.

Public surface:

:class:BundleMetrics — parsed metrics.json
:class:ModelBundle — one scanned bundle (valid or invalid)
:class:SelectionPolicy — ranking / constraint configuration
:class:ExclusionRecord — why a bundle was excluded from selection
:class:SelectionReport — full result of :func:find_best_model
:class:ActivePointer — persisted active.json pointer
:class:ActiveHistoryEntry — one line in active_history.jsonl
:class:IndexCacheBundleSummary — one bundle row in index.json
:class:IndexCache — cached scan/ranking snapshot
:func:list_bundles — scan a directory for bundles
:func:is_compatible — schema hash + label set gate
:func:passes_constraints — hard constraint gate (policy v1: valid bundle with metrics)
:func:score — sortable ranking tuple
:func:find_best_model — scan, filter, rank, and select the best bundle
:func:read_active — read active.json pointer
:func:write_active_atomic — atomically update active.json
:func:append_active_history — append to active_history.jsonl
:func:resolve_active_model — resolve active bundle with fallback
:func:write_index_cache — write index.json from a selection report
:func:read_index_cache — read cached index.json
:func:should_switch_active — hysteresis check before switching active

`BundleMetrics` ¶

Bases: BaseModel

Metrics stored in a bundle's metrics.json.

See docs/guide/metrics_contract.md for the stable contract.

Source code in src/taskclf/model_registry.py

class BundleMetrics(BaseModel, frozen=True):
    """Metrics stored in a bundle's ``metrics.json``.

    See ``docs/guide/metrics_contract.md`` for the stable contract.
    """

    macro_f1: float
    weighted_f1: float
    confusion_matrix: list[list[int]]
    label_names: list[str]

`ModelBundle` ¶

Bases: BaseModel

A scanned model bundle directory.

Both valid and invalid bundles are represented; check `valid` before using `metadata` or `metrics`.

Source code in src/taskclf/model_registry.py

class ModelBundle(BaseModel, frozen=True):
    """A scanned model bundle directory.

    Both valid and invalid bundles are represented; check :attr:`valid`
    before using :attr:`metadata` or :attr:`metrics`.
    """

    model_id: str
    path: Path
    valid: bool
    invalid_reason: str | None = None
    metadata: ModelMetadata | None = None
    metrics: BundleMetrics | None = None
    created_at: datetime | None = None

`SelectionPolicy` ¶

Bases: BaseModel

Selection policy configuration.

Policy v1 ranks by macro_f1 desc, weighted_f1 desc, created_at desc and applies no additional hard constraints (acceptance gates are enforced at promotion time by retrain).

min_improvement controls hysteresis: the candidate must exceed the current active model's macro_f1 by at least this amount before the active pointer is switched. Set to 0.0 (default) to disable hysteresis.

Source code in src/taskclf/model_registry.py

class SelectionPolicy(BaseModel, frozen=True):
    """Selection policy configuration.

    Policy v1 ranks by ``macro_f1`` desc, ``weighted_f1`` desc,
    ``created_at`` desc and applies no additional hard constraints
    (acceptance gates are enforced at promotion time by retrain).

    ``min_improvement`` controls hysteresis: the candidate must exceed
    the current active model's ``macro_f1`` by at least this amount
    before the active pointer is switched.  Set to ``0.0`` (default)
    to disable hysteresis.
    """

    version: int = 1
    min_improvement: float = 0.0
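The hysteresis rule can be illustrated with a minimal sketch on plain floats. The helper name `exceeds_by_margin` is hypothetical, and the `>=` comparison is an assumption read from "by at least this amount"; this is not the library's `should_switch_active`, whose source is not shown in this section.

```python
def exceeds_by_margin(
    current_f1: float,
    candidate_f1: float,
    min_improvement: float = 0.0,
) -> bool:
    """Hypothetical stand-in for the hysteresis rule described above."""
    # Switch only when the candidate beats the active model by the margin.
    return candidate_f1 - current_f1 >= min_improvement


# Values are chosen to be exactly representable in binary floating point,
# so the subtraction does not introduce rounding surprises.
clear_win = exceeds_by_margin(0.5, 0.75, min_improvement=0.25)
too_small = exceeds_by_margin(0.5, 0.625, min_improvement=0.25)
no_margin = exceeds_by_margin(0.5, 0.5)

print(clear_win, too_small, no_margin)
```

With decimal-looking thresholds such as `0.02`, floating-point subtraction can land just below the margin, so a real policy may want to pick thresholds with that in mind.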

`ExclusionRecord` ¶

Bases: BaseModel

Why a single bundle was excluded during `find_best_model`.

Source code in src/taskclf/model_registry.py

class ExclusionRecord(BaseModel, frozen=True):
    """Why a single bundle was excluded during :func:`find_best_model`."""

    model_id: str
    path: Path
    reason: str

`SelectionReport` ¶

Bases: BaseModel

Full result of `find_best_model`.

`ranked` contains eligible bundles in score-descending order. `best` is `ranked[0]` when the list is non-empty, else `None`. `excluded` lists every bundle that was filtered out, with a human-readable reason.

Source code in src/taskclf/model_registry.py

class SelectionReport(BaseModel, frozen=True):
    """Full result of :func:`find_best_model`.

    *ranked* contains eligible bundles in score-descending order.
    *best* is ``ranked[0]`` when the list is non-empty, else ``None``.
    *excluded* lists every bundle that was filtered out, with a
    human-readable reason.
    """

    best: ModelBundle | None
    ranked: list[ModelBundle]
    excluded: list[ExclusionRecord]
    policy: SelectionPolicy
    required_schema_hash: str

`ActivePointer` ¶

Bases: BaseModel

Persisted pointer to the currently active model bundle.

Stored as models/active.json. See docs/guide/model_selection.md for the schema contract.

Source code in src/taskclf/model_registry.py

class ActivePointer(BaseModel, frozen=True):
    """Persisted pointer to the currently active model bundle.

    Stored as ``models/active.json``.  See
    ``docs/guide/model_selection.md`` for the schema contract.
    """

    model_dir: str
    selected_at: str
    policy_version: int
    model_id: str | None = None
    reason: dict[str, object] | None = None

`ActiveHistoryEntry` ¶

Bases: BaseModel

One line in models/active_history.jsonl.

Records every change to active.json for auditability and rollback.

Source code in src/taskclf/model_registry.py

class ActiveHistoryEntry(BaseModel, frozen=True):
    """One line in ``models/active_history.jsonl``.

    Records every change to ``active.json`` for auditability and
    rollback.
    """

    at: str
    old: ActivePointer | None
    new: ActivePointer

`IndexCacheBundleSummary` ¶

Bases: BaseModel

Summary of one bundle stored inside `IndexCache`.

Source code in src/taskclf/model_registry.py

class IndexCacheBundleSummary(BaseModel, frozen=True):
    """Summary of one bundle stored inside :class:`IndexCache`."""

    model_id: str
    path: str
    macro_f1: float | None = None
    weighted_f1: float | None = None
    created_at: str | None = None
    eligible: bool = False

`IndexCache` ¶

Bases: BaseModel

Cached scan/ranking snapshot written to models/index.json.

This is an informational cache; selection never reads it. Operators and `taskclf train list` may consume it for fast inspection without a full rescan.

Source code in src/taskclf/model_registry.py

class IndexCache(BaseModel, frozen=True):
    """Cached scan/ranking snapshot written to ``models/index.json``.

    This is an informational cache — selection never reads it.
    Operators and ``taskclf train list`` may consume it for fast
    inspection without a full rescan.
    """

    generated_at: str
    schema_hash: str
    policy_version: int
    ranked: list[IndexCacheBundleSummary]
    excluded: list[ExclusionRecord]
    best_model_id: str | None = None

`list_bundles(models_dir)` ¶

Scan models_dir for model bundle subdirectories.

Each subdirectory is parsed independently; failures are captured as invalid bundles rather than aborting the scan.

Parameters:

Name	Type	Description	Default
`models_dir`	`Path`	Parent directory containing bundle subdirectories (e.g. `Path("models")`).	required

Returns:

Type	Description
`list[ModelBundle]`	A list of `ModelBundle` instances sorted by `model_id` for deterministic ordering.

Source code in src/taskclf/model_registry.py

def list_bundles(models_dir: Path) -> list[ModelBundle]:
    """Scan *models_dir* for model bundle subdirectories.

    Each subdirectory is parsed independently; failures are captured as
    invalid bundles rather than aborting the scan.

    Args:
        models_dir: Parent directory containing bundle subdirectories
            (e.g. ``Path("models")``).

    Returns:
        A list of :class:`ModelBundle` instances sorted by ``model_id``
        for deterministic ordering.
    """
    if not models_dir.is_dir():
        return []

    bundles: list[ModelBundle] = []
    for candidate in models_dir.iterdir():
        if not candidate.is_dir():
            continue
        bundle = _parse_bundle(candidate)
        bundles.append(bundle)

    bundles.sort(key=lambda b: b.model_id)
    return bundles
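The scan-and-capture behaviour can be sketched with the standard library alone. Everything here is illustrative: `scan` and its dict records are hypothetical, and the only validity check is that `metrics.json` exists and parses, which is an assumption because `_parse_bundle` is not shown in this section.

```python
import json
import tempfile
from pathlib import Path


def scan(models_dir: Path) -> list[dict]:
    """Hypothetical scan: one record per subdirectory, failures kept as data."""
    if not models_dir.is_dir():
        return []
    records = []
    for candidate in models_dir.iterdir():
        if not candidate.is_dir():
            continue  # plain files such as active.json are not bundles
        try:
            json.loads((candidate / "metrics.json").read_text())
            records.append({"model_id": candidate.name, "valid": True, "invalid_reason": None})
        except (OSError, json.JSONDecodeError):
            # A broken bundle becomes an invalid record instead of aborting the scan.
            records.append(
                {
                    "model_id": candidate.name,
                    "valid": False,
                    "invalid_reason": "missing or unreadable metrics.json",
                }
            )
    records.sort(key=lambda record: record["model_id"])
    return records


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "b_good").mkdir()
    (root / "b_good" / "metrics.json").write_text('{"macro_f1": 0.75}')
    (root / "a_empty").mkdir()
    (root / "active.json").write_text("{}")
    found = scan(root)
    absent = scan(root / "does_not_exist")

print([(record["model_id"], record["valid"]) for record in found])
```

Sorting by `model_id` makes the result independent of the order in which the filesystem lists entries.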

`is_compatible(bundle, required_schema_hash=get_feature_schema('v1').SCHEMA_HASH, required_label_set=LABEL_SET_V1)` ¶

Check whether bundle is compatible with the current runtime.

A bundle is compatible when both hold:

1. `metadata.schema_hash` exactly matches `required_schema_hash`.
2. `sorted(metadata.label_set)` exactly matches `sorted(required_label_set)`.

Parameters:

Name	Type	Description	Default
`bundle`	`ModelBundle`	A scanned model bundle.	required
`required_schema_hash`	`str`	Expected schema hash (defaults to `FeatureSchemaV1.SCHEMA_HASH`).	`get_feature_schema('v1').SCHEMA_HASH`
`required_label_set`	`frozenset[str]`	Expected label vocabulary (defaults to `LABEL_SET_V1`).	`LABEL_SET_V1`

Returns:

Type	Description
`bool`	`True` if the bundle is valid and compatible.

Source code in src/taskclf/model_registry.py

def is_compatible(
    bundle: ModelBundle,
    required_schema_hash: str = get_feature_schema("v1").SCHEMA_HASH,
    required_label_set: frozenset[str] = LABEL_SET_V1,
) -> bool:
    """Check whether *bundle* is compatible with the current runtime.

    A bundle is compatible when both hold:

    1. ``metadata.schema_hash`` exactly matches *required_schema_hash*.
    2. ``sorted(metadata.label_set)`` exactly matches ``sorted(required_label_set)``.

    Args:
        bundle: A scanned model bundle.
        required_schema_hash: Expected schema hash (defaults to
            ``FeatureSchemaV1.SCHEMA_HASH``).
        required_label_set: Expected label vocabulary (defaults to
            ``LABEL_SET_V1``).

    Returns:
        ``True`` if the bundle is valid and compatible.
    """
    if not bundle.valid or bundle.metadata is None:
        return False
    if bundle.metadata.schema_hash != required_schema_hash:
        return False
    if sorted(bundle.metadata.label_set) != sorted(required_label_set):
        return False
    return True
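The two compatibility checks can be tried in isolation with a small sketch. `FakeMetadata`, the hash strings, and the label names are all hypothetical stand-ins; only the comparison logic mirrors the function above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeMetadata:
    """Hypothetical stand-in for ModelMetadata with only the two checked fields."""

    schema_hash: str
    label_set: list[str]


def compatible(meta: FakeMetadata, required_hash: str, required_labels: frozenset[str]) -> bool:
    # Same two checks as above: exact hash match, then order-insensitive labels.
    if meta.schema_hash != required_hash:
        return False
    return sorted(meta.label_set) == sorted(required_labels)


required = frozenset({"coding", "meeting", "writing"})

reordered = FakeMetadata("abc123", ["writing", "coding", "meeting"])
missing_label = FakeMetadata("abc123", ["coding", "meeting"])
duplicated = FakeMetadata("abc123", ["coding", "coding", "meeting", "writing"])
wrong_hash = FakeMetadata("zzz999", ["coding", "meeting", "writing"])

for meta in (reordered, missing_label, duplicated, wrong_hash):
    print(meta.label_set, compatible(meta, "abc123", required))
```

Because the comparison is on sorted lists rather than sets, label order does not matter, but a duplicated label in the bundle's list makes the bundle incompatible.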

`passes_constraints(bundle, policy)` ¶

Check whether bundle passes the hard constraints of policy.

Policy v1 applies no additional constraints beyond validity: all acceptance gates are enforced at promotion time by retrain, so any promoted bundle that parsed successfully is eligible for ranking.

Parameters:

Name	Type	Description	Default
`bundle`	`ModelBundle`	A scanned model bundle (must be valid).	required
`policy`	`SelectionPolicy`	Selection policy configuration.	required

Returns:

Type	Description
`bool`	`True` if the bundle is valid with non-null metrics.

Source code in src/taskclf/model_registry.py

def passes_constraints(
    bundle: ModelBundle,
    policy: SelectionPolicy,  # noqa: ARG001 — reserved for future policy versions
) -> bool:
    """Check whether *bundle* passes the hard constraints of *policy*.

    Policy v1 applies no additional constraints beyond validity: all
    acceptance gates are enforced at promotion time by retrain, so any
    promoted bundle that parsed successfully is eligible for ranking.

    Args:
        bundle: A scanned model bundle (must be valid).
        policy: Selection policy configuration.

    Returns:
        ``True`` if the bundle is valid with non-null metrics.
    """
    return bundle.valid and bundle.metrics is not None

`score(bundle, policy)` ¶

Compute a sortable ranking key for bundle.

The tuple sorts descending on all three components:

1. `macro_f1` (higher is better)
2. `weighted_f1` (tie-break; higher is better)
3. `created_at` (tie-break; newer is better, since ISO8601 strings with the same UTC offset sort lexicographically)

Parameters:

Name	Type	Description	Default
`bundle`	`ModelBundle`	A valid model bundle with non-null metrics and metadata.	required
`policy`	`SelectionPolicy`	Selection policy configuration.	required

Returns:

Type	Description
`tuple[float, float, str]`	A 3-tuple `(macro_f1, weighted_f1, created_at_raw)` suitable for `sorted(..., reverse=True)`.

Raises:

Type	Description
`ValueError`	If the bundle is invalid or missing metrics/metadata.

Source code in src/taskclf/model_registry.py

def score(
    bundle: ModelBundle,
    policy: SelectionPolicy,  # noqa: ARG001 — reserved for future policy versions
) -> tuple[float, float, str]:
    """Compute a sortable ranking key for *bundle*.

    The tuple sorts **descending** on all three components:

    1. ``macro_f1`` (higher is better)
    2. ``weighted_f1`` (tie-break; higher is better)
    3. ``created_at`` (tie-break; newer is better — ISO8601 strings
       with the same UTC offset sort lexicographically)

    Args:
        bundle: A valid model bundle with non-null metrics and metadata.
        policy: Selection policy configuration.

    Returns:
        A 3-tuple ``(macro_f1, weighted_f1, created_at_raw)`` suitable
        for ``sorted(..., reverse=True)``.

    Raises:
        ValueError: If the bundle is invalid or missing metrics/metadata.
    """
    if not bundle.valid or bundle.metrics is None or bundle.metadata is None:
        raise ValueError(f"Cannot score invalid bundle {bundle.model_id!r}")
    return (
        bundle.metrics.macro_f1,
        bundle.metrics.weighted_f1,
        bundle.metadata.created_at,
    )
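A short sketch shows how such tuples order under `sorted(..., reverse=True)`, and why the lexicographic tie-break depends on a shared UTC offset. The bundle names and values below are invented for illustration.

```python
from datetime import datetime

# Hypothetical score tuples: (macro_f1, weighted_f1, created_at as ISO8601 text).
scores = {
    "bundle_a": (0.75, 0.5, "2025-01-01T00:00:00+00:00"),
    "bundle_b": (0.75, 0.5, "2025-03-01T00:00:00+00:00"),
    "bundle_c": (0.75, 0.625, "2024-06-01T00:00:00+00:00"),
    "bundle_d": (0.5, 0.875, "2025-06-01T00:00:00+00:00"),
}

# Tuples compare element by element, so later fields only break ties.
ranked = sorted(scores, key=lambda model_id: scores[model_id], reverse=True)
print(ranked)

# With mixed offsets, string order and chronological order can disagree.
plus_two = "2025-01-01T01:00:00+02:00"  # 2024-12-31T23:00:00 in UTC
utc = "2025-01-01T00:30:00+00:00"
lexically_newer = plus_two > utc
actually_newer = datetime.fromisoformat(plus_two) > datetime.fromisoformat(utc)
print(lexically_newer, actually_newer)
```

The second half suggests that the `created_at` tie-break is only reliable when every bundle records its timestamp with the same offset.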

`find_best_model(models_dir, policy=None, required_schema_hash=None, required_label_set=None)` ¶

Scan, filter, rank, and select the best model bundle.

This is the main entry point for non-mutating model selection. It composes `list_bundles`, `is_compatible`, `passes_constraints`, and `score` into a single call that returns a structured `SelectionReport`.

Parameters:

Name	Type	Description	Default
`models_dir`	`Path`	Directory containing promoted model bundle subdirectories (e.g. `Path("models")`).	required
`policy`	`SelectionPolicy \| None`	Selection policy configuration. Defaults to `SelectionPolicy()` (policy v1).	`None`
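`required_schema_hash`	`str \| None`	Schema hash that bundles must match. When `None`, the hash of each known feature schema version (as yielded by `iter_feature_schema_versions(LATEST_FEATURE_SCHEMA_VERSION)`) is tried in turn, and the first one with at least one eligible bundle is used.	`None`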
`required_label_set`	`frozenset[str] \| None`	Label vocabulary that bundles must match. Defaults to `LABEL_SET_V1`.	`None`

Returns:

Type	Description
`SelectionReport`	A `SelectionReport` with the best bundle (if any), the full ranked list of eligible bundles, and exclusion records for every bundle that was filtered out.

Source code in src/taskclf/model_registry.py

def find_best_model(
    models_dir: Path,
    policy: SelectionPolicy | None = None,
    required_schema_hash: str | None = None,
    required_label_set: frozenset[str] | None = None,
) -> SelectionReport:
    """Scan, filter, rank, and select the best model bundle.

    This is the main entry-point for non-mutating model selection.
    It composes :func:`list_bundles`, :func:`is_compatible`,
    :func:`passes_constraints`, and :func:`score` into a single call
    that returns a structured :class:`SelectionReport`.

    Args:
        models_dir: Directory containing promoted model bundle
            subdirectories (e.g. ``Path("models")``).
        policy: Selection policy configuration.  Defaults to
            ``SelectionPolicy()`` (policy v1).
        required_schema_hash: Schema hash that bundles must match.
            Defaults to ``FeatureSchemaV1.SCHEMA_HASH``.
        required_label_set: Label vocabulary that bundles must match.
            Defaults to ``LABEL_SET_V1``.

    Returns:
        A :class:`SelectionReport` with the best bundle (if any),
        the full ranked list of eligible bundles, and exclusion
        records for every bundle that was filtered out.
    """
    explicit_schema_hash = required_schema_hash is not None

    if policy is None:
        policy = SelectionPolicy()
    if required_label_set is None:
        required_label_set = LABEL_SET_V1
    if required_schema_hash is not None:
        schema_hashes: tuple[str, ...] = (required_schema_hash,)
    else:
        schema_hashes = tuple(
            get_feature_schema(version).SCHEMA_HASH
            for version in iter_feature_schema_versions(LATEST_FEATURE_SCHEMA_VERSION)
        )

    last_report: SelectionReport | None = None
    for candidate_schema_hash in schema_hashes:
        bundles = list_bundles(models_dir)

        excluded: list[ExclusionRecord] = []
        eligible: list[ModelBundle] = []

        for bundle in bundles:
            if not bundle.valid:
                excluded.append(
                    ExclusionRecord(
                        model_id=bundle.model_id,
                        path=bundle.path,
                        reason=f"invalid: {bundle.invalid_reason}",
                    )
                )
                continue

            if not is_compatible(bundle, candidate_schema_hash, required_label_set):
                assert bundle.metadata is not None
                if bundle.metadata.schema_hash != candidate_schema_hash:
                    detail = "schema_hash mismatch"
                else:
                    detail = "label_set mismatch"
                excluded.append(
                    ExclusionRecord(
                        model_id=bundle.model_id,
                        path=bundle.path,
                        reason=f"incompatible: {detail}",
                    )
                )
                continue

            if not passes_constraints(bundle, policy):
                excluded.append(
                    ExclusionRecord(
                        model_id=bundle.model_id,
                        path=bundle.path,
                        reason="constraint: failed policy constraints",
                    )
                )
                continue

            eligible.append(bundle)

        ranked = sorted(
            eligible,
            key=lambda b: score(b, policy),
            reverse=True,
        )

        last_report = SelectionReport(
            best=ranked[0] if ranked else None,
            ranked=ranked,
            excluded=excluded,
            policy=policy,
            required_schema_hash=candidate_schema_hash,
        )
        if ranked or explicit_schema_hash:
            return last_report

    assert last_report is not None
    return last_report
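The schema-hash fallback loop in the source above can be reduced to a small sketch. The function name, the `bundle_hashes` mapping, and the hash strings are hypothetical; the order of `schema_hashes` is whatever the caller supplies, since the order produced by `iter_feature_schema_versions` is not shown here.

```python
def first_hash_with_candidates(
    bundle_hashes: dict[str, str],
    schema_hashes: tuple[str, ...],
) -> tuple[str, list[str]]:
    """Return the first schema hash that has matching bundles, else the last one tried."""
    if not schema_hashes:
        raise ValueError("at least one schema hash is required")
    result: tuple[str, list[str]] = (schema_hashes[0], [])
    for candidate_hash in schema_hashes:
        eligible = sorted(
            model_id
            for model_id, bundle_hash in bundle_hashes.items()
            if bundle_hash == candidate_hash
        )
        result = (candidate_hash, eligible)
        if eligible:
            break  # stop at the first hash that yields candidates
    return result


bundles = {"run_a": "hash_v1", "run_b": "hash_v1", "run_c": "hash_v0"}

matched = first_hash_with_candidates(bundles, ("hash_v2", "hash_v1"))
unmatched = first_hash_with_candidates(bundles, ("hash_v2", "hash_v3"))
print(matched)
print(unmatched)
```

As in the source, when no hash yields candidates the result describes the last hash tried, so the reported `required_schema_hash` and exclusion reasons refer to that final attempt.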

`read_active(models_dir)` ¶

Read the active model pointer from models_dir/active.json.

Returns `None` (without raising) when the file is missing, contains invalid JSON, or fails `ActivePointer` validation. A warning is logged on parse/validation failures so operators can notice stale pointer files.

Source code in src/taskclf/model_registry.py

def read_active(models_dir: Path) -> ActivePointer | None:
    """Read the active model pointer from ``models_dir/active.json``.

    Returns ``None`` (without raising) when the file is missing,
    contains invalid JSON, or fails :class:`ActivePointer` validation.
    A warning is logged on parse/validation failures so operators can
    notice stale pointer files.
    """
    active_path = models_dir / _ACTIVE_FILE
    if not active_path.is_file():
        return None

    try:
        raw = json.loads(active_path.read_text())
    except (json.JSONDecodeError, OSError) as exc:
        logger.warning("active.json parse error: %s", exc)
        return None

    try:
        return ActivePointer.model_validate(raw)
    except Exception as exc:
        logger.warning("active.json validation error: %s", exc)
        return None
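The "return `None` instead of raising" contract can be exercised with a stdlib-only sketch. `read_pointer` is a hypothetical helper, and the required-key check is a simplified stand-in for the pydantic validation performed by the real function.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

# Fields of ActivePointer that have no default value.
REQUIRED_KEYS = {"model_dir", "selected_at", "policy_version"}


def read_pointer(models_dir: Path) -> Optional[dict]:
    """Hypothetical reader that mirrors the None-on-failure behaviour."""
    path = models_dir / "active.json"
    if not path.is_file():
        return None
    try:
        raw = json.loads(path.read_text())
    except (json.JSONDecodeError, OSError):
        return None
    if not isinstance(raw, dict) or not REQUIRED_KEYS.issubset(raw):
        return None
    return raw


with tempfile.TemporaryDirectory() as tmp:
    models_dir = Path(tmp)
    pointer_file = models_dir / "active.json"

    missing = read_pointer(models_dir)

    pointer_file.write_text("{not json")
    corrupt = read_pointer(models_dir)

    pointer_file.write_text(json.dumps({"model_dir": "models/run_a"}))
    incomplete = read_pointer(models_dir)

    pointer_file.write_text(
        json.dumps(
            {
                "model_dir": "models/run_a",
                "selected_at": "2025-01-01T00:00:00+00:00",
                "policy_version": 1,
            }
        )
    )
    ok = read_pointer(models_dir)

print(missing, corrupt, incomplete, ok)
```

Callers therefore only need a single `is None` check to decide whether to fall back to a full selection.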

`append_active_history(models_dir, old, new)` ¶

Append a transition record to models_dir/active_history.jsonl.

Creates the file if it does not exist. Each line is a self-contained JSON object matching `ActiveHistoryEntry`.

Source code in src/taskclf/model_registry.py

def append_active_history(
    models_dir: Path,
    old: ActivePointer | None,
    new: ActivePointer,
) -> None:
    """Append a transition record to ``models_dir/active_history.jsonl``.

    Creates the file if it does not exist.  Each line is a
    self-contained JSON object matching :class:`ActiveHistoryEntry`.
    """
    entry = ActiveHistoryEntry(at=new.selected_at, old=old, new=new)
    history_path = models_dir / _HISTORY_FILE
    with history_path.open("a") as fh:
        fh.write(entry.model_dump_json() + "\n")
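Since each line is a standalone JSON object, the history can be read back one line at a time, for example to find the pointer to roll back to. The sketch below uses plain dicts in place of the pydantic models; `append_entry`, `read_entries`, and the sample paths are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def append_entry(history_path: Path, entry: dict) -> None:
    # Append mode creates the file on first use; one JSON object per line.
    with history_path.open("a") as fh:
        fh.write(json.dumps(entry) + "\n")


def read_entries(history_path: Path) -> list[dict]:
    # Skip blank lines so a trailing newline does not break parsing.
    with history_path.open() as fh:
        return [json.loads(line) for line in fh if line.strip()]


first = {
    "model_dir": "models/run_a",
    "selected_at": "2025-01-01T00:00:00+00:00",
    "policy_version": 1,
}
second = {
    "model_dir": "models/run_b",
    "selected_at": "2025-02-01T00:00:00+00:00",
    "policy_version": 1,
}

with tempfile.TemporaryDirectory() as tmp:
    history = Path(tmp) / "active_history.jsonl"
    append_entry(history, {"at": first["selected_at"], "old": None, "new": first})
    append_entry(history, {"at": second["selected_at"], "old": first, "new": second})
    entries = read_entries(history)

# The "old" field of the latest entry is the pointer a rollback would restore.
previous_model_dir = entries[-1]["old"]["model_dir"]
print(previous_model_dir)
```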

`write_active_atomic(models_dir, bundle, policy, reason=None)` ¶

Atomically write models_dir/active.json for bundle.

The pointer is written to a temporary file first, then moved into place with `os.replace` so that readers never see a partial write. The previous pointer (if any) is read before the overwrite, and the old and new pointers are appended to the audit log via `append_active_history`.

Parameters:

Name	Type	Description	Default
`models_dir`	`Path`	The `models/` directory.	required
`bundle`	`ModelBundle`	The bundle to activate (must be valid with metrics).	required
`policy`	`SelectionPolicy`	The selection policy used to choose this bundle.	required
`reason`	`str \| None`	Optional human-readable reason string.	`None`

Returns:

Type	Description
`ActivePointer`	The newly written `ActivePointer`.

Source code in src/taskclf/model_registry.py

def write_active_atomic(
    models_dir: Path,
    bundle: ModelBundle,
    policy: SelectionPolicy,
    reason: str | None = None,
) -> ActivePointer:
    """Atomically write ``models_dir/active.json`` for *bundle*.

    The pointer is written to a temporary file first, then moved into
    place with :func:`os.replace` to guarantee readers never see a
    partial write.  The previous pointer (if any) is read before the
    overwrite and both old and new are appended to the audit log via
    :func:`append_active_history`.

    Args:
        models_dir: The ``models/`` directory.
        bundle: The bundle to activate (must be valid with metrics).
        policy: The selection policy used to choose this bundle.
        reason: Optional human-readable reason string.

    Returns:
        The newly written :class:`ActivePointer`.
    """
    old = read_active(models_dir)

    reason_dict: dict[str, object] | None = None
    if bundle.metrics is not None:
        reason_dict = {
            "metric": "macro_f1",
            "macro_f1": bundle.metrics.macro_f1,
            "weighted_f1": bundle.metrics.weighted_f1,
        }
        if reason is not None:
            reason_dict["note"] = reason

    model_dir = str(bundle.path.relative_to(models_dir.parent))

    pointer = ActivePointer(
        model_dir=model_dir,
        model_id=bundle.model_id,
        selected_at=datetime.now(UTC).isoformat(),
        policy_version=policy.version,
        reason=reason_dict,
    )

    tmp_path = models_dir / _ACTIVE_TMP
    final_path = models_dir / _ACTIVE_FILE

    tmp_path.write_text(pointer.model_dump_json(indent=2) + "\n")
    os.replace(tmp_path, final_path)

    append_active_history(models_dir, old, pointer)

    return pointer
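The write-then-rename pattern used above can be isolated in a few lines. `write_json_atomic` and the `.tmp` suffix are hypothetical (the value of `_ACTIVE_TMP` is not shown in this section); the sketch assumes the temporary file and the final file live in the same directory, so that `os.replace` is a rename within one filesystem.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(final_path: Path, payload: dict) -> None:
    """Write payload next to final_path, then swap it into place."""
    tmp_path = final_path.with_name(final_path.name + ".tmp")
    tmp_path.write_text(json.dumps(payload, indent=2) + "\n")
    # os.replace overwrites the destination; readers see the old or the new file.
    os.replace(tmp_path, final_path)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "active.json"
    write_json_atomic(target, {"model_dir": "models/run_a", "policy_version": 1})
    write_json_atomic(target, {"model_dir": "models/run_b", "policy_version": 1})
    final = json.loads(target.read_text())
    leftover_tmp = target.with_name("active.json.tmp").exists()

print(final["model_dir"], leftover_tmp)
```

Note that in the source above the history append happens after the rename, so a crash between the two steps would leave a new pointer without a matching history line.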

`resolve_active_model(models_dir, policy=None, required_schema_hash=None, required_label_set=None)` ¶

Resolve the active model bundle, falling back to selection.

Resolution order:

1. Read `active.json`. If it is valid and the pointed-to bundle exists, is parseable, and is compatible, return it immediately (no full scan).
2. Otherwise fall back to `find_best_model`. If a best bundle is found, atomically update `active.json` to self-heal the pointer.

Returns:

Type	Description
`tuple[ModelBundle \| None, SelectionReport \| None]`	A 2-tuple `(bundle, report)`. `report` is `None` when the pointer was valid and no scan was needed.

Source code in src/taskclf/model_registry.py

def resolve_active_model(
    models_dir: Path,
    policy: SelectionPolicy | None = None,
    required_schema_hash: str | None = None,
    required_label_set: frozenset[str] | None = None,
) -> tuple[ModelBundle | None, SelectionReport | None]:
    """Resolve the active model bundle, falling back to selection.

    Resolution order:

    1. Read ``active.json``.  If valid and the pointed-to bundle
       exists, is parseable, and is compatible — return it immediately
       (no full scan).
    2. Otherwise fall back to :func:`find_best_model`.  If a best
       bundle is found, atomically update ``active.json`` to self-heal
       the pointer.

    Returns:
        A 2-tuple ``(bundle, report)``.  *report* is ``None`` when the
        pointer was valid and no scan was needed.
    """
    if policy is None:
        policy = SelectionPolicy()
    if required_label_set is None:
        required_label_set = LABEL_SET_V1

    pointer = read_active(models_dir)
    if pointer is not None:
        bundle_path = models_dir.parent / pointer.model_dir
        if bundle_path.is_dir():
            bundle = _parse_bundle(bundle_path)
            metadata = bundle.metadata
            if bundle.valid and metadata is not None:
                if required_schema_hash is None:
                    if _matches_declared_schema(bundle) and sorted(
                        metadata.label_set
                    ) == sorted(required_label_set):
                        return bundle, None
                elif is_compatible(bundle, required_schema_hash, required_label_set):
                    return bundle, None
            logger.warning(
                "active.json points to invalid/incompatible bundle %s; "
                "falling back to selection",
                pointer.model_dir,
            )
        else:
            logger.warning(
                "active.json points to missing directory %s; falling back to selection",
                pointer.model_dir,
            )

    report = find_best_model(
        models_dir,
        policy,
        required_schema_hash,
        required_label_set,
    )

    if report.best is not None:
        write_active_atomic(models_dir, report.best, policy, reason="auto-repair")

    return report.best, report
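The pointer-first, scan-as-fallback control flow can be sketched with injected callables so it runs without a models directory. `resolve` and its four parameters are hypothetical; model IDs are plain strings, and the second tuple element is a simple flag rather than a `SelectionReport`.

```python
from typing import Callable, Optional


def resolve(
    read_pointer: Callable[[], Optional[str]],
    is_usable: Callable[[str], bool],
    select_best: Callable[[], Optional[str]],
    write_pointer: Callable[[str], None],
) -> tuple[Optional[str], bool]:
    """Return (model_id, scanned); scanned is False when the pointer was used."""
    pointed = read_pointer()
    if pointed is not None and is_usable(pointed):
        return pointed, False  # fast path: no scan needed
    best = select_best()
    if best is not None:
        write_pointer(best)  # self-heal so the next call takes the fast path
    return best, True


state = {"pointer": "run_old"}  # stale pointer to a bundle that is no longer usable
usable = {"run_new"}


def run_once() -> tuple[Optional[str], bool]:
    return resolve(
        read_pointer=lambda: state["pointer"],
        is_usable=lambda model_id: model_id in usable,
        select_best=lambda: "run_new",
        write_pointer=lambda model_id: state.update(pointer=model_id),
    )


first_call = run_once()
second_call = run_once()
print(first_call, second_call)
```

The first call repairs the pointer through the fallback path, and the second call returns the same bundle without scanning.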

`write_index_cache(models_dir, report)` ¶

Write `models_dir/index.json` from a `SelectionReport`.

The cache is written atomically (temp file + `os.replace`). It is informational only; `find_best_model` never reads it.

Parameters:

Name	Type	Description	Default
`models_dir`	`Path`	The `models/` directory.	required
`report`	`SelectionReport`	A completed selection report.	required

Returns:

Type	Description
`IndexCache`	The `IndexCache` that was persisted.

Source code in src/taskclf/model_registry.py

def write_index_cache(
    models_dir: Path,
    report: SelectionReport,
) -> IndexCache:
    """Write ``models_dir/index.json`` from a :class:`SelectionReport`.

    The cache is written atomically (temp + :func:`os.replace`).  It is
    informational only — :func:`find_best_model` never reads it.

    Args:
        models_dir: The ``models/`` directory.
        report: A completed selection report.

    Returns:
        The :class:`IndexCache` that was persisted.
    """
    ranked_summaries: list[IndexCacheBundleSummary] = []
    for b in report.ranked:
        ranked_summaries.append(
            IndexCacheBundleSummary(
                model_id=b.model_id,
                path=str(b.path),
                macro_f1=b.metrics.macro_f1 if b.metrics else None,
                weighted_f1=b.metrics.weighted_f1 if b.metrics else None,
                created_at=b.metadata.created_at if b.metadata else None,
                eligible=True,
            )
        )

    cache = IndexCache(
        generated_at=datetime.now(UTC).isoformat(),
        schema_hash=report.required_schema_hash,
        policy_version=report.policy.version,
        ranked=ranked_summaries,
        excluded=report.excluded,
        best_model_id=report.best.model_id if report.best else None,
    )

    tmp_path = models_dir / _INDEX_TMP
    final_path = models_dir / _INDEX_FILE

    tmp_path.write_text(cache.model_dump_json(indent=2) + "\n")
    os.replace(tmp_path, final_path)

    return cache
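
The temp-file-plus-`os.replace` pattern used above can be tried in isolation. The following is a minimal sketch, not the registry's own helper: `write_json_atomic`, the `.tmp` suffix, and the `run_002` identifier are assumptions made for illustration.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(final_path: Path, payload: dict) -> None:
    """Write payload to final_path so readers never observe a partial file."""
    # Keep the temp file in the same directory so the rename stays on one filesystem.
    tmp_path = final_path.with_name(final_path.name + ".tmp")
    tmp_path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    # os.replace overwrites an existing destination; on POSIX the rename is atomic.
    os.replace(tmp_path, final_path)


with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "index.json"
    write_json_atomic(target, {"best_model_id": None})
    write_json_atomic(target, {"best_model_id": "run_002"})  # hypothetical id
    result = json.loads(target.read_text(encoding="utf-8"))
    leftover = list(Path(d).glob("*.tmp"))
```

After the second call the file holds only the newer payload and no temp file remains, because `os.replace` moves the temp file onto the final name instead of copying it.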

`read_index_cache(models_dir)` ¶

Read the cached index from `models_dir/index.json`.

Returns `None` (without raising) when the file is missing, contains invalid JSON, or fails `IndexCache` validation.

Source code in src/taskclf/model_registry.py

def read_index_cache(models_dir: Path) -> IndexCache | None:
    """Read the cached index from ``models_dir/index.json``.

    Returns ``None`` (without raising) when the file is missing,
    contains invalid JSON, or fails :class:`IndexCache` validation.
    """
    index_path = models_dir / _INDEX_FILE
    if not index_path.is_file():
        return None

    try:
        raw = json.loads(index_path.read_text())
    except (json.JSONDecodeError, OSError) as exc:
        logger.warning("index.json parse error: %s", exc)
        return None

    try:
        return IndexCache.model_validate(raw)
    except Exception as exc:
        logger.warning("index.json validation error: %s", exc)
        return None
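
The "return `None` instead of raising" contract can be sketched without the registry's models. In this hedged example, `read_json_or_none` is a hypothetical helper, and a simple required-key check stands in for `IndexCache.model_validate`; it is not the validation the real function performs.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def read_json_or_none(path: Path, required_keys: set[str]) -> dict | None:
    """Load a JSON object, returning None for any missing or unusable file."""
    if not path.is_file():
        return None
    try:
        raw = json.loads(path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, OSError):
        return None
    # Stand-in validation (assumption): must be an object with the required keys.
    if not isinstance(raw, dict) or not required_keys.issubset(raw):
        return None
    return raw


with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "index.json"
    missing = read_json_or_none(path, {"ranked"})
    path.write_text("{not json", encoding="utf-8")
    malformed = read_json_or_none(path, {"ranked"})
    path.write_text(json.dumps({"other": 1}), encoding="utf-8")
    wrong_shape = read_json_or_none(path, {"ranked"})
    path.write_text(json.dumps({"ranked": []}), encoding="utf-8")
    loaded = read_json_or_none(path, {"ranked"})
```

Callers can then treat every failure mode the same way, which suits a cache that is informational and can always be regenerated.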

`should_switch_active(current, candidate, policy)` ¶

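Decide whether `candidate` should replace `current` as the active model.

When `policy.min_improvement` is positive, the candidate's `macro_f1` must be at least the current active model's `macro_f1` plus that amount. If `current` is `None` or `policy.min_improvement` is not positive, the switch is always allowed. Otherwise, a candidate without metrics is never switched to, and a current pointer with no parseable `macro_f1` always allows the switch.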

Parameters:

Name	Type	Description	Default
`current`	`ActivePointer \| None`	The current active pointer (may be `None`).	required
`candidate`	`ModelBundle`	The best-ranked bundle from selection.	required
`policy`	`SelectionPolicy`	Selection policy with hysteresis threshold.	required

Returns:

Type	Description
`bool`	`True` if the active pointer should be updated.

Source code in src/taskclf/model_registry.py

def should_switch_active(
    current: ActivePointer | None,
    candidate: ModelBundle,
    policy: SelectionPolicy,
) -> bool:
    """Decide whether *candidate* should replace *current* as active.

    When ``policy.min_improvement`` is positive, the candidate's
    ``macro_f1`` must exceed the current active model's ``macro_f1``
    by at least that amount.  If ``current`` is ``None`` or has no
    recorded ``macro_f1``, the switch is always allowed.

    Args:
        current: The current active pointer (may be ``None``).
        candidate: The best-ranked bundle from selection.
        policy: Selection policy with hysteresis threshold.

    Returns:
        ``True`` if the active pointer should be updated.
    """
    if current is None or policy.min_improvement <= 0.0:
        return True

    if candidate.metrics is None:
        return False

    current_f1: float | None = None
    if current.reason and "macro_f1" in current.reason:
        try:
            current_f1 = float(current.reason["macro_f1"])  # type: ignore[arg-type]
        except (TypeError, ValueError):
            logger.debug(
                "Could not parse macro_f1 from active pointer reason", exc_info=True
            )

    if current_f1 is None:
        return True

    return candidate.metrics.macro_f1 >= current_f1 + policy.min_improvement
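
The final comparison is easier to see with concrete numbers. The sketch below isolates only the threshold arithmetic; `passes_hysteresis` is a hypothetical helper, and the sample scores are chosen to be exactly representable in binary floating point so that the boundary case is not affected by rounding.

```python
from __future__ import annotations


def passes_hysteresis(
    candidate_f1: float,
    current_f1: float | None,
    min_improvement: float,
) -> bool:
    """Mirror the threshold rule: candidate >= current + min_improvement."""
    if current_f1 is None or min_improvement <= 0.0:
        return True  # no baseline, or hysteresis disabled
    return candidate_f1 >= current_f1 + min_improvement


# Threshold is 0.75 + 0.0625 = 0.8125.
at_threshold = passes_hysteresis(0.8125, 0.75, 0.0625)
below_threshold = passes_hysteresis(0.8, 0.75, 0.0625)
disabled = passes_hysteresis(0.5, 0.75, 0.0)
no_baseline = passes_hysteresis(0.5, None, 0.0625)
```

A candidate exactly at the threshold is accepted because the comparison is `>=`. With decimal values such as `0.80 + 0.02`, floating-point rounding can push the sum slightly above the intended threshold, so a candidate at exactly `0.82` may be rejected.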
