features/domain

Privacy-preserving browser domain classification. Maps eTLD+1 domains to semantic categories without storing raw URLs.

`taskclf.features.domain`

Maps eTLD+1 domains (e.g. "github.com") to semantic categories without storing full URLs, paths, or query strings. Only the domain category string is persisted; the raw domain and URL are never stored.

When no domain information is available (e.g. no aw-watcher-web integration), the classifier falls back to "unknown" for browser apps. Non-browser apps always map to "non_browser", whether or not a domain is supplied.

See docs/guide/privacy.md §3.4 for the data-handling contract.
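
The storage rule described above can be sketched as follows. This is a minimal illustration, not the project's code: `build_record`, `toy_classifier`, the `domain_category` key, and the `"code_hosting"` category are all hypothetical names chosen for the example. It assumes the caller briefly holds a URL, reduces it to a hostname with the standard library, and keeps only the resulting category.

```python
from urllib.parse import urlsplit


def toy_classifier(domain, *, is_browser=True):
    """Stand-in with the same signature as classify_domain (hypothetical rules)."""
    if not is_browser:
        return "non_browser"
    if not domain:
        return "unknown"
    # Hypothetical single-entry rule table, for illustration only.
    return {"github.com": "code_hosting"}.get(domain.lower().strip(), "other")


def build_record(url, *, is_browser, classify=toy_classifier):
    """Return a record holding only the category, never the URL or hostname."""
    # The hostname exists only in this local variable and is not stored.
    host = urlsplit(url).hostname if url else None
    return {"domain_category": classify(host, is_browser=is_browser)}


record = build_record("https://github.com/org/repo?tab=issues", is_browser=True)
no_web_watcher = build_record(None, is_browser=True)
editor_window = build_record(None, is_browser=False)
print(record, no_web_watcher, editor_window)
```

The record carries a single category string, so the path and query string from the input URL cannot be recovered from it.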

`classify_domain(domain, *, is_browser=True)`

Map a domain string to a privacy-safe category.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `domain` | `str \| None` | An eTLD+1 or subdomain string (e.g. `"github.com"`). `None` when domain information is unavailable. | required |
| `is_browser` | `bool` | Whether the foreground app is a browser. | `True` |

Returns:

| Type | Description |
| --- | --- |
| `str` | One of `DOMAIN_CATEGORIES`. |

Source code in src/taskclf/features/domain.py

def classify_domain(domain: str | None, *, is_browser: bool = True) -> str:
    """Map a domain string to a privacy-safe category.

    Args:
        domain: An eTLD+1 or subdomain string (e.g. ``"github.com"``).
            ``None`` when domain information is unavailable.
        is_browser: Whether the foreground app is a browser.

    Returns:
        One of :data:`DOMAIN_CATEGORIES`.
    """
    if not is_browser:
        return "non_browser"
    if domain is None:
        return "unknown"
    domain = domain.lower().strip()
    if not domain:
        return "unknown"

    if domain in _DOMAIN_RULES:
        return _DOMAIN_RULES[domain]

    # Try parent domain (e.g. "mail.google.com" -> "google.com")
    parts = domain.split(".")
    if len(parts) > 2:
        parent = ".".join(parts[-2:])
        if parent in _DOMAIN_RULES:
            return _DOMAIN_RULES[parent]

    return "other"
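
To make the lookup order in the listing above concrete, here is a small self-contained walkthrough. The control flow is copied from the source, but the rule table is an assumption: the real `_DOMAIN_RULES` entries and category names are not shown on this page, so `"code_hosting"`, `"search"`, and `"news"` are placeholders.

```python
# Hypothetical rule table: the real _DOMAIN_RULES entries and category
# names are not shown on this page, so these are placeholders.
_DOMAIN_RULES = {
    "github.com": "code_hosting",
    "google.com": "search",
    "bbc.co.uk": "news",
}


def classify_domain(domain, *, is_browser=True):
    # Same control flow as the source listing above.
    if not is_browser:
        return "non_browser"
    if domain is None:
        return "unknown"
    domain = domain.lower().strip()
    if not domain:
        return "unknown"
    if domain in _DOMAIN_RULES:
        return _DOMAIN_RULES[domain]
    # Fall back to the last two labels of the domain.
    parts = domain.split(".")
    if len(parts) > 2:
        parent = ".".join(parts[-2:])
        if parent in _DOMAIN_RULES:
            return _DOMAIN_RULES[parent]
    return "other"


results = {
    "exact": classify_domain("github.com"),
    "normalized": classify_domain("  GitHub.com "),
    "subdomain": classify_domain("mail.google.com"),
    "unlisted": classify_domain("example.org"),
    "missing": classify_domain(None),
    "blank": classify_domain("   "),
    "non_browser": classify_domain("github.com", is_browser=False),
    # The two-label fallback yields "co.uk", which is not in the table.
    "multi_label_suffix": classify_domain("news.bbc.co.uk"),
}

for name, category in results.items():
    print(f"{name}: {category}")
```

The last case suggests that with this lookup, a subdomain under a multi-label public suffix such as `co.uk` resolves to "other" unless it is listed explicitly. Whether the real rule table accounts for this is not shown here.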
