Restructure options: typed options() reads + config writes + ContextVar overrides; drop get_option/set_option and ColumnHelper/PlotStyleHelper #517

@murray-ds

Background

Pyright reports ~57 errors in openretailscience/options.py. The root cause is the single catch-all value type:

OptionTypes = str | int | float | bool | list | dict | None

Every option lives in one dict[str, OptionTypes] and get_option(pat: str) returns that whole union. When consumers assign the union to narrowly-typed attributes (ColumnHelper, AggColumns, CalcColumns, PlotStyleHelper) pyright cannot narrow it to the concrete str/float/bool/list[float] each expects. The bulk of the errors are these helper-class attribute assignments, plus a few direct get_option(...) calls passed into typed parameters, plus a couple of Path vs str annotation mismatches in find_project_root/load_from_toml.

Beyond the errors, the current design has three parallel dicts (_options, _default_options, _descriptions) keyed by dotted strings that can silently drift out of sync.

Goals

  1. Better IDE and AI ergonomics — autocomplete, go-to-definition, discoverable option names, types an IDE/LLM can follow. (Primary goal.)
  2. Concrete static typing for both reads and writes — reading or writing an option yields/requires its concrete type (a column name is str, a font size is float), checkable by pyright.
  3. Preserve the string-keyed option_context API verbatim — both option_context("column.customer_id", "cust_id", ...) and the dict form option_context({"column.customer_id": "cust_id"}). Used heavily; must not change. After this work it is the only string-keyed public surface.
  4. Persistent, process-global configuration set at the start of a script, visible to all subsequent reads across all modules (and threads) — now via typed attribute assignment on a global config object (and TOML), replacing the string set_option.
  5. Honor the repo Simplicity Criterion: no new dependencies (pydantic was considered and rejected — not in the tree, and this is a library), no codegen/stub-sync machinery, Python 3.10+ typing.

Chosen design: Approach A + B with typed read/write surfaces

A single typed config tree is the source of truth. Reads go through a frozen, override-aware options() facade; persistent writes go through a mutable, typed config object (attribute assignment); temporary overrides go through the unchanged string-keyed option_context. The public string functions get_option/set_option are removed.

Source of truth — nested dataclasses, descriptions as attribute docstrings

Each field carries its type, default, and description. The description is an attribute docstring — a string literal on the line after the field — which is what Pylance/PyCharm surface on hover/autocomplete and what an AI reading the source sees. (field(metadata=...) is deliberately not used — invisible to IDEs. Trade-off: attribute docstrings are not readable at runtime, which is why describe_option is removed.)

The base config tree is mutable (it is the writable global); the read views and the ContextVar overlay are frozen/snapshot-based.

from dataclasses import dataclass, field

@dataclass                                    # mutable: this is the writable global base
class AggConfig:
    customer_id: str = "customers"
    """Column for the unique customer count."""
    unit_spend: str = "spend"
    """Column for total spend."""
    # ...

@dataclass
class FontConfig:
    title_size: float = 22.0
    """Font size for plot titles, in points."""
    # ...

@dataclass
class ColumnConfig:
    customer_id: str = "customer_id"
    """Name of the column holding customer IDs."""
    # ... base columns
    agg: AggConfig = field(default_factory=AggConfig)
    calc: CalcConfig = field(default_factory=CalcConfig)
    suffix: SuffixConfig = field(default_factory=SuffixConfig)

@dataclass
class RootConfig:
    column: ColumnConfig = field(default_factory=ColumnConfig)
    plot: PlotConfig = field(default_factory=PlotConfig)   # color / font / style / spacing
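To make the runtime trade-off concrete, here is a small self-contained sketch using a cut-down tree (only FontConfig, PlotConfig and RootConfig, with one field). It illustrates two things stated above: default_factory gives each RootConfig its own sub-objects, and the attribute docstring is not stored anywhere reachable at runtime.

```python
from dataclasses import dataclass, field, fields


@dataclass
class FontConfig:
    title_size: float = 22.0
    """Font size for plot titles, in points."""


@dataclass
class PlotConfig:
    font: FontConfig = field(default_factory=FontConfig)


@dataclass
class RootConfig:
    plot: PlotConfig = field(default_factory=PlotConfig)


base = RootConfig()
other = RootConfig()
other.plot.font.title_size = 18.0  # does not leak into `base`

# The string after the field is a bare expression statement: it ends up
# neither in the class docstring nor in the field metadata.
class_doc = FontConfig.__doc__ or ""
title_field = fields(FontConfig)[0]
print(base.plot.font.title_size, "points" in class_doc, dict(title_field.metadata))
```

IDEs recover the description by reading the source, which is why a runtime describe_option has nothing to read from.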

Writes — typed config object (the new persistent-set path)

config: RootConfig = _bootstrap()             # mutable global base; TOML-seeded at import; re-exported as ors.config

# at the top of a script:
import openretailscience as ors
ors.config.column.customer_id = "cust_id"     # typed, autocompleted, STATICALLY checked
ors.config.plot.font.title_size = 18.0
ors.config.plot.style.show_tab = False

A wrong-typed write (ors.config.plot.font.title_size = "oops") is a pyright error at edit time, not a runtime failure. This replaces the string set_option entirely.
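A short sketch of what "static error, not runtime failure" means in practice, using a stand-in FontConfig rather than the real tree. A plain dataclass performs no validation on assignment, so the wrong-typed write below goes through when executed; only the type checker objects.

```python
from dataclasses import dataclass


@dataclass
class FontConfig:
    title_size: float = 22.0


font = FontConfig()
font.title_size = 18.0  # accepted by the type checker and at runtime

# Flagged statically (str is not assignable to float), yet it executes
# without error because dataclasses do not check types on assignment.
font.title_size = "oops"  # type: ignore[assignment]
print(repr(font.title_size))
```

Projects that do not run pyright therefore get no protection on this path; runtime checks remain only where values arrive as strings or from TOML.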

Reads — frozen, override-aware options() facade

options() returns one typed facade over the effective config (active override else base). Intermediate nodes (.column, .plot) are typed sub-views you can bind. The ~65 derived suffix names (e.g. unit_spend_p1 = base + _ + suffix) are typed one-line properties on the views (they need cross-node access to column.suffix).

def _effective() -> RootConfig:
    return _active.get() or config

def options() -> _Options:
    return _Options(_effective())

cols = options().column          # ColumnView — replaces `cols = ColumnHelper()`
cols.agg.unit_spend_p1           # str
style = options().plot           # PlotView   — replaces `style = PlotStyleHelper()`
style.font.title_size            # float
options().plot.color.primary     # str        — replaces get_option("plot.color.primary")

Write via config, read via options(). Reading from config directly gives the configured base and ignores temporary option_context overrides — always read through options().
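One possible shape for the read views, as a sketch only: the view class names follow the issue, but their internals, the period_1 suffix field and its "p1" default are assumptions made for illustration. The views are frozen wrappers holding a reference to the effective config, and a derived name is a one-line property that reaches across to column.suffix.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass
class SuffixConfig:
    period_1: str = "p1"  # hypothetical field name and default


@dataclass
class AggConfig:
    unit_spend: str = "spend"


@dataclass
class ColumnConfig:
    agg: AggConfig = field(default_factory=AggConfig)
    suffix: SuffixConfig = field(default_factory=SuffixConfig)


@dataclass(frozen=True)
class AggView:
    _columns: ColumnConfig  # whole column node, so suffixes are reachable

    @property
    def unit_spend(self) -> str:
        return self._columns.agg.unit_spend

    @property
    def unit_spend_p1(self) -> str:
        # Derived name: base + "_" + suffix.
        return f"{self._columns.agg.unit_spend}_{self._columns.suffix.period_1}"


@dataclass(frozen=True)
class ColumnView:
    _columns: ColumnConfig

    @property
    def agg(self) -> AggView:
        return AggView(self._columns)


cols = ColumnView(ColumnConfig())
print(cols.agg.unit_spend_p1)

try:
    cols._columns = ColumnConfig()  # type: ignore[misc]
except FrozenInstanceError:
    print("views are read-only")
```

Because each property has a concrete return annotation, both cols.agg.unit_spend and cols.agg.unit_spend_p1 are plain str to a type checker.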

Temporary overrides — option_context (the sole remaining string API)

from contextlib import contextmanager
from contextvars import ContextVar
from copy import deepcopy

_active: ContextVar[RootConfig | None] = ContextVar("ors_active", default=None)

@contextmanager
def option_context(*args):                    # SAME public signature (dict form + alternating pairs)
    items = _normalize(args)
    store = deepcopy(_effective())            # mutable copy of the effective config
    for pat, val in items:
        _set_path(store, pat, val)            # validated dotted-path setattr (uses _check_type)
    token = _active.set(store)
    try:
        yield
    finally:
        _active.reset(token)

  • Persistent config (config assignment / TOML) → visible to every later read in every module and thread (module global). ✔ Goal 4
  • option_context keeps its exact signature and stacks a derived config in a ContextVar → overrides are thread/async-local, compose under nesting, and are restored via the token (fixes today's latent race). ✔ Goal 3

Internal plumbing (private)

_check_type, _set_path/_with_override (dotted-path setattr with type validation), _read_path, _bootstrap (TOML seed). These back option_context and TOML loading — the string/dynamic boundary that still needs runtime validation. They are not part of the public API.

Removed public API (breaking; no production callers for most)

The functions below are not re-exported from the package root; they are importable only from openretailscience.options.

  • get_option — replaced by typed reads via options(). Every get_option("...") site migrates to attribute access. There is no string read escape hatch left (the one dynamic read, get_named_color, becomes a typed Literal map).
  • set_option — replaced by typed config.x.y = v (and TOML). No dynamic/computed-key persistent set remains (accepted trade-off; private _set_path exists if ever needed).
  • describe_option: dropped because descriptions now live in attribute docstrings (IDE/source); the current value is read through options() attribute access.
  • reset_option — unused; option_context restores via its token, defaults come from a fresh RootConfig().
  • list_options — superseded by attribute access + autocomplete for discovery and print(config)/print(options()) (dataclass __repr__) for a full dump.

Surviving public API: options() (reads), config (typed writes), option_context (temporary overrides), and TOML loading.

Scope of work

options.py

  • Introduce nested dataclasses (RootConfig + namespace configs) with defaults and attribute-docstring descriptions; delete the three parallel dicts. Base tree mutable; read views frozen.
  • Add _check_type, _set_path/_with_override (validated dotted-path setattr), _read_path, _bootstrap.
  • Add the global config object (mutable RootConfig) and re-export it (ors.config).
  • Add _active ContextVar + _effective(); add options() returning the _Options read facade with .column/.plot sub-views and the ~65 derived-name properties.
  • Keep option_context with its current public signature (now ContextVar-backed).
  • Remove get_option, set_option, describe_option, reset_option, list_options (module-level functions and Options-class methods); remove ColumnHelper, PlotStyleHelper, AggColumns, CalcColumns (fold into internal read views).
  • Fix find_project_root return type (Path | None) and the TOML loader signature.

Call-site migration (larger now that get_option/set_option are gone)

  • All get_option("...") reads (~30 source sites; 227 occurrences incl. tests) → typed attribute access via options().
  • All set_option("...", v) → config.x.y = v.
  • ColumnHelper() → options().column; PlotStyleHelper() → options().plot (note the extra path level: style.title_size → options().plot.font.title_size, plus .style.*/.color.*/.spacing.*).
  • get_named_color (colors.py): replace get_option(f"plot.color.{color_type}") with an explicit typed map over a Literal[...] of the bounded color names — removes the only dynamic read.
  • Fetch options() (or a sub-view) at point of use inside functions, never cached at module/class scope, so reads reflect active option_context overrides.
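For the get_named_color item, the typed map could look something like the sketch below. The color names other than primary, the placeholder hex values, and the function signature are assumptions; the real function would presumably read options().plot.color rather than take the config as a parameter, which is done here only to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Hypothetical bounded set of names; the real list comes from plot.color.
ColorName = Literal["primary", "secondary"]


@dataclass
class ColorConfig:
    primary: str = "#000000"    # placeholder value
    secondary: str = "#ffffff"  # placeholder value


def get_named_color(color_type: ColorName, colors: ColorConfig) -> str:
    """Look up a color through an explicit map instead of an f-string key."""
    by_name: dict[ColorName, str] = {
        "primary": colors.primary,
        "secondary": colors.secondary,
    }
    return by_name[color_type]


print(get_named_color("primary", ColorConfig()))
print(get_args(ColorName))  # the allowed names, available for tests
```

A call with a name outside the Literal is then a static error, and get_args gives tests a way to iterate over every valid name without building string keys.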

Docs

  • Rewrite docs/getting_started/options_guide.md: remove get_option/set_option/reset_option/describe_option/list_options sections; document options() reads, config typed writes, option_context, and attribute-docstring descriptions.

Tests

  • Update tests/test_options.py: remove tests for the five removed functions; keep/adapt option_context tests (signature unchanged).
  • Migrate tests that call get_option(...) (incl. the dynamic get_option(f"plot.color.{name}") loops in tests/plots/test_tree_diagram.py) to typed access or an internal helper.
  • Add tests: typed reads via options(); typed writes via config (incl. persistence/global visibility); option_context thread/async isolation and nesting; _check_type rejects wrong-typed TOML/option_context values; derived suffix names; TOML load round-trip.

Known trade-offs (accepted)

  • Bigger migration: with no string read/write API, all get_option/set_option sites must move to options()/config. Mechanical but broad.
  • Typed write path is statically checked, not runtime-validated: config.x.y = v relies on pyright; runtime _check_type guards only the string/dynamic boundary (option_context, TOML). Accepted (matches Python norms; reads/writes are typed).
  • No dynamic persistent set (string set_option gone). Use typed config assignment or TOML; private _set_path remains if a dynamic need ever appears.
  • The ~65 derived suffix names are not reduced — they become typed one-line properties (codegen to collapse them is out of scope).
  • Descriptions live in attribute docstrings, not runtime-readable (hence no describe_option).
  • ContextVar introduces a deliberate asymmetry (persistent config/TOML is cross-thread; option_context overrides are thread/async-local) — worth a doc comment.
  • legend_bbox_to_anchor becomes a tuple (matplotlib accepts it; code treating it as a mutable list needs a tweak).

Acceptance criteria

  • pyright openretailscience/options.py reports 0 errors, and no new pyright errors at migrated call sites; a wrong-typed config write is a static error.
  • uv run pytest green; uv run ruff check . / format clean.
  • option_context (both forms) behaves exactly as before for users; persistent config/TOML changes are globally visible.
  • options() provides typed, autocompletable reads for all namespaces, override-aware at point of use; descriptions show on IDE hover.
  • get_option, set_option, describe_option, reset_option, list_options, ColumnHelper, and PlotStyleHelper are removed, with all call sites/docs/tests migrated.
