Automation and Validation
Automation keeps the PKM consistent while preserving manual writing.
Validation and schema rules are defined in Schema Contract.
Automation behavior is configured in:
- `schema/automation.json`
- `schema/automation.example.json` (copy/adapt as a starting point)
- Run the full automation pipeline in the correct order.
- Stop immediately if any step fails.
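The ordered, fail-fast behavior described above can be sketched as follows. This is a minimal illustration, not the actual contents of `run_all.py`: the `STEPS` list and `run_pipeline` function are hypothetical names, and the step order is taken from the explicit sequence documented later on this page.

```python
import subprocess
import sys

# Hypothetical step list; the order mirrors the documented explicit sequence.
STEPS = [
    "scripts/automation/generate_pages.py",
    "scripts/automation/build_indexes.py",
    "scripts/quality/validate.py",
]


def run_pipeline(commands):
    """Run each command in order; return the first non-zero exit code, or 0."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Stop immediately: later steps never start after a failure.
            print(f"step failed: {' '.join(command)}", file=sys.stderr)
            return result.returncode
    return 0
```

From the repository root, a caller could run `sys.exit(run_pipeline([[sys.executable, step] for step in STEPS]))` so the failing step's exit code is passed on to the shell, hook, or CI job.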
Run it from repository root:
python scripts/automation/run_all.py
Portable launcher (recommended):
bash scripts/runtime/pkm_python.sh scripts/automation/run_all.py

- Create missing baseline entity notes from CSV entity tables.
- Render in-note generated blocks in place.
- Preserve all non-generated note content.
Supported generated directives:
- `header`
- `list:<table_name>`
- `table:<table_name>`
Run it from repository root:
python scripts/automation/generate_pages.py

- Build index pages from `indexes` config in `schema/automation.json`.
- Auto-build default entity indexes for every entity table found in `data/*.csv`.
  - Output pattern: `notes/indexes/all_<entity_table>.md`
  - Format: markdown table without ID columns; `name` is linked to the entity note; `*_id` values are resolved to linked `*_name` values
- The same table rendering rules apply to explicit `entity_table` indexes declared in `schema/automation.json`.
- Keep output deterministic (stable ordering).
- Remove configured outputs when source tables are missing or empty (default behavior).
- If an auto index output name collides with an explicit `schema/automation.json` index output, the explicit config wins and the auto index is skipped.
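The collision rule and the stable-ordering requirement can be illustrated with a small planning function. This is a sketch under assumptions: `plan_index_outputs` and its two inputs are hypothetical and do not reflect how `build_indexes.py` is actually structured.

```python
def plan_index_outputs(entity_tables, explicit_outputs):
    """Return a sorted list of (output_path, source) pairs.

    entity_tables: table names discovered from data/*.csv (hypothetical input).
    explicit_outputs: output paths declared in schema/automation.json
    (hypothetical input).
    """
    plan = {path: "explicit" for path in explicit_outputs}
    for table in entity_tables:
        auto_path = f"notes/indexes/all_{table}.md"
        # Explicit config wins: the auto index is skipped on a name collision.
        plan.setdefault(auto_path, "auto")
    # Sorting makes the plan independent of file discovery order.
    return sorted(plan.items())
```

Because the result is sorted by output path, two runs over the same inputs yield the same plan regardless of the order in which tables are discovered.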
Run it from repository root:
python scripts/automation/build_indexes.py
Example index outputs when matching config and data exist:
- `notes/indexes/all_programs.md`
- `notes/indexes/program_mentors.md`
- `notes/indexes/program_mentees.md`
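The index table format (no ID columns, linked `name`, `*_id` values resolved to linked names) might look like the sketch below. The function name, the shape of `lookups`, and the `../<table>/<id>.md` link layout are assumptions made for illustration only.

```python
def render_index_table(rows, table, lookups):
    """Render rows (dicts) of an entity table as a markdown table.

    lookups maps a foreign-key column such as "mentor_id" to a tuple of
    (target_table, {id: name}); this shape is an assumption of the sketch.
    """
    if not rows:
        return ""
    # The entity's own ID column is omitted from the output.
    columns = [c for c in rows[0] if c != "id"]
    # Foreign-key columns are shown under a *_name header.
    headers = [c[:-3] + "_name" if c.endswith("_id") else c for c in columns]
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    # Sort by ID so the output is stable across runs.
    for row in sorted(rows, key=lambda r: r["id"]):
        cells = []
        for column in columns:
            value = row[column]
            if column == "name":
                cells.append(f"[{value}](../{table}/{row['id']}.md)")
            elif column.endswith("_id") and column in lookups:
                target_table, names = lookups[column]
                label = names.get(value, value)
                cells.append(f"[{label}](../{target_table}/{value}.md)")
            else:
                cells.append(value)
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines) + "\n"
```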
- Verify foreign-key IDs exist.
- Verify wiki links target valid IDs.
- Verify display columns (if used) match referenced names.
- Verify required columns exist.
- Verify generated block markers are structurally valid.
Validation should report:
- errors: fail CI/commit
- warnings: pass with review recommendations
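As an illustration of the error/warning split, the sketch below checks one foreign-key column and its optional display column. The function names are hypothetical, and treating a display-name mismatch as a warning rather than an error is an assumption of this sketch; the actual severity is decided by `validate.py`.

```python
def validate_foreign_keys(rows, fk_column, names, display_column=None):
    """Return (errors, warnings) for one foreign-key column.

    rows: dicts as produced by csv.DictReader.
    names: {id: name} for the referenced table.
    """
    errors, warnings = [], []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        ref = row.get(fk_column, "")
        if ref not in names:
            errors.append(f"line {line_no}: {fk_column}={ref!r} not found")
        elif display_column and row.get(display_column) != names[ref]:
            # Assumption: a stale display name is only a warning here.
            warnings.append(
                f"line {line_no}: {display_column}="
                f"{row.get(display_column)!r} != {names[ref]!r}"
            )
    return errors, warnings


def exit_code(errors):
    """Errors fail the run; warnings alone still pass."""
    return 1 if errors else 0
```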
Run it from repository root:
python scripts/quality/validate.py

Never rewrite manual prose outside explicit generated blocks.
Use directive-bearing markers:
<!-- GENERATED START: header -->
# Entity Name
<!-- GENERATED END -->
## Related Items
<!-- GENERATED START: list:programs -->
- [prog_career_mentorship](../programs/prog_career_mentorship.md)
<!-- GENERATED END -->
## Related Items Table
<!-- GENERATED START: table:programs -->
| id | name |
| --- | --- |
| [prog_career_mentorship](../programs/prog_career_mentorship.md) | Career Mentorship |
<!-- GENERATED END -->
Only the content between markers is script-managed.
Generated links use relative Markdown paths so they resolve in local Markdown preview and on GitHub.
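One way to rewrite only the content between markers is a regular-expression substitution that keeps both markers and everything outside them untouched. This is a minimal sketch, not the repository's implementation; `render_blocks` and the `render` callback are hypothetical names.

```python
import re

# Matches one generated block: start marker, body, end marker.
BLOCK = re.compile(
    r"(<!-- GENERATED START: (?P<directive>[^>]+?) -->\n)"
    r"(?P<body>.*?)"
    r"(<!-- GENERATED END -->)",
    re.DOTALL,
)


def render_blocks(text, render):
    """Replace the body of each generated block with render(directive).

    Text outside the markers is returned unchanged.
    """
    def replace(match):
        body = render(match.group("directive")).rstrip("\n") + "\n"
        # Group 1 is the start marker line, group 4 is the end marker.
        return match.group(1) + body + match.group(4)

    return BLOCK.sub(replace, text)
```

Running the function a second time with the same `render` callback returns identical text, which is the property the CI idempotency check relies on.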
If older notes still contain wiki links, migrate them with:
python scripts/automation/migrate_wikilinks.py

Preferred one-command flow:
bash scripts/runtime/pkm_python.sh scripts/automation/run_all.py
Equivalent explicit sequence:
python scripts/automation/generate_pages.py
python scripts/automation/build_indexes.py
python scripts/quality/validate.py
If validation fails, block commit until fixed.
Use versioned hooks in this repository:
bash scripts/setup/install_hooks.sh
This configures:
- `core.hooksPath=.githooks`
- `.githooks/pre-commit` to run `scripts/runtime/pkm_python.sh scripts/automation/run_all.py`
Run the same validation and generation checks in CI to guarantee reproducibility for contributors.
This repository includes a GitHub Actions workflow:
.github/workflows/pkm-check.yml
It sets up Python 3.11 and runs the pipeline twice. The second run must produce no tracked file changes.
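The workflow itself relies on `git diff` for this check. To reproduce the same idea locally without git, one could compare file hashes before and after the second run. The sketch below uses that swapped-in technique; `snapshot` and `changed_paths` are hypothetical helpers, not part of the repository.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file's relative path to a SHA-256 digest of its bytes."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_paths(before, after):
    """Return the sorted paths that were added, removed, or modified."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Taking a snapshot after the first pipeline run and another after the second, an empty `changed_paths` result means the pipeline was idempotent for that directory.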
Pipeline command:
bash scripts/runtime/pkm_python.sh scripts/automation/run_all.py
Idempotency guard:
git diff --exit-code
git diff --cached --exit-code

Runtime behavior can be configured in .env:
PKM_CONDA_ENV=
PKM_PYTHON_BIN=python3
- `PKM_CONDA_ENV` empty: use system Python
- `PKM_CONDA_ENV` set: helper checks env is active or exists, then runs with `conda run -n <env>`
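The selection logic can be modeled in a few lines. The real helper is the shell script `scripts/runtime/pkm_python.sh`; the Python function below is only an illustrative model of the two documented cases. The fallback to `python3` when `PKM_PYTHON_BIN` is unset and the use of `python` inside the conda environment are assumptions.

```python
def build_command(script, env):
    """Return the argv list used to run `script` for the given settings.

    env: mapping of settings, e.g. os.environ or values parsed from .env.
    """
    conda_env = env.get("PKM_CONDA_ENV", "").strip()
    # Assumption: fall back to python3 when PKM_PYTHON_BIN is unset or empty.
    python_bin = env.get("PKM_PYTHON_BIN", "").strip() or "python3"
    if conda_env:
        # Assumption: inside the conda environment the interpreter is `python`.
        return ["conda", "run", "-n", conda_env, "python", script]
    # Empty PKM_CONDA_ENV: use the system interpreter directly.
    return [python_bin, script]
```

This model omits the helper's check that the environment is active or exists.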
schema/automation.json is intentionally empty in the starter template.
- Baseline note generation still works from entity tables in `data/`
- No `indexes`: explicit index generation is skipped, but default auto entity indexes are still generated
Add index rules there when you want generated index pages.
Create a filtered public snapshot:
python scripts/export/export_public_snapshot.py
Configure redaction IDs in:
export/private_ids.txt
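How redaction might work can be sketched as follows. Everything here is an assumption made for illustration: the file format (one ID per line, with blank lines and `#` comments ignored), the helper names, and the rule that a row is dropped when its own `id` or any `*_id` reference is private. The export script may behave differently.

```python
def load_private_ids(text):
    """Parse redaction IDs from file contents (assumed: one ID per line)."""
    ids = set()
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if line and not line.startswith("#"):
            ids.add(line)
    return ids


def filter_public_rows(rows, private_ids):
    """Keep rows whose id and *_id references are all non-private."""
    public = []
    for row in rows:
        referenced = [row.get("id", "")]
        referenced += [value for key, value in row.items() if key.endswith("_id")]
        if not any(value in private_ids for value in referenced):
            public.append(row)
    return public
```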