Salvador Banderas Rovira edited this page Apr 16, 2026 · 10 revisions

GitPKM currently provides two separate workflows:

  • Local CLI editing (pkm.py)
  • Remote GitHub editing (Issue Forms + Actions)

Local CLI Editing

The local CLI provides commands for creating and updating entity rows, creating relationship rows, bulk-importing rows from CSV, and re-rendering notes.

Current commands:

  • python pkm.py new <dataset> <name> [--id <entity_id>] [--columns col1,col2] [--set key=value]
  • python pkm.py update <dataset> <id> --set key=value
  • python pkm.py link <source_id> <target_id> [--table <relation_table>] [--id <row_id>] [--role <role>] [--set key=value]
  • python pkm.py bulk-import --input <file.csv> [--mapping <name|path|auto>] [--mappings-dir <dir>] [--validate-only] [--apply]
  • python pkm.py mappings list [--mappings-dir <dir>]
  • python pkm.py mappings validate --mapping <name|path|auto> [--input <file.csv>] [--mappings-dir <dir>]
  • python pkm.py reprocess-notes

Dataset Naming Rule

Dataset names are exact.

The tooling does not singularize or pluralize them for you.

Examples:

  • python pkm.py new person "Alex" writes to data/person.csv
  • python pkm.py new people "Alex" writes to data/people.csv
  • python pkm.py new program "Career Mentorship" writes to data/program.csv
  • python pkm.py new programs "Career Mentorship" writes to data/programs.csv

This same exact-name rule also affects:

  • generated note type
  • inferred foreign-key column names
  • default relationship row IDs
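
The exact-name rule can be pictured with a minimal sketch. The helper names `dataset_csv_path` and `foreign_key_column` are hypothetical and are not taken from pkm.py; they only mirror the `data/<dataset>.csv` and `<dataset>_id` patterns shown on this page.

```python
from pathlib import PurePosixPath


def dataset_csv_path(dataset: str) -> str:
    """Return the CSV path for a dataset, using the name exactly as typed."""
    # No singular/plural normalization: "person" and "people" stay distinct.
    return str(PurePosixPath("data") / f"{dataset}.csv")


def foreign_key_column(dataset: str) -> str:
    """Return the foreign-key column name inferred from the exact dataset name."""
    return f"{dataset}_id"
```

Under these assumptions, `person` and `people` resolve to two different files and two different foreign-key columns (`person_id` and `people_id`), so it pays to pick one spelling per dataset and keep it.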

new

Create a new entity row and then run note/index automation.

Syntax:

python pkm.py new <dataset> <name> [--id <entity_id>] [--columns col1,col2] [--set key=value]

Examples:

python pkm.py new person "Alex"
python pkm.py new people "Alex"
python pkm.py new program "Career Mentorship"
python pkm.py new vip_people "Iune Banderas" --columns status,rarity
python pkm.py new people "Alex" --set email=alex@example.com --set status=active

Behavior:

  • creates data/<dataset>.csv if missing
  • requires entity tables to have at least id,name
  • when creating a missing dataset, columns from --columns are added
  • when creating a missing dataset, field keys from --set are also added as new columns
  • generates a default ID like <dataset>_<slug> unless --id is provided
  • allows setting any existing entity-table column with repeated --set key=value
  • creates a baseline note if one does not exist
  • reruns generation and index automation
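
As an illustration of the default ID rule, here is a small sketch. The `slugify` rule below (lowercase, with non-alphanumeric runs collapsed to underscores) is an assumption chosen to reproduce IDs such as `program_career_mentorship` seen on this page; the real slug logic in pkm.py may differ, for example in how it treats accented characters.

```python
import re
from typing import Optional


def slugify(text: str) -> str:
    """Assumed slug rule: lowercase, non-alphanumeric runs become one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def default_entity_id(dataset: str, name: str, explicit_id: Optional[str] = None) -> str:
    """Return the explicit ID when given (--id), else <dataset>_<slug>."""
    if explicit_id:
        return explicit_id
    return f"{dataset}_{slugify(name)}"
```

Because the dataset name is used verbatim as the prefix, `new person "Alex"` and `new people "Alex"` would yield different IDs in this sketch.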

New Dataset Bootstrap

When the dataset does not exist yet:

  • --columns defines initial columns
  • keys used by --set are also created as columns

When the dataset already exists:

  • --set and --columns keys must match existing columns
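
The two cases above can be sketched as follows. The function name `resolve_columns` and the resulting column order (`id`, `name`, then `--columns`, then `--set` keys) are assumptions made for illustration; only the rules themselves come from this page.

```python
from typing import Iterable, List, Optional


def resolve_columns(
    existing: Optional[List[str]],
    columns_opt: Iterable[str],
    set_keys: Iterable[str],
) -> List[str]:
    """Return the CSV header to use for a `new` command.

    existing is None when data/<dataset>.csv does not exist yet.
    """
    requested = list(columns_opt) + list(set_keys)
    if existing is None:
        # New dataset: start from the required id,name pair and append extras.
        header = ["id", "name"]
        for col in requested:
            if col not in header:
                header.append(col)
        return header
    # Existing dataset: every requested key must already be a column.
    unknown = [col for col in requested if col not in existing]
    if unknown:
        raise ValueError(f"unknown columns: {', '.join(unknown)}")
    return list(existing)
```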

Example with explicit ID:

python pkm.py new person "Alex" --id person_alex

update

Update an existing entity row by ID.

Syntax:

python pkm.py update <dataset> <id> --set key=value

Examples:

python pkm.py update people people_alex --set email=alex@example.com
python pkm.py update game_disc disc_cusa --set game_title_id=game_title_13_sentinels --set status=released

Behavior:

  • requires the target dataset and ID to already exist
  • rejects unknown columns
  • supports updating in-table foreign-key columns such as *_id
  • reruns generation and index automation when data changes
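
A minimal sketch of these rules, using only the standard `csv` module, is shown below. `update_row` is a hypothetical helper, not the pkm.py implementation; it works on CSV text in memory and returns a `changed` flag that a caller could use to decide whether automation needs to rerun.

```python
import csv
import io
from typing import Dict, Tuple


def update_row(csv_text: str, row_id: str, changes: Dict[str, str]) -> Tuple[str, bool]:
    """Apply --set style changes to the row with the given id."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    rows = list(reader)

    # Reject columns that are not already in the table header.
    unknown = [key for key in changes if key not in fieldnames]
    if unknown:
        raise ValueError(f"unknown columns: {', '.join(unknown)}")

    # The target row must already exist.
    target = next((row for row in rows if row.get("id") == row_id), None)
    if target is None:
        raise KeyError(f"id not found: {row_id}")

    changed = any(target[key] != value for key, value in changes.items())
    target.update(changes)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue(), changed
```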

link

Create a relationship row between two existing entity IDs.

Syntax:

python pkm.py link <source_id> <target_id> [--table <relation_table>] [--id <row_id>] [--role <role>] [--set key=value]

Examples:

python pkm.py link person_alex program_career_mentorship --table participation --role mentor
python pkm.py link person_alex org_learning_hub --table affiliation --set status=active

Behavior:

  • verifies both IDs already exist in CSV data
  • uses the exact dataset names of those IDs to derive foreign-key columns
  • writes to the explicit relationship table passed with --table, or tries to infer one existing matching table if --table is omitted
  • appends any extra fields from --role or repeated --set key=value
  • reruns generation and index automation

If the relation table does not yet exist, the CLI creates it with:

  • id
  • <source_dataset>_id
  • <target_dataset>_id
  • any extra fields you passed

Example created row:

id,person_id,program_id,role
participation_person_alex_program_career_mentorship,person_alex,program_career_mentorship,mentor
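
The row above can be reproduced with a short sketch. `build_link_row` is a hypothetical helper; the default ID pattern `<table>_<source_id>_<target_id>` is inferred from the example row and should be treated as an assumption rather than a documented guarantee.

```python
from typing import Dict, Optional


def build_link_row(
    table: str,
    source_dataset: str,
    source_id: str,
    target_dataset: str,
    target_id: str,
    row_id: Optional[str] = None,
    role: Optional[str] = None,
    extra: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Build one relationship row as an ordered dict of column -> value."""
    row = {
        # Assumed default ID pattern, inferred from the example row.
        "id": row_id or f"{table}_{source_id}_{target_id}",
        # Foreign-key columns come from the exact dataset names.
        f"{source_dataset}_id": source_id,
        f"{target_dataset}_id": target_id,
    }
    if role is not None:
        row["role"] = role
    row.update(extra or {})
    return row
```

Calling it with `table="participation"`, the `person` and `program` datasets, and `role="mentor"` yields the columns `id,person_id,program_id,role` in that order.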

When Link Tables Are Useful

Use direct foreign keys on entity tables for simple one-to-one and one-to-many relationships.

Use link tables when the relation itself needs data.

Typical cases:

  • many-to-many relationships
  • role/date/status/source/confidence metadata
  • relationships that need their own stable ID
  • relationships involving more than two entities

reprocess-notes

Re-render all note headers and generated blocks from current CSV tables.

Syntax:

python pkm.py reprocess-notes

Use this after schema or data changes when you want to refresh note frontmatter and generated sections in one pass.

bulk-import

Import rows from a source CSV using a reusable JSON mapping.

Syntax:

python pkm.py bulk-import --input <file.csv> [--mapping <name|path|auto>] [--mappings-dir <dir>] [--validate-only] [--apply]

Examples:

python pkm.py bulk-import --input ./import/people_programs.csv --mapping people_programs
python pkm.py bulk-import --input ./import/people_programs.csv --mapping auto --mappings-dir ./schema/import_mappings
python pkm.py bulk-import --input ./import/people_programs.csv --mapping ./schema/import_mappings/people_programs.json --apply

Behavior:

  • default mode is dry-run (no files are written)
  • --apply writes CSV changes and reruns full automation
  • supports reusable mappings with ordered entity creation and optional relation rows
  • supports ref:<entity_key> in mapping values so relation foreign keys can reuse IDs generated in the same source row
  • upserts by id (insert, update, or unchanged)
  • --mapping can be a mapping stem, a direct file path, or auto
  • --mappings-dir defaults to schema/import_mappings
  • auto selects the single mapping whose match.source_columns and referenced source columns fit the input CSV header set
  • --validate-only validates the mapping and source CSV compatibility without processing rows
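
The upsert outcomes (insert, update, unchanged) can be illustrated with a small classifier. This is a sketch only: `classify_upsert` is a hypothetical name, and comparing only the columns present in the incoming row is an assumption about how "unchanged" is decided.

```python
from typing import Dict


def classify_upsert(existing_rows: Dict[str, Dict[str, str]], incoming: Dict[str, str]) -> str:
    """Classify an incoming row against existing rows keyed by id."""
    current = existing_rows.get(incoming["id"])
    if current is None:
        return "insert"
    # Assumption: only columns supplied by the import are compared.
    if any(current.get(col) != value for col, value in incoming.items()):
        return "update"
    return "unchanged"
```

In a dry run, a report built from these three labels is enough to preview what `--apply` would write without touching any file.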

Mapping files:

  • working files: schema/import_mappings/*.json
  • example files: schema/import_mappings/*.json
  • machine-readable contract: schema/import_mapping.contract.schema.json

Supported mapping value forms:

  • source column name, for example person_email
  • ref:<entity_key>, for example ref:person
  • const:<literal>, for example const:active

Supported ID template tokens:

  • {column_name} (slugified source column value)
  • {slug:column_name} (explicit slug token)
  • {ref:entity_key} (generated ID from current row context)

Recommended pattern:

  • put one import mapping per file
  • name the file after the import shape, such as people_programs.json
  • include match.source_columns so auto can pick the right file when there is exactly one match
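
To make the "exactly one match" rule concrete, here is a hedged sketch of how `auto` resolution could work. It assumes that a mapping "fits" when every entry of `match.source_columns` is present in the input CSV header, and it skips mappings that declare no `match.source_columns`; the real command also considers the source columns referenced inside the mapping, which this sketch leaves out. `select_auto_mapping` is a hypothetical name.

```python
from typing import Dict, Iterable


def select_auto_mapping(mappings: Dict[str, dict], header: Iterable[str]) -> str:
    """Return the name of the single mapping whose source columns fit the header."""
    header_set = set(header)
    matches = []
    for name, mapping in mappings.items():
        required = mapping.get("match", {}).get("source_columns") or []
        # Assumption: mappings without declared source columns are not candidates.
        if required and set(required) <= header_set:
            matches.append(name)
    if len(matches) != 1:
        raise ValueError(f"expected exactly one matching mapping, found {len(matches)}: {matches}")
    return matches[0]
```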

mappings list

List all mapping files in the mappings directory and show whether each one is valid.

Syntax:

python pkm.py mappings list [--mappings-dir <dir>]

Example:

python pkm.py mappings list --mappings-dir ./schema/import_mappings

mappings validate

Validate one mapping file by name or path. Optionally validate against a real source CSV header.

Syntax:

python pkm.py mappings validate --mapping <name|path|auto> [--input <file.csv>] [--mappings-dir <dir>]

Examples:

python pkm.py mappings validate --mapping people_programs
python pkm.py mappings validate --mapping people_programs --input ./import/people_programs.csv
python pkm.py mappings validate --mapping auto --input ./import/people_programs.csv --mappings-dir ./schema/import_mappings

Note:

  • --mapping auto requires --input so the command can resolve a unique mapping by source headers.

Mapping Contract

Mapping JSON minimal contract:

  • entities: list (required)
  • relations: list (required)

Entity object contract:

  • table: target entity dataset/table (required)
  • name_column: source CSV column for entity name (required)
  • id_template: ID template string (required)
  • key: optional reference key (defaults to table)
  • fields: optional object of <target_column>: <source_spec>

Relation object contract:

  • table: target relation dataset/table (required)
  • fields: object of <target_column>: <source_spec> (required, non-empty)
  • id_template: optional ID template string

Optional matcher object:

  • match.source_columns: optional list of source columns used by --mapping auto
  • required_csv_columns: optional alias for the same header-matching behavior
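
For orientation, the sketch below shows one mapping that would satisfy the contract as described so far, written as a Python dict that can be dumped to JSON, plus a minimal structural check. The column names, table names, and ID templates are invented for illustration, and `check_mapping` is a hypothetical helper; the authoritative definition remains schema/import_mapping.contract.schema.json.

```python
import json
from typing import List

# Illustrative mapping only; all names below are assumptions.
EXAMPLE_MAPPING = {
    "match": {"source_columns": ["person_name", "person_email", "program_name"]},
    "entities": [
        {
            "key": "person",
            "table": "people",
            "name_column": "person_name",
            "id_template": "people_{person_name}",
            "fields": {"email": "person_email", "status": "const:active"},
        },
        {
            "key": "program",
            "table": "programs",
            "name_column": "program_name",
            "id_template": "programs_{program_name}",
        },
    ],
    "relations": [
        {
            "table": "participation",
            "fields": {"people_id": "ref:person", "programs_id": "ref:program"},
        }
    ],
}


def check_mapping(mapping: dict) -> List[str]:
    """Return a list of contract problems; an empty list means the shape looks valid."""
    problems = []
    entities = mapping.get("entities")
    relations = mapping.get("relations")
    if not isinstance(entities, list):
        problems.append("entities: must be a list")
        entities = []
    if not isinstance(relations, list):
        problems.append("relations: must be a list")
        relations = []
    for i, entity in enumerate(entities):
        for key in ("table", "name_column", "id_template"):
            if not entity.get(key):
                problems.append(f"entities[{i}].{key}: required")
    for i, relation in enumerate(relations):
        if not relation.get("table"):
            problems.append(f"relations[{i}].table: required")
        fields = relation.get("fields")
        if not isinstance(fields, dict) or not fields:
            problems.append(f"relations[{i}].fields: must be a non-empty object")
    return problems


if __name__ == "__main__":
    print(json.dumps(EXAMPLE_MAPPING, indent=2))
    print(check_mapping(EXAMPLE_MAPPING))
```

A real file would be saved as something like schema/import_mappings/people_programs.json and checked with `python pkm.py mappings validate --mapping people_programs`.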

Source spec forms:

  • <column_name> reads directly from source CSV
  • ref:<entity_key> uses an entity ID resolved earlier in the same row
  • const:<literal> writes a constant value

Template token forms:

  • {column_name} slugifies the source column value
  • {slug:column_name} explicit slug token
  • {ref:entity_key} injects a resolved entity ID from current row context
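
The source spec and template token forms can be sketched in a few lines. This is an illustrative reading of the rules above, not the pkm.py implementation: `resolve_spec`, `render_id`, and the `slugify` rule (lowercase, non-alphanumeric runs collapsed to underscores) are assumptions.

```python
import re
from typing import Dict

# Matches {column_name}, {slug:column_name} and {ref:entity_key}.
TOKEN = re.compile(r"\{(?:(slug|ref):)?([^{}]+)\}")


def slugify(text: str) -> str:
    """Assumed slug rule: lowercase, non-alphanumeric runs become one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def resolve_spec(spec: str, row: Dict[str, str], refs: Dict[str, str]) -> str:
    """Resolve one <source_spec> against the current source row."""
    if spec.startswith("ref:"):
        return refs[spec[len("ref:"):]]   # entity ID resolved earlier in this row
    if spec.startswith("const:"):
        return spec[len("const:"):]       # literal value
    return row[spec]                      # plain source column


def render_id(template: str, row: Dict[str, str], refs: Dict[str, str]) -> str:
    """Expand template tokens into a concrete ID."""
    def expand(match: "re.Match[str]") -> str:
        kind, name = match.group(1), match.group(2)
        if kind == "ref":
            return refs[name]
        # Both {column_name} and {slug:column_name} slugify the column value.
        return slugify(row[name])
    return TOKEN.sub(expand, template)
```

Entities are resolved in order, so a relation or a later entity can only use `ref:` keys for entities that appear earlier in the same mapping.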

Remote GitHub Editing

Use GitHub Issue Forms when editing from the GitHub web UI.

GitHub Issue Forms

For remote editing in GitHub, use issue forms instead of manual CSV edits.

Available forms:

  • Add Entity
  • Add Relationship
  • Update Entity

Workflow behavior:

  • each form adds a label (pkm:add-entity or pkm:add-link)
  • if labels do not exist yet, the workflow creates them automatically
  • each issue request is processed once on open (no duplicate run from self-applied labels)
  • automation is gated per repository by the file .github/pkm-issue-ops.enabled
  • request type is detected by label, issue title prefix, or form body sections
  • the workflow parses the issue body
  • it runs the equivalent CLI command (pkm.py new, pkm.py link, or pkm.py update)
  • the workflow runs the full automation pipeline (generate_pages, build_indexes, update_readme_directory, validate)
  • changes are committed directly to the default branch
  • processed issues are automatically closed
  • if the request type cannot be recognized, the workflow adds an explanatory issue comment instead of silently skipping the request
  • only repository maintainers (OWNER, MEMBER, COLLABORATOR) may trigger the automation

Repository targeting:

  • Issue forms are visible only from the repository default branch.
  • Official/code repository: keep the issue templates off the default branch.
  • Implementation repositories: put the issue templates on the default branch if you want the forms visible there.
  • If you prefer a separate implementation branch, you can make that branch the default branch for that repository.
  • The .github/pkm-issue-ops.enabled file only gates automation, not form visibility.

Recommended setup used by this project:

  • Keep main as the official branch without issue-form templates.
  • Keep issue-ops as the implementation branch with issue templates and workflow files.
  • Set the implementation repository default branch to issue-ops so forms are visible there.

Sync workflow:

  • Merge or rebase issue-ops on top of main after each stable release.
  • Prefer merge if you want lower maintenance risk for workflow/template history.
  • Prefer rebase if you require linear history and can resolve conflicts consistently.

Permission requirement:

  • the workflow needs write access to repository contents and issues
  • if the default branch is protected, allow this workflow to push or use an exception policy

Runtime note:

  • workflow sets FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true and uses actions/github-script@v8.

Field rules:

  • Additional Fields must be one key=value pair per line
  • for existing datasets, keys must match existing CSV columns
  • for new datasets, columns can be defined via Dataset Columns and/or derived from Additional Fields keys
  • source and target IDs in relationship forms must already exist
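
The Additional Fields format can be illustrated with a small parser. `parse_additional_fields` is a hypothetical helper, not the workflow's actual parsing code; skipping blank lines and trimming whitespace around keys and values are assumptions.

```python
from typing import Dict


def parse_additional_fields(text: str) -> Dict[str, str]:
    """Parse one key=value pair per line into a dict."""
    fields: Dict[str, str] = {}
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {line_no}: expected key=value, got {raw!r}")
        # Only the first '=' splits, so values may themselves contain '='.
        fields[key.strip()] = value.strip()
    return fields
```

For an existing dataset, the resulting keys would then still need to be checked against the CSV header, as the rules above require.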
