
Schema Contract

Salvador Banderas Rovira edited this page Mar 15, 2026 · 4 revisions

This page defines the minimum contract that all GitPKM implementations should follow.

1. ID Specification

Stable IDs are mandatory in all source tables and notes.

Allowed pattern

^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$

Examples:

  • person_alex
  • org_learning_hub
  • prog_career_mentorship

Rules:

  • lowercase letters and digits only (no uppercase)
  • segments separated by a single _ (no leading, trailing, or doubled underscores)
  • no spaces
  • no hyphens
  • first character must be a letter (no leading digit)
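
The pattern can be checked mechanically. The following is a minimal sketch in Python; the function name is_valid_id is hypothetical and not part of the contract, and only the regular expression comes from this page.

```python
import re

# Pattern copied verbatim from the "Allowed pattern" section of this contract.
ID_PATTERN = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$")


def is_valid_id(value: str) -> bool:
    """Return True if value matches the allowed ID pattern.

    fullmatch is used so that a trailing newline is rejected as well.
    """
    return ID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["person_alex", "Person_Alex", "person-alex", "1person", "person__alex"]:
        print(candidate, is_valid_id(candidate))
```

Note that the pattern allows a segment after the first to start with a digit (for example person_1), since only the first character of the whole ID must be a letter.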

2. Required Columns

Each CSV table must include an id column.

Entity table minimum

  • required: id, name
  • optional: additional metadata columns

Relationship table minimum

  • required: id, one or more foreign-key columns ending in _id
  • optional: relationship metadata (role, year, region, etc.)

Example:

id,person_id,program_id,role
part_alex_mentee,person_alex,prog_career_mentorship,mentee
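
A header check for these minimums might look like the sketch below. It assumes the caller already knows whether a table is an entity table or a relationship table (the contract does not say how that is decided), and the function name and message strings are hypothetical.

```python
import csv
import io


def missing_required_columns(csv_text: str, is_relationship: bool) -> list[str]:
    """Return a list of problems found in the header row of one CSV table."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    problems = []
    if "id" not in header:
        problems.append("missing required column: id")
    if is_relationship:
        # Relationship tables need at least one foreign-key column ending in _id.
        # The plain "id" column does not count, since it does not end in "_id".
        if not any(column.endswith("_id") for column in header):
            problems.append("missing foreign-key column ending in _id")
    elif "name" not in header:
        problems.append("missing required column: name")
    return problems


if __name__ == "__main__":
    example = (
        "id,person_id,program_id,role\n"
        "part_alex_mentee,person_alex,prog_career_mentorship,mentee\n"
    )
    print(missing_required_columns(example, is_relationship=True))
```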

3. Foreign-Key Naming Convention

All foreign keys must:

  • end with _id
  • reference IDs in another table
  • use the same ID format as primary IDs

Examples:

  • organization_id -> organizations.csv:id
  • person_id -> people.csv:id
  • program_id -> programs.csv:id
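
One way to check that foreign keys resolve is sketched below. The FK_TARGETS mapping is an assumption built from the three examples above (a rule-based plural lookup would not handle person/people), the function names are hypothetical, and the sample rows in the usage section are made up for illustration.

```python
import csv
import io

# Assumed mapping from foreign-key column to target table, taken from the examples above.
FK_TARGETS = {
    "organization_id": "organizations.csv",
    "person_id": "people.csv",
    "program_id": "programs.csv",
}


def load_ids(csv_text: str) -> set[str]:
    """Collect the values of the id column of one table."""
    return {row["id"] for row in csv.DictReader(io.StringIO(csv_text))}


def unresolved_foreign_keys(rel_csv_text: str, tables: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (row_id, column, value) for each foreign key with no matching target ID.

    tables maps a file name such as "people.csv" to that file's CSV text.
    Columns not listed in FK_TARGETS are skipped in this sketch.
    """
    ids_by_table = {name: load_ids(text) for name, text in tables.items()}
    problems = []
    for row in csv.DictReader(io.StringIO(rel_csv_text)):
        for column, value in row.items():
            if column not in FK_TARGETS:
                continue
            if value not in ids_by_table.get(FK_TARGETS[column], set()):
                problems.append((row["id"], column, value))
    return problems


if __name__ == "__main__":
    # Made-up sample data; person_sam is deliberately missing from people.csv.
    tables = {
        "people.csv": "id,name\nperson_alex,Alex\n",
        "programs.csv": "id,name\nprog_career_mentorship,Career Mentorship\n",
    }
    rel = (
        "id,person_id,program_id,role\n"
        "part_alex_mentee,person_alex,prog_career_mentorship,mentee\n"
        "part_sam_mentor,person_sam,prog_career_mentorship,mentor\n"
    )
    print(unresolved_foreign_keys(rel, tables))
```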

4. Markdown Contract

Entity notes should include frontmatter whose id matches the entity's ID in its source table:

---
id: person_alex
type: person
---

Recommended file naming:

  • notes/<entity_type_plural>/<id>.md

The frontmatter type should be one of the allowed entity types, which are inferred from the entity tables.

For example, with people.csv, organizations.csv, and programs.csv, the allowed types are:

  • person
  • organization
  • program
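
A note check along these lines could be sketched as follows. It makes three assumptions that are not stated in the contract: the frontmatter is simple key: value pairs (a real implementation would probably use a YAML parser), the note follows the recommended file naming so that the file stem is the expected ID, and the allowed types are the three from the example. All function names are hypothetical.

```python
from pathlib import PurePosixPath

# Allowed types from the example above; a real validator would derive these from the tables.
ALLOWED_TYPES = {"person", "organization", "program"}


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` lines between the leading pair of --- lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing --- line was found


def check_note(path: str, text: str) -> list[str]:
    """Compare frontmatter id with the file stem and type with the allowed types."""
    fields = parse_frontmatter(text)
    expected_id = PurePosixPath(path).stem
    problems = []
    if fields.get("id") != expected_id:
        problems.append(f"frontmatter id {fields.get('id')!r} does not match {expected_id!r}")
    if fields.get("type") not in ALLOWED_TYPES:
        problems.append(f"invalid frontmatter type {fields.get('type')!r}")
    return problems


if __name__ == "__main__":
    note = "---\nid: person_alex\ntype: person\n---\n\nNotes about Alex.\n"
    print(check_note("notes/people/person_alex.md", note))
```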

5. Link Contract

Accepted entity link forms in Markdown:

  • relative Markdown link: [entity_id](relative/path/to/entity_id.md)

Generators should emit links as relative Markdown paths so links resolve both locally and on GitHub.
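
A generator could build such links roughly as in the sketch below. It assumes repository-relative paths with forward slashes and a link text equal to the target file's stem; the function name entity_link is hypothetical.

```python
import posixpath


def entity_link(from_note: str, target_note: str) -> str:
    """Build a relative Markdown link from one note to another.

    Both arguments are repository-relative paths using forward slashes.
    """
    target_id = posixpath.splitext(posixpath.basename(target_note))[0]
    start_dir = posixpath.dirname(from_note) or "."
    relative = posixpath.relpath(target_note, start=start_dir)
    return f"[{target_id}]({relative})"


if __name__ == "__main__":
    print(entity_link(
        "notes/people/person_alex.md",
        "notes/programs/prog_career_mentorship.md",
    ))
```

Using posixpath rather than os.path keeps the separators as forward slashes on every platform, which is what Markdown links expect.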

6. Generated Content Contract

Scripts may only write within explicit generated markers:

<!-- GENERATED START -->
...
<!-- GENERATED END -->

Any content outside the markers is user-authored and must not be modified by generators.
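
A generator that respects the markers might be sketched like this. It assumes exactly one marker pair per file, which the contract does not state, and the function name replace_generated is hypothetical.

```python
START = "<!-- GENERATED START -->"
END = "<!-- GENERATED END -->"


def replace_generated(text: str, new_body: str) -> str:
    """Replace only the content between the markers; everything else is returned unchanged.

    Assumes exactly one marker pair per file and raises ValueError otherwise.
    """
    if text.count(START) != 1 or text.count(END) != 1:
        raise ValueError("expected exactly one GENERATED START/END pair")
    start = text.index(START) + len(START)
    end = text.index(END)
    if end < start:
        raise ValueError("GENERATED END appears before GENERATED START")
    return text[:start] + "\n" + new_body.strip("\n") + "\n" + text[end:]


if __name__ == "__main__":
    note = "Intro\n<!-- GENERATED START -->\nold\n<!-- GENERATED END -->\nOutro\n"
    print(replace_generated(note, "new"))
```

Refusing to write when the markers are missing or malformed, rather than appending a new pair, keeps user-authored content safe by default.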

7. Validation Error Policy

Validation should classify failures as:

Error (must fail CI/commit)

  • invalid ID format
  • missing required column
  • duplicate primary ID
  • unresolved foreign key
  • unresolved entity link target
  • frontmatter ID mismatch when note exists
  • invalid frontmatter type for entity note
  • invalid generated marker structure (nested, unmatched start/end)

Warning (may pass CI, review recommended)

  • note missing for an existing entity row
  • orphan note with no matching row
  • display column mismatch (for optional name helper columns)
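
The two severity levels can be encoded as a lookup table that drives the CI exit code, as in the sketch below. The check codes (such as invalid_id_format) are hypothetical names invented for this example; only the grouping into errors and warnings comes from the lists above.

```python
from enum import Enum


class Severity(Enum):
    ERROR = "error"      # must fail CI/commit
    WARNING = "warning"  # may pass CI, review recommended


# Hypothetical check codes, one per bullet in the policy above.
SEVERITY_BY_CHECK = {
    "invalid_id_format": Severity.ERROR,
    "missing_required_column": Severity.ERROR,
    "duplicate_primary_id": Severity.ERROR,
    "unresolved_foreign_key": Severity.ERROR,
    "unresolved_link_target": Severity.ERROR,
    "frontmatter_id_mismatch": Severity.ERROR,
    "invalid_frontmatter_type": Severity.ERROR,
    "invalid_generated_markers": Severity.ERROR,
    "note_missing": Severity.WARNING,
    "orphan_note": Severity.WARNING,
    "display_column_mismatch": Severity.WARNING,
}


def exit_code(findings: list[str]) -> int:
    """Return 1 if any finding is an error, otherwise 0."""
    has_error = any(SEVERITY_BY_CHECK[code] is Severity.ERROR for code in findings)
    return 1 if has_error else 0


if __name__ == "__main__":
    print(exit_code(["note_missing"]))
    print(exit_code(["note_missing", "duplicate_primary_id"]))
```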

8. Determinism Contract

Generators and validators should be deterministic:

  • stable output ordering
  • idempotent reruns (running twice should not create extra diffs)
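
Both properties usually follow from sorting before writing and rendering from the data alone (no timestamps or input-order dependence). A minimal sketch, with a hypothetical render_index function and an assumed "- " bullet style for the output:

```python
def render_index(entity_ids: list[str]) -> str:
    """Render IDs as one bullet per line, de-duplicated and sorted.

    Sorting makes the output independent of input order, so rerunning
    the generator on the same data produces byte-identical text.
    """
    return "\n".join(f"- {entity_id}" for entity_id in sorted(set(entity_ids))) + "\n"


if __name__ == "__main__":
    first = render_index(["prog_career_mentorship", "person_alex"])
    second = render_index(["person_alex", "prog_career_mentorship"])
    print(first == second)
```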

9. Compatibility Policy

When changing schema rules:

  1. update this contract first
  2. update validators and scripts
  3. provide migration notes for existing datasets
