
comses/codemeticulous

 
 


Warning

codemeticulous is in an early state of development and things are subject to change. Refer to the table below to see currently supported formats and conversions.

codemeticulous is a Python library and command-line utility for working with different metadata standards for software. It provides several Pydantic models that mirror metadata schemas, giving developers simple validation, (de)serialization, and type safety.

To convert between standards, the library uses an extension of CodeMeta, called CanonicalCodeMeta, as a canonical data model or central "hub" representation, together with conversion logic between it and each supported standard. This design allows conversion between any two formats without implementing a separate bridge for every pair of formats. CodeMeta was chosen because it is the most exhaustive and provides crosswalk definitions to other formats. Some data loss can still occur, so the extension is needed to fill schema gaps and resolve ambiguity. Note that CanonicalCodeMeta is not a proposed standard, but an internal data model used by this library.
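The hub design described above can be illustrated with a minimal sketch. It uses plain dictionaries rather than the library's Pydantic models, and every name in it (`TO_HUB`, `FROM_HUB`, `convert_via_hub`, and the four converter functions) is hypothetical; it is not the library's implementation and covers only a title and an author list.

```python
from typing import Any, Callable, Dict

# Stand-in for the hub model: a plain dict with CodeMeta-style property names.
Canonical = Dict[str, Any]


def cff_to_canonical(data: Dict[str, Any]) -> Canonical:
    """Map a tiny subset of a CFF document onto the hub representation."""
    return {
        "name": data["title"],
        "author": [
            {"givenName": a["given-names"], "familyName": a["family-names"]}
            for a in data.get("authors", [])
        ],
    }


def canonical_to_cff(hub: Canonical) -> Dict[str, Any]:
    """Map the hub representation onto a tiny subset of CFF."""
    return {
        "cff-version": "1.2.0",
        "title": hub["name"],
        "authors": [
            {"given-names": a["givenName"], "family-names": a["familyName"]}
            for a in hub.get("author", [])
        ],
    }


def codemeta_to_canonical(data: Dict[str, Any]) -> Canonical:
    """The hub extends CodeMeta, so this direction is close to a copy."""
    return {"name": data["name"], "author": list(data.get("author", []))}


def canonical_to_codemeta(hub: Canonical) -> Dict[str, Any]:
    """Map the hub representation onto a tiny subset of CodeMeta."""
    return {
        "@type": "SoftwareSourceCode",
        "name": hub["name"],
        "author": [{"@type": "Person", **a} for a in hub.get("author", [])],
    }


# Hypothetical registries: one converter per format and direction.
TO_HUB: Dict[str, Callable[[Dict[str, Any]], Canonical]] = {
    "cff": cff_to_canonical,
    "codemeta": codemeta_to_canonical,
}
FROM_HUB: Dict[str, Callable[[Canonical], Dict[str, Any]]] = {
    "cff": canonical_to_cff,
    "codemeta": canonical_to_codemeta,
}


def convert_via_hub(source: str, target: str, data: Dict[str, Any]) -> Dict[str, Any]:
    """Convert between any two registered formats by passing through the hub."""
    return FROM_HUB[target](TO_HUB[source](data))


cff_doc = {
    "title": "My Project",
    "authors": [{"given-names": "Dale", "family-names": "Earnhardt"}],
}
result = convert_via_hub("cff", "codemeta", cff_doc)
round_trip = convert_via_hub("codemeta", "cff", result)
```

With N formats, this arrangement needs 2N converter functions (one in each direction per format) instead of N(N-1) direct bridges, and adding a format only means registering one new pair.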

Feature Roadmap

Schema               | Pydantic model | Backward-compatible with [1]  | Convert to | Convert from
CodeMeta             | v3 [2]         | v2                            |            |
Datacite             | 4.6            | 4.0, 4.1, 4.2, 4.3, 4.4, 4.5  |            |
Citation File Format | 1.2.0          |                               |            |
GitHub Repository    | 2022-11-28     |                               |            |
Zenodo?              |                |                               |            |
...                  |                |                               |            |
[1] Lists the versions that can be safely used as input. Output will always use the specified version. For example, the CodeMetaV3 model will accept v2 property names and automatically change them to v3 equivalents.

[2] The CodeMeta model is currently implemented as a pydantic v1 model, due to a heavy reliance on pydantic_schemaorg, which has not been fully updated.
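The backward compatibility described in footnote [1] amounts to normalizing older property names on input. The sketch below shows one way such a step could look on a plain dictionary; it is not the library's code. The function name `upgrade_property_names` and the `V2_TO_V3` table are hypothetical, and the two renames listed are an assumption about the CodeMeta v2 to v3 changes that should be checked against the CodeMeta release notes.

```python
from typing import Any, Dict

# Assumed v2 -> v3 property renames (verify against the CodeMeta release notes).
V2_TO_V3 = {
    "contIntegration": "continuousIntegration",
    "embargoDate": "embargoEndDate",
}


def upgrade_property_names(data: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `data` with known v2 property names replaced by v3 names."""
    upgraded: Dict[str, Any] = {}
    for key, value in data.items():
        new_key = V2_TO_V3.get(key, key)
        # If the document already supplies the v3 name, keep that value
        # and drop the older duplicate.
        if new_key != key and new_key in data:
            continue
        upgraded[new_key] = value
    return upgraded


v2_doc = {"name": "My Project", "contIntegration": "https://example.org/ci"}
v3_doc = upgrade_property_names(v2_doc)
```

Because the output always uses the newer names, code downstream of this step only has to handle one version of the schema.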

Installation

$ pip install git+https://github.com/sgfost/codemeticulous.git

Usage

As a command-line tool

$ codemeticulous convert --from codemeta --to cff codemeta.json > CITATION.cff
$ codemeticulous validate --format cff CITATION.cff

As a Python library

from codemeticulous.codemeta import CodeMeta, Person
from codemeticulous import convert

codemeta = CodeMeta(
  name="My Project",
  author=Person(givenName="Dale", familyName="Earnhardt"),
)

# commit kwarg is an override that can be used to insert
# a custom field into the resulting metadata after conversion
cff = convert("codemeta", "cff", codemeta, commit="abcdef123456789")

print(codemeta.json(indent=True))
# {
#   "@context": "https://w3id.org/codemeta/3.0",
#   "@type": "SoftwareSourceCode",
#   "name": "My Project",
#   "author": {"@type": "Person", "givenName": "Dale", "familyName": "Earnhardt"}
# }

print(cff.yaml())
# authors:
# - family-names: Earnhardt
#   given-names: Dale
# cff-version: 1.2.0
# message: If you use this software, please cite it using the metadata from this file.
# title: My Project
# type: software
# commit: abcdef123456789

Development

codemeticulous uses uv for project management. The following assumes that you have installed uv.

Get started by cloning the repository and setting up a virtual environment:

$ git clone https://github.com/sgfost/codemeticulous.git
$ cd codemeticulous
$ uv sync --dev
$ source .venv/bin/activate

Run tests

$ uv run pytest tests

About

practical validation and conversion between software metadata standards with pydantic
