Warning
This project is still in development. The format on disk might change at any time and there might be no way to convert data from one version to another. Also, expect some bugs.
cling-sync is a client-side encrypted, revisional archival storage system.
The main goal is to provide a place where you can put all your data without worrying about losing it. Everything you put in once stays there forever.
- Repository: The place where all data is stored after it has been encrypted.
- Workspace: The local working copy of the repository files.
- Merge: The process of applying all changes from the repository to the workspace and vice versa.
- Revision: A snapshot of the repository at a specific point in time, usually created by a merge operation.
By default, only changes to the file contents are tracked. The file's metadata (ownership, mtime,
and mode) are stored in the repository, but modifications to these metadata are not detected in
consecutive merges unless the --chown, --chmod, or --chtime flags are used.
When merging new files from the repository, mtime and mode are restored from the repository but not ownership. This is because ownership is highly dependent on the user's environment whereas mtime and mode are not.
This way of tracking modifications is best suited for the main use case of cling-sync: making sure user data is never lost without the complications of a "real" backup system.
Currently, the main focus is on supporting macOS and Linux. It should work on Windows, but this is untested at the moment.
The fact that everything but the CLI is written in plain Go (no CGO) and uses only the
standard library with a few select golang.org/x dependencies should make it highly portable.
Build the Command Line Interface (CLI) tool:
Install Go version 1.24.2 or later and run:
./build.sh build cli
Run the CLI tool:
./cling-sync <command>
See ./cling-sync --help for more information.
cling-sync init /path/to/repository
This will create a new repository at /path/to/repository where all encrypted data is stored.
Additionally, a .cling directory is created in the current directory that ties the repository
to this directory.
Examine /path/to/repository/.cling/repository.txt to learn how to back up the encryption keys.
cling-sync attach /path/to/repository /path/to/local/directory
This will create a new workspace at /path/to/local/directory that is connected to the repository
at /path/to/repository.
cling-sync attach --path-prefix my/path/prefix/ /path/to/repository /path/to/local/directory
This will create a new workspace at /path/to/local/directory that is connected to the repository
at /path/to/repository and will only show files that are inside the path prefix my/path/prefix/.
All operations with the exception of cp will be limited to the path prefix.
cling-sync security save-keys
Store the encryption keys of the repository in .cling/workspace/security/keys.enc.
The file is encrypted with a random key that is securely stored in the system's keyring.
Notes on macOS:
You might need to unlock the keychain with
security unlock-keychain ~/Library/Keychains/login.keychain-db
Notes on Linux:
This feature uses secret-tool to store the random key in the system's keyring.
You can unlock it (Gnome) with:
echo '\n' | gnome-keyring-daemon --unlock
cling-sync merge
This will copy all new or modified files from the repository and delete all files that are not in the repository's latest revision. After this, changes from the local workspace are committed to the repository. If there are conflicts, the user is asked to resolve them.
cling-sync reset head
This will reset the workspace to the latest repository revision.
cling-sync status
cling-sync log 'path/to/somewhere/**/*.txt' --status
Show all revisions that contain a path that matches the pattern and show all paths that were added, updated, or deleted.
cling-sync check
Scan all revisions and verify the repository's integrity. cling-sync check --data will also check
every block for data corruption. This will take a while. :-)
cling-sync serve --address 127.0.0.1:4242 /path/to/repository
This will start an HTTP server on port 4242 that serves the repository at /path/to/repository.
cling-sync attach http://127.0.0.1:4242 /path/to/workspace
This will attach the repository at 127.0.0.1:4242 to the workspace at /path/to/workspace.
cling-sync respects .gitignore and .clingignore files. The syntax is the same as for
git.
Note
There is one difference between how Git and cling-sync handle ignore files. If you add
a pattern or path to a .clingignore or .gitignore file and merge it into the repository,
all matching files will be removed from the current revision.
No files will be removed from the workspace.
And as always, older revisions will still contain the files.
Wasm support is a main focus of this project.
Play around with the Wasm example included in this repository. First, serve a repository:
cling-sync serve --cors-allow-all --address 127.0.0.1:4242 /path/to/repository
Then, build the Wasm example:
./build.sh wasm dev
Finally, open the example in your browser:
open http://127.0.0.1:8000/example.html
Using the standard Go compiler (default), the Wasm binary is quite large (about 5MB).
To compile using TinyGo, use the --optimize flag:
./build.sh wasm dev --optimize
This reduces the binary size to about 600KB, which is okay for now.
The repository cryptography relies on these values you can find in .cling/repository.txt:
- An encrypted 32-byte Key Encryption Key (KEK), the root key used to encrypt all Data Encryption Keys (DEKs).
- A 32-byte Block ID HMAC Key that is used to sign the block ID based on the content.
- A 32-byte User Key Salt that is used in the Key Derivation Function (KDF) to derive an encryption key to encrypt/decrypt the KEK.
None of these values is strictly secret: without the passphrase, data cannot be decrypted.
| Purpose | Algorithm | Notes |
|---|---|---|
| Key derivation | Argon2id | 5 iterations, 64MB RAM, 1 thread |
| Encryption (all data) | XChaCha20-Poly1305 (AEAD) | 24B nonce (large enough to be chosen at random); 16B tag |
| Block ID generation | HMAC-SHA256 | Uses per-repo secret HMAC key |
The flow to arrive at the KEK:
- The user provides their passphrase.
- The Argon2id KDF is used to derive a key from the passphrase and the User Key Salt.
- That key is then used to decrypt the encrypted KEK.
- The KEK is then used to decrypt the encrypted Block ID HMAC Key.
A block ID is calculated like this: HMAC(SHA256(blockContent), BlockIDHMACKey) where BlockIDHMACKey
is the Block ID HMAC Key stored in .cling/repository.txt.
This makes blocks content addressable, but you cannot make any assumptions about the content of a block based on its block id.
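As a rough illustration of the formula above, here is a minimal Go sketch using only the standard library. It assumes the notation means "HMAC-SHA256 keyed with the Block ID HMAC Key, computed over the SHA-256 digest of the block content". The function name `blockID` and the all-zero key are hypothetical and only serve the example.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blockID computes HMAC-SHA256, keyed with hmacKey, over the SHA-256 digest
// of content. The name and signature are illustrative, not the project's API.
func blockID(content, hmacKey []byte) [32]byte {
	digest := sha256.Sum256(content)
	mac := hmac.New(sha256.New, hmacKey)
	mac.Write(digest[:]) // hash.Hash.Write never returns an error
	var id [32]byte
	copy(id[:], mac.Sum(nil))
	return id
}

func main() {
	// Placeholder key for illustration; a real key would come from the repository.
	key := make([]byte, 32)
	id := blockID([]byte("some block content"), key)
	fmt.Println(hex.EncodeToString(id[:]))
}
```

Identical content under the same key always yields the same ID, which is what enables deduplication, while anyone without the key cannot compute the ID for a guessed plaintext.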
File contents and all metadata are stored in blocks of up to 8MB in size. Each block is encrypted with a unique, random 32 byte Data Encryption Key (DEK). That DEK is encrypted with the KEK and stored alongside the random nonce used in the block header (see below).
If only a part of a file is modified, only that part (more or less) is stored in the repository. Block boundaries are not fixed, but are calculated using the GearCDC algorithm. Basically, the algorithm keeps a rolling hash of the content to detect a "good boundary" so that a block is at best around 2-4MB in size. Because this is based on the actual content, even changes in the middle of a file are detected and at some point, the algorithm will detect the boundaries of blocks that were not changed. This also means that for files smaller than the average block size, deduplication is not effective.
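To make the idea of content-defined boundaries more concrete, here is a simplified gear-hash chunker in Go. It is a sketch of the general technique, not the GearCDC implementation used by cling-sync: the size limits, the mask, the gear table, and the names `split` and `sampleData` are all assumptions, scaled down so the example runs on a few kilobytes of data.

```go
package main

import "fmt"

// Hypothetical parameters, far smaller than the sizes mentioned in the text.
const (
	minSize = 64                // never cut before this many bytes
	maxSize = 1024              // always cut at this many bytes
	mask    = uint64(0xFF) << 48 // 8 bits set: a cut roughly every 256 bytes
)

// gearTable maps every byte value to a fixed pseudo-random 64-bit number
// (filled with splitmix64 so the example is deterministic).
var gearTable = func() [256]uint64 {
	var t [256]uint64
	x := uint64(0x9E3779B97F4A7C15)
	for i := range t {
		x += 0x9E3779B97F4A7C15
		z := x
		z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9
		z = (z ^ (z >> 27)) * 0x94D049BB133111EB
		t[i] = z ^ (z >> 31)
	}
	return t
}()

// split cuts data into chunks whose boundaries depend on the content.
func split(data []byte) [][]byte {
	var chunks [][]byte
	start := 0
	var h uint64
	for i, b := range data {
		// Gear rolling hash: old bytes are shifted out after 64 steps.
		h = (h << 1) + gearTable[b]
		n := i - start + 1
		if (n >= minSize && h&mask == 0) || n >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
			h = 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

// sampleData returns n deterministic pseudo-random bytes.
func sampleData(n int) []byte {
	data := make([]byte, n)
	s := uint32(1)
	for i := range data {
		s = s*1664525 + 1013904223
		data[i] = byte(s >> 24)
	}
	return data
}

func main() {
	data := sampleData(20000)
	before := split(data)
	// Insert one byte at the front. Fixed-size blocks would all shift;
	// content-defined boundaries are expected to line up again later on.
	after := split(append([]byte{0x42}, data...))
	seen := make(map[string]bool)
	for _, c := range before {
		seen[string(c)] = true
	}
	shared := 0
	for _, c := range after {
		if seen[string(c)] {
			shared++
		}
	}
	fmt.Printf("chunks before=%d after=%d shared=%d\n", len(before), len(after), shared)
}
```

Because the hash only depends on the most recent bytes, a boundary found in unchanged content is found again after an edit earlier in the file, so the chunks that follow can be deduplicated.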
A block may be compressed if it is at least 1KB in size and the first 1KB "looks" compressible, i.e. the entropy of the data is low enough. If compressing the whole block saves less than 5%, the block is stored uncompressed.
The compression algorithm is Deflate with level 6.
All integer types are written as little-endian, and all strings are UTF-8 encoded.
FileMetadata is serialized to:
| Size (bytes) | Type | Field | Description |
|---|---|---|---|
| 2 | uint16 | format version | Serialization format version (0x01) |
| 4 | uint64 | ModeAndPerm | File mode and permission flags (see below) |
| (12) | timespec | MTime | File modification time |
| 8 | int64 | - MTimeSec | File modification time (seconds since epoch) |
| 4 | int32 | - MTimeNsec | File modification time (nanoseconds) |
| 8 | int64 | Size | File size |
| 32 | SHA256 | FileHash | Hash of the file contents |
| | array | BlockIds | Block IDs of the file contents |
| 2 | uint16 | - Length | Number of block IDs (N) |
| 32 * N | BlockId | - BlockIds | Block IDs (N) |
| | string | SymlinkTarget | The symlink target path or empty |
| 2 | uint16 | - Length | Length of target file name (M) |
| M | uint8 | - Bytes | utf-8 encoded string |
| 4 | uint32 | UID | Optional: Owner of the file (2^31 if missing) |
| 4 | uint32 | GID | Optional: Group of the file (2^31 if missing) |
| (12) | timespec | Birthtime | Optional: File creation time |
| 8 | int64 | - BirthtimeSec | File creation time (seconds since epoch) or -1 |
| 4 | int32 | - BirthtimeNsec | File creation time (nanoseconds) or -1 |
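
The variable-length and composite fields in the table above follow a simple pattern: a little-endian `uint16` length prefix followed by the payload, and a timespec as `int64` seconds plus `int32` nanoseconds. The Go sketch below illustrates only that pattern. The helper names are hypothetical, and it deliberately does not attempt to serialize a complete FileMetadata record.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// BlockID is a 32-byte block identifier.
type BlockID [32]byte

// appendBlockIDs writes a uint16 count followed by the raw 32-byte IDs.
// The sketch does not guard against more than 65535 IDs.
func appendBlockIDs(buf []byte, ids []BlockID) []byte {
	buf = binary.LittleEndian.AppendUint16(buf, uint16(len(ids)))
	for _, id := range ids {
		buf = append(buf, id[:]...)
	}
	return buf
}

// appendString writes a uint16 byte length followed by the UTF-8 bytes.
// The sketch does not guard against strings longer than 65535 bytes.
func appendString(buf []byte, s string) []byte {
	buf = binary.LittleEndian.AppendUint16(buf, uint16(len(s)))
	return append(buf, s...)
}

// appendTimespec writes seconds (int64) and nanoseconds (int32), 12 bytes in total.
func appendTimespec(buf []byte, sec int64, nsec int32) []byte {
	buf = binary.LittleEndian.AppendUint64(buf, uint64(sec))
	return binary.LittleEndian.AppendUint32(buf, uint32(nsec))
}

func main() {
	var buf []byte
	buf = appendBlockIDs(buf, []BlockID{{0xAA}, {0xBB}})
	buf = appendString(buf, "target/path")
	buf = appendTimespec(buf, -1, -1) // -1/-1 marks a missing Birthtime
	fmt.Printf("%d bytes: % x\n", len(buf), buf)
}
```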
Block is serialized to:
| Size (bytes) | Type | Field | Description |
|---|---|---|---|
| (96) | | Header | Header of the block |
| 2 | uint16 | format version | Serialization format version (0x01) |
| 8 | uint64 | Flags | Flags for the block (see below) |
| 72 | EncKey | EncryptedDEK | Block's encryption key (encrypted with KEK) |
| 4 | uint32 | DataSize | Size of the following data (N) |
| 10 | | padding | Header padding to 96 bytes |
| N | uint8 | Data | Encrypted data of the block |
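
For illustration, the 96-byte header from the table above can be packed and unpacked with `encoding/binary` as sketched below. The struct and function names are hypothetical, the field order is assumed to follow the table rows, and zero-filled padding is an assumption as well.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	headerSize = 96 // 2 + 8 + 72 + 4 + 10 bytes of padding
	encKeySize = 72
)

// Header mirrors the header fields of the table; names are illustrative.
type Header struct {
	Version      uint16
	Flags        uint64
	EncryptedDEK [encKeySize]byte
	DataSize     uint32
}

// Marshal writes the header as 96 little-endian bytes.
func (h Header) Marshal() []byte {
	buf := make([]byte, headerSize) // the last 10 bytes stay zero as padding
	binary.LittleEndian.PutUint16(buf[0:2], h.Version)
	binary.LittleEndian.PutUint64(buf[2:10], h.Flags)
	copy(buf[10:82], h.EncryptedDEK[:])
	binary.LittleEndian.PutUint32(buf[82:86], h.DataSize)
	return buf
}

// UnmarshalHeader parses the first 96 bytes of buf.
func UnmarshalHeader(buf []byte) (Header, error) {
	if len(buf) < headerSize {
		return Header{}, errors.New("buffer shorter than block header")
	}
	var h Header
	h.Version = binary.LittleEndian.Uint16(buf[0:2])
	h.Flags = binary.LittleEndian.Uint64(buf[2:10])
	copy(h.EncryptedDEK[:], buf[10:82])
	h.DataSize = binary.LittleEndian.Uint32(buf[82:86])
	return h, nil
}

func main() {
	h := Header{Version: 1, Flags: 0, DataSize: 4096}
	raw := h.Marshal()
	parsed, err := UnmarshalHeader(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(raw), parsed.Version, parsed.DataSize)
}
```

A fixed-size header means a reader can fetch the first 96 bytes of a block, learn `DataSize`, and then read exactly that many bytes of encrypted data.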
This repository is self-contained and does not depend on any external tools or libraries.