Technical design

This page is a security-focused technical overview. It covers the procedures and design decisions behind Safeparts; it is not a tour of the codebase.

If you are new, start with Getting started and Security.

What Safeparts is (and is not)

Safeparts is a k-of-n recovery tool built on Shamir-style secret sharing.

With fewer than k shares, the secret remains unrecoverable.
With any k shares from the same set, recovery succeeds.

Safeparts is not a storage system. Your storage and distribution plan is the real security boundary.

End-to-end data flow

Safeparts deliberately layers checks so common mistakes fail early (wrong shares, typos, mixed sets). On top of that, you can add a second factor (a passphrase).

Split flow
---------
secret bytes
  |
  | (optional) passphrase protection
  |   - Argon2id(passphrase, salt, parameters) -> 32-byte key
  |   - ChaCha20-Poly1305(key, nonce) -> ciphertext
  |
  | data_to_split = plaintext OR ciphertext
  |
  | integrity tag = BLAKE3(data_to_split)   (32 bytes)
  | tagged = data_to_split || integrity tag
  |
  | Shamir split(tagged, k, n, set_id) -> n shares
  |
  | wrap each share into a self-describing SharePacket
  |
  | encode for transport/storage (base64url | base58check | mnemonic)

Combine flow
-----------
decode shares -> SharePackets
  |
  | validate metadata consistency (same set_id, k, n, crypto params; distinct x)
  |
  | Shamir combine -> tagged bytes
  |
  | split tagged bytes into (data_to_split, integrity tag)
  | verify BLAKE3(data_to_split) == integrity tag
  |
  | if encrypted:
  |   Argon2id(passphrase, salt, parameters) -> key
  |   ChaCha20-Poly1305 decrypt(key, nonce) -> secret
  |
  +-> output secret bytes
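
The metadata validation step in the combine flow can be sketched as follows. This is a minimal illustration, not the Safeparts implementation: ShareMeta and validate_share_set are hypothetical names, and the exact set of checks Safeparts performs may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShareMeta:
    """Hypothetical subset of the metadata carried by one share packet."""
    set_id: bytes
    k: int
    n: int
    x: int
    encrypted: bool


def validate_share_set(shares: list[ShareMeta]) -> None:
    """Raise ValueError if the shares cannot belong to one usable set."""
    if not shares:
        raise ValueError("no shares provided")
    first = shares[0]
    expected = (first.set_id, first.k, first.n, first.encrypted)
    for s in shares:
        # Every share must agree on the set-wide parameters.
        if (s.set_id, s.k, s.n, s.encrypted) != expected:
            raise ValueError("shares come from different sets or disagree on parameters")
        # Shares are evaluated at x = 1..n; x = 0 would be the secret itself.
        if not 1 <= s.x <= s.n:
            raise ValueError(f"share index x={s.x} is outside 1..{s.n}")
    # Two shares with the same x carry no extra information for interpolation.
    if len({s.x for s in shares}) != len(shares):
        raise ValueError("duplicate share index x")
    if len(shares) < first.k:
        raise ValueError(f"need at least {first.k} shares, got {len(shares)}")
```

Failing here, before any interpolation, is what lets mixed or incomplete sets produce a clear error rather than garbage output.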

Data structure of a split

Safeparts treats the input as bytes and splits it byte-by-byte. The value that actually gets shared is:

data_to_split (plaintext or ciphertext)
followed by a 32-byte BLAKE3 integrity tag

Let the tagged payload be L bytes. For each byte position i:

Interpret the byte as the constant term of a polynomial of degree at most k-1 over GF(256).
Sample the remaining k-1 coefficients at random, with a fresh polynomial for each byte position.
Evaluate the polynomial at x = 1..n to produce n output bytes.

Each share payload is the concatenation of its y values across all L positions. So every share is the same length as the tagged payload. The share packet stores its own x value, along with k, n, and set_id, so reconstruction can pick any k shares and interpolate each byte independently.
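
The byte-wise scheme described above can be sketched in a few lines. This is an illustration under stated assumptions, not Safeparts code: the function names are hypothetical, and the reduction polynomial 0x11B (the one used by AES) is an assumption, since this page does not say which representation of GF(256) Safeparts uses.

```python
import secrets


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(256) elements (assumed reduction polynomial 0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254, since a^255 == 1 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(256)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def split(payload: bytes, k: int, n: int) -> dict[int, bytes]:
    """Return {x: share_bytes} for x = 1..n; each share is len(payload) bytes."""
    if not 1 <= k <= n <= 255:
        raise ValueError("require 1 <= k <= n <= 255")
    shares = {x: bytearray() for x in range(1, n + 1)}
    for secret_byte in payload:
        # Fresh polynomial per byte position; the constant term is the secret byte.
        coeffs = [secret_byte] + [secrets.randbelow(256) for _ in range(k - 1)]
        for x in shares:
            y = 0
            for c in reversed(coeffs):  # Horner's rule
                y = gf_mul(y, x) ^ c
            shares[x].append(y)
    return {x: bytes(ys) for x, ys in shares.items()}


def combine(shares: dict[int, bytes]) -> bytes:
    """Lagrange-interpolate every byte position at x = 0."""
    xs = list(shares)
    # Basis values at 0. Subtraction in GF(256) is XOR, so (0 - xm) == xm.
    weights = []
    for xj in xs:
        w = 1
        for xm in xs:
            if xm != xj:
                w = gf_mul(w, gf_mul(xm, gf_inv(xm ^ xj)))
        weights.append(w)
    out = bytearray(len(shares[xs[0]]))
    for i in range(len(out)):
        for xj, w in zip(xs, weights):
            out[i] ^= gf_mul(shares[xj][i], w)
    return bytes(out)
```

Because the interpolation weights depend only on the x values, they are computed once and reused for all L byte positions.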

Safeparts uses Shamir-style secret sharing over the finite field GF(256), applied byte-wise.

Practical implications:

Works for any binary secret: the secret is just bytes (files, keys, seed phrases, etc.).
Share count limit: GF(256) has 256 elements, and x = 0 is reserved because the polynomial's value at 0 is the secret byte itself. That leaves x = 1..255, so 1 <= n <= 255.
No structure leakage: fewer than k shares provide no information about the secret bytes other than their length.

Why GF(256) instead of a big prime field

Many Shamir implementations treat the secret as a big integer mod a prime. Safeparts uses GF(256) so the scheme applies directly to bytes without extra packing rules.

The trade-off is the practical limit of 255 shares per split. That is usually far beyond what a real recovery plan needs.

Safeparts shares are not raw (x, y) points. Each share is wrapped into a versioned, self-describing packet so:

Shares from different splits are harder to mix by accident.
The decoding layer can reject truncated/garbled data.
Optional encryption parameters travel with the share set.

Conceptual binary layout (simplified):

SharePacket
----------
magic      : "SMN1"
version    : u8
flags      : u8   (e.g. encrypted)
k, n, x    : u8, u8, u8
set_id     : 16 bytes (random identifier for the share set)

if encrypted:
  salt         : 16 bytes
  nonce        : 12 bytes
  argon mem    : u32 (KiB)
  argon time   : u32
  argon par    : u32

payload_len: u32
payload    : bytes   (the share data)

Security note: metadata (like k, n, and payload length) is not secret. The secrecy comes from threshold sharing and (optionally) encryption.
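
To make the layout concrete, here is a minimal pack/parse sketch of the simplified structure above. Several details are assumptions that this page does not specify: big-endian integers, the fields appearing in exactly the listed order, and the encrypted flag living in bit 0. The function names are hypothetical.

```python
import struct

MAGIC = b"SMN1"
FLAG_ENCRYPTED = 0x01  # assumption: the actual bit position is not documented here


def pack_packet(version, k, n, x, set_id, payload, enc=None):
    """Serialize the simplified layout.

    `enc` is None or a tuple (salt, nonce, mem_kib, time_cost, parallelism).
    """
    if len(set_id) != 16:
        raise ValueError("set_id must be 16 bytes")
    flags = FLAG_ENCRYPTED if enc else 0
    out = MAGIC + struct.pack(">BBBBB", version, flags, k, n, x) + set_id
    if enc:
        salt, nonce, mem_kib, time_cost, parallelism = enc
        if len(salt) != 16 or len(nonce) != 12:
            raise ValueError("salt must be 16 bytes and nonce 12 bytes")
        out += salt + nonce + struct.pack(">III", mem_kib, time_cost, parallelism)
    return out + struct.pack(">I", len(payload)) + payload


def parse_packet(data: bytes) -> dict:
    """Parse bytes produced by pack_packet; raise ValueError on malformed input."""
    pos = 0

    def take(count: int) -> bytes:
        nonlocal pos
        if pos + count > len(data):
            raise ValueError("truncated packet")
        chunk = data[pos:pos + count]
        pos += count
        return chunk

    if take(4) != MAGIC:
        raise ValueError("bad magic")
    version, flags, k, n, x = struct.unpack(">BBBBB", take(5))
    fields = {"version": version, "k": k, "n": n, "x": x, "set_id": take(16)}
    fields["encrypted"] = bool(flags & FLAG_ENCRYPTED)
    if fields["encrypted"]:
        fields["salt"] = take(16)
        fields["nonce"] = take(12)
        mem_kib, time_cost, parallelism = struct.unpack(">III", take(12))
        fields.update(argon_mem_kib=mem_kib, argon_time=time_cost, argon_par=parallelism)
    (payload_len,) = struct.unpack(">I", take(4))
    fields["payload"] = take(payload_len)
    if pos != len(data):
        raise ValueError("trailing bytes after payload")
    return fields
```

The explicit payload_len and the trailing-bytes check are what allow a decoder to reject truncated or padded input instead of interpolating over the wrong bytes.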

Integrity: why a BLAKE3 tag exists

Shamir sharing gives confidentiality as long as fewer than k shares are exposed, but it does not detect user mistakes. Given a wrong or corrupted share, interpolation still returns bytes; they are just not the original payload.

Safeparts appends a 32-byte integrity tag:

On split: compute BLAKE3(data_to_split) and append it.
On combine: recompute and compare; if it does not match, recovery fails.
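
The two steps above can be sketched as follows. Note the substitution: Python's standard library has no BLAKE3, so this sketch uses BLAKE2s (also a 32-byte digest) purely as a stand-in to show the append-and-verify shape. The function names are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 32


def _digest(data: bytes) -> bytes:
    # Stand-in hash: BLAKE2s from the standard library, NOT BLAKE3.
    return hashlib.blake2s(data).digest()


def append_tag(data_to_split: bytes) -> bytes:
    """Split side: tagged = data_to_split || integrity tag."""
    return data_to_split + _digest(data_to_split)


def verify_and_strip_tag(tagged: bytes) -> bytes:
    """Combine side: recompute the tag and fail if it does not match."""
    if len(tagged) < TAG_LEN:
        raise ValueError("reconstructed payload is shorter than the tag")
    data, tag = tagged[:-TAG_LEN], tagged[-TAG_LEN:]
    if not hmac.compare_digest(_digest(data), tag):
        raise ValueError("integrity check failed: wrong, mixed, or corrupted shares")
    return data
```

Because the tag covers data_to_split rather than the original secret, the same check works whether the payload is plaintext or ciphertext.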

What this integrity tag is for:

Detecting corrupted shares (bad copy/paste, damaged storage).
Detecting mixed sets (shares from different splits).
Validating a reconstructed ciphertext without knowing the passphrase.

What it is not for:

It is not a keyed MAC and does not provide authenticity against an attacker who already has k shares. (If an attacker has k shares, confidentiality is already lost.)

Why BLAKE3 (instead of SHA-256, etc.)

BLAKE3 is a modern, fast general-purpose hash:

Fast on a wide range of CPUs, including low-power devices.
Parallel-friendly and incremental.
Strong, conservative design lineage (BLAKE2-based).

SHA-256 would also work as a “checksum after reconstruction”. BLAKE3 is mostly a performance and ergonomics choice here.

Optional passphrase protection

Safeparts can encrypt the secret before secret sharing. That makes recovery a two-factor requirement:

something you have: at least k shares
something you know: the passphrase

Why encrypt-then-split

Encrypting once and then splitting has two useful properties:

You can verify share correctness (via the BLAKE3 tag) without the passphrase.
You only decrypt after you are confident the share set is consistent.

Why Argon2id for key derivation

If an attacker obtains k shares, they can reconstruct the encrypted payload and run an offline passphrase-guessing attack. The KDF exists to make that attack expensive.
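
A rough way to reason about that cost is simple arithmetic over passphrase entropy and per-guess time. The sketch below is a back-of-the-envelope model, and every number passed to it is a hypothetical input, not a measurement of Safeparts or of any particular hardware.

```python
def expected_guess_time_years(
    entropy_bits: float,
    seconds_per_guess: float,
    parallel_guessers: int = 1,
) -> float:
    """Expected time to hit a passphrase drawn uniformly from 2**entropy_bits options."""
    expected_guesses = 2 ** (entropy_bits - 1)  # on average, half the space is searched
    seconds = expected_guesses * seconds_per_guess / parallel_guessers
    return seconds / (365.25 * 24 * 3600)


# Hypothetical attacker: 0.1 s per KDF evaluation, 1,000 guesses in parallel.
weak = expected_guess_time_years(40, 0.1, 1000)
strong = expected_guess_time_years(80, 0.1, 1000)
```

Under those assumed inputs, a 40-bit passphrase lasts about 1.7 years on average, and every extra bit of entropy doubles the figure. The KDF only scales the per-guess cost; the passphrase itself supplies the exponent.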

Argon2id is widely recommended because it is:

Memory-hard (guessing on GPUs/ASICs is more expensive than with PBKDF2).
More resistant to side-channel attacks than Argon2d, because part of its memory access pattern does not depend on the passphrase.
Tunable (time, memory, parallelism) so you can raise the cost over time.

Safeparts stores the Argon2 parameters in the share packet so future versions can change defaults without breaking old shares.

Current defaults (subject to change) are roughly:

memory: 64 MiB
time cost: 3
parallelism: 1

Why ChaCha20-Poly1305 for encryption

ChaCha20-Poly1305 is an AEAD construction (authenticated encryption with associated data). It provides:

Confidentiality: the secret remains hidden without the key.
Integrity/authenticity of ciphertext: wrong passphrases or tampering cause decryption failure.

It is commonly chosen for cross-platform tools because it is fast in software and straightforward to implement in constant time, including on systems without AES hardware acceleration.

AES-GCM is also a strong, standard AEAD. ChaCha20-Poly1305 is selected here mainly for consistent performance and a “works well everywhere” feel.

Encoding layers (human and machine safety)

Encodings are about operational reliability: can you store and later re-enter the share correctly?

base64url: compact, machine-friendly.
base58check: avoids ambiguous characters and adds a checksum.
mnemo-words: word-based with CRC16 to catch many transcription errors.
mnemo-bip39: BIP-39-valid word sequences with framing for multi-part packets.

See Encodings for selection guidance.
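
As a small illustration of why the encoding layer matters, the sketch below round-trips bytes through base64url and shows a 16-bit CRC catching a single flipped bit. It is not the Safeparts encoder: the helper names are hypothetical, keeping the "=" padding is an assumption, and the CRC variant (CRC-CCITT via binascii.crc_hqx) is an assumption, since this page does not say which CRC16 the mnemo-words encoding uses.

```python
import base64
import binascii


def encode_base64url(packet: bytes) -> str:
    # URL-safe alphabet: "-" and "_" replace "+" and "/".
    return base64.urlsafe_b64encode(packet).decode("ascii")


def decode_base64url(text: str) -> bytes:
    return base64.urlsafe_b64decode(text.encode("ascii"))


def with_crc16(packet: bytes) -> bytes:
    """Append a 2-byte CRC (assumed variant: CRC-CCITT, polynomial 0x1021)."""
    return packet + binascii.crc_hqx(packet, 0).to_bytes(2, "big")


def check_crc16(framed: bytes) -> bytes:
    """Return the body if the trailing CRC matches, else raise ValueError."""
    body, crc = framed[:-2], framed[-2:]
    if binascii.crc_hqx(body, 0).to_bytes(2, "big") != crc:
        raise ValueError("checksum mismatch: likely a transcription error")
    return body
```

A checksum at this layer rejects a mistyped share before it ever reaches interpolation, which gives the user a much more specific error than a failed integrity tag.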

Recommended security procedures

Split procedure (creating shares)

Choose k and n based on people and locations you can realistically coordinate under stress.
Use a clean environment:
- Prefer offline.
- Disable clipboard history and screen recording if possible.
- Avoid shells or tools that log input.
If using a passphrase:
- Use a high-entropy passphrase.
- Avoid passing it on the command line; prefer a file input where supported.
Generate shares.
Distribute shares across independent failure domains (people/devices/locations).
Maintain a runbook that lists who holds which share and how to contact them.
Do a practice recovery (ideally with a synthetic secret first, then with the real plan).

Recovery procedure (combining shares)

Collect at least k shares.
Decode them in a controlled environment.
Combine and check integrity (Safeparts does this automatically).
If encrypted, decrypt using the passphrase.
Treat the recovered secret as sensitive output:
- Avoid saving to disk unless necessary.
- Rotate/replace the secret if recovery was performed in an untrusted environment.

Compromise and rotation

If you suspect any share was copied, photographed, or exfiltrated, assume that share is compromised.
If enough shares may be compromised to reach k, assume the secret is compromised.
Best practice is to reconstruct the secret, rotate it if possible, and split the new secret into a fresh set. Re-splitting an unrotated secret does not invalidate the old shares.

Limitations / non-goals

No verifiable secret sharing (VSS): a malicious share-holder can still provide a wrong share and cause recovery to fail.
No share refresh/rotation without reconstructing the secret.
No protection against collusion: any k share-holders acting together can reconstruct the shared payload.
Passphrase security depends on passphrase strength; Argon2id raises the cost but cannot prevent weak passphrases.

For the human side of planning, see Use cases and Security.