Runbook: CA key ceremony (m-of-n)
Generating or rotating a Certificate Authority key is the most consequential operation in trstctl: whoever controls a CA key can mint trust. trstctl gates CA-key operations behind an m-of-n key ceremony — the key is created only after a configured number of distinct custodians approve — so no single operator can unilaterally stand up or rotate a CA.
Maturity note. The m-of-n ceremony is implemented and tested as library code (internal/ca/hierarchy); it is driven today through the Go API, not yet a served REST/UI flow. The assembled issuing CA's key is now persisted, sealed at rest (R3.2): the signer reloads it after a restart, so the CA is not silently rotated (see Configuration → Signer and disaster recovery). HSM/KMS-backed custody (vs. the local sealed key file) and a served, m-of-n break-glass flow remain future work (see Current limitations and the incident-response runbook). This runbook documents the real mechanism and the operating procedure around it.
The model
A ceremony has a purpose (what key it authorizes — a root, an intermediate, or
a rotation) and a threshold m: the number of distinct custodian approvals
required. Custodians approve independently, and any CA-key operation is refused
with ErrQuorumNotMet until its ceremony has reached that threshold.
The mechanism, in the hierarchy manager:
- StartCeremony(tenant, purpose, threshold) opens an m-of-n ceremony and returns its id.
- Approve(tenant, ceremonyID, custodian) records one custodian's approval and returns the running approval count. Approvals are de-duplicated per custodian.
- CreateRoot/CreateIntermediate/Rotate/CrossSign are gated on quorum: each takes a ceremonyID and calls the internal requireQuorum check first, returning ErrQuorumNotMet (k of m approvals) until m distinct custodians have approved. On success the ceremony is marked complete and cannot be reused. Cross-signing is gated because it, too, extends trust (it mints a CA certificate under your signing CA).
The ceremony and its approvals are tenant-scoped rows under row-level security
(AN-1): ca_key_ceremonies (with the threshold) and ca_ceremony_approvals.
Procedure: standing up a new CA
- Convene the custodians. Choose n trusted custodians and a threshold m (e.g. 3-of-5). More than half is the usual floor; pick m so that losing a custodian does not block operations but a single compromised custodian cannot reach quorum alone.
- Open the ceremony for the purpose (root or intermediate) with the chosen threshold m.
- Collect approvals. Each custodian independently reviews the request (purpose, key parameters, the parent CA for an intermediate) and approves. Record who approved and when — the approvals are auditable.
- Create the CA. Once m distinct custodians have approved, run the create operation (root or intermediate). Before quorum the operation fails closed with ErrQuorumNotMet.
- Distribute trust. Publish the new CA certificate to relying parties; for an intermediate, verify the chain to its parent.
- Record the ceremony in your change-management system alongside the audit trail.
Procedure: rotating a CA
Rotation is the same ceremony, with purpose = rotation:
- Open a rotation ceremony with threshold m and collect m approvals.
- Run Rotate for the CA; it is refused until quorum.
- If your hierarchy requires cross-signing the new CA, open a separate cross-sign ceremony and collect its m approvals; CrossSign is refused until quorum, exactly like the create/rotate operations. Then distribute the new CA and renew issuance under it.
- Retire the old key per your policy (and per the incident-response runbook if the rotation is compromise-driven).
Custodian hygiene
- Custodians should be distinct people with independent credentials; do not let one operator hold multiple custodian identities.
- Choose m and n so the loss of one custodian is recoverable but a single compromise cannot mint trust.
- Treat every approval as a logged, attributable action (it is recorded against the ceremony).
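The guidance on choosing m and n can be turned into a simple pre-flight check. The function below is a hypothetical helper, not part of trstctl; it encodes three rules taken from this runbook: a single custodian must not reach quorum alone, losing one custodian must not block operations, and m should be more than half of n.

```go
package main

import (
	"errors"
	"fmt"
)

// checkThreshold is a hypothetical pre-flight check for an m-of-n choice.
// It is a sketch of the rules of thumb in this runbook, not a trstctl API.
func checkThreshold(m, n int) error {
	switch {
	case m < 2:
		return errors.New("m must be at least 2: one compromised custodian must not reach quorum alone")
	case m > n-1:
		return errors.New("m must be at most n-1: losing one custodian must not block operations")
	case 2*m <= n:
		return errors.New("m should be more than half of n")
	}
	return nil
}

func main() {
	fmt.Println(checkThreshold(3, 5))        // <nil>
	fmt.Println(checkThreshold(1, 3) != nil) // true
	fmt.Println(checkThreshold(5, 5) != nil) // true
	fmt.Println(checkThreshold(2, 5) != nil) // true
}
```

Under these rules 2-of-3 and 3-of-5 pass, while 1-of-n, n-of-n, and any m at or below half of n are rejected.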
See Current limitations for what is served by the binary today versus driven through the Go API, and Disaster recovery for CA-key loss handling.