State management
Durable cursor storage, restart safety, backfills.
State persistence keeps a small per-pipeline JSON file on disk. The input reads it at startup and writes it after every successful batch — see State persistence for the YAML config. This page is about operating it.
Storage location
| Environment | Recommended storagePath |
|---|---|
| Local dev | ./.cannectors-state (add to .gitignore) |
| Single host, systemd | /var/lib/cannectors/state (owned by the cannectors user) |
| Kubernetes | A PersistentVolumeClaim mounted at e.g. /state |
| Fly.io / Railway | An attached volume mounted at /state |
| Container with ephemeral disk | Don't. State must survive container replacement. |
If the storage disappears between runs, the pipeline restarts from scratch — possibly re-processing already-shipped records.
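One way to catch a missing or read-only volume before the pipeline starts is a small pre-flight check. The sketch below is illustrative only: `check_state_dir` is a hypothetical helper, not something cannectors ships, and the directory passed to it stands in for whatever storagePath you configured.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper; not part of cannectors.
# Returns 0 when the directory exists and a file can be created in it.
check_state_dir() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "state dir missing: $dir" >&2
    return 1
  fi
  # Prove writability with a real write rather than trusting permission bits.
  local probe
  probe="$(mktemp "$dir/.write-probe.XXXXXX" 2>/dev/null)" || {
    echo "state dir not writable: $dir" >&2
    return 1
  }
  rm -f "$probe"
}
```

A wrapper script could call `check_state_dir /var/lib/cannectors/state || exit 1` before launching the pipeline. Note that this only shows the directory is usable now; it cannot tell whether the volume was replaced since the last run.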
What the file looks like
One file per pipeline, named after pipeline.name:
/var/lib/cannectors/state/sync-orders.json

{
  "id": 4821,
  "timestamp": "2026-04-21T12:34:56Z"
}

Only the cursor types your statePersistence config enables show up.
Don't hand-edit while the pipeline is running.
Backups
The state file is small (< 1 KB). Snapshot it with whatever you use for the rest of the volume:
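# Linux, daily; create the dated directory first, since cp will not
sudo mkdir -p /backup/cannectors-state/$(date +%F)
sudo cp /var/lib/cannectors/state/*.json /backup/cannectors-state/$(date +%F)/

For Kubernetes, a VolumeSnapshot on the PVC works the same way.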
To restore a state file, stop the pipeline, copy the file back into place, and start the pipeline again; the next run reads it.
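The restore step can be sketched as a small function. This is a minimal example under assumptions: `restore_state` is a hypothetical helper, and it copies through a temporary file and a rename so that nothing ever reads a half-written state file. Stopping and starting the pipeline is left to the caller.

```shell
#!/usr/bin/env bash
# Hypothetical restore helper; stop the pipeline before calling it.
# Copies a backed-up state file into place via a temp file and a rename.
restore_state() {
  local backup="$1" target="$2"
  if [ ! -s "$backup" ]; then
    echo "backup missing or empty: $backup" >&2
    return 1
  fi
  local tmp
  # Create the temp file next to the target so the rename stays on one filesystem.
  tmp="$(mktemp "$(dirname "$target")/.restore.XXXXXX")" || return 1
  # -p keeps mode and timestamps; ownership is preserved only when run as root.
  if cp -p "$backup" "$tmp" && mv -f "$tmp" "$target"; then
    return 0
  fi
  rm -f "$tmp"
  return 1
}
```

On the systemd host used in the examples on this page, you would likely run it between `systemctl stop` and `systemctl start`, and re-apply ownership with `chown` if the backup was taken by a different user.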
Backfills
To re-process everything from scratch, delete the state file:
sudo systemctl stop cannectors-orders
sudo rm /var/lib/cannectors/state/sync-orders.json
sudo systemctl start cannectors-orders

To re-process from a specific point, seed the state file:
sudo systemctl stop cannectors-orders
sudo tee /var/lib/cannectors/state/sync-orders.json <<'EOF'
{ "timestamp": "2026-01-01T00:00:00Z" }
EOF
sudo chown cannectors:cannectors /var/lib/cannectors/state/sync-orders.json
sudo systemctl start cannectors-orders

Monitoring
Two things are worth alerting on:
- State file age: if the file hasn't been touched in N cron ticks, the pipeline isn't making progress. stat -c %Y state-file.json gives the mtime in seconds since the epoch.
- State file growth: if the cursor value isn't advancing, the source might be returning empty pages even though new data exists. A diff between two snapshots tells you.
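Both checks above can be sketched as shell functions that a cron job or alerting hook might call. The function names, paths, and thresholds here are hypothetical, and the code assumes GNU `stat`. The snapshot comparison is byte-for-byte, so it assumes the runtime writes the file the same way whenever the cursor is unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical monitoring helpers; names and thresholds are placeholders.

# Print the seconds elapsed since the state file was last modified (GNU stat).
state_age_seconds() {
  local file="$1" mtime now
  mtime="$(stat -c %Y "$file")" || return 1
  now="$(date +%s)"
  echo $(( now - mtime ))
}

# Exit 0 when the file is older than max_age seconds, 1 when it is fresh,
# and 2 when the file cannot be read (treat that as an alert too).
state_is_stale() {
  local file="$1" max_age="$2" age
  age="$(state_age_seconds "$file" 2>/dev/null)" || return 2
  [ "$age" -gt "$max_age" ]
}

# Exit 0 when two snapshots are byte-identical, i.e. the cursor did not move.
cursor_is_stuck() {
  cmp -s "$1" "$2"
}
```

For a pipeline on an hourly schedule, something like `state_is_stale /var/lib/cannectors/state/sync-orders.json 10800` would flag three missed ticks. `cursor_is_stuck` is meant to compare today's backup snapshot with yesterday's.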
Idempotence at the destination
State persistence is best-effort restart-safety, not exactly-once
delivery. After a crash mid-batch, the runtime re-fetches and
re-sends some records that the destination already saw. Design your
destination to handle that — INSERT … ON CONFLICT DO UPDATE,
idempotency keys, or a dedup layer.
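
To make the replay case concrete, here is a minimal sketch of a dedup layer in shell. Everything in it is an assumption for illustration: `idempotency_key` and `deliver_once` are hypothetical names, the key is a SHA-256 of the record body (a real destination would more likely key on the record's primary key), and the "seen" store is a plain text file rather than a database.

```shell
#!/usr/bin/env bash
# Hypothetical dedup layer: remembers delivered keys in a plain text file.

# Derive a stable key from the record body; identical bodies give identical keys.
idempotency_key() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Print "sent" the first time a record is seen and "skipped" on replays.
deliver_once() {
  local record="$1" seen_file="$2" key
  key="$(idempotency_key "$record")"
  # -x matches whole lines, -F treats the key as a fixed string.
  if grep -qxF "$key" "$seen_file" 2>/dev/null; then
    echo "skipped"
    return 0
  fi
  # A real destination write would go here, before the key is recorded.
  echo "$key" >> "$seen_file"
  echo "sent"
}
```

Sending the same record twice, as the runtime may do after a crash mid-batch, yields "sent" and then "skipped". The same idea carries over to a database destination, where a unique constraint on the key does the job of the seen file.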