Observability
Netsy exposes Prometheus-compatible metrics and structured logs for monitoring cluster health, debugging failures, and alerting on degradation.
The observability design is intended to answer:
- what the Node's current role and state are
- whether the Primary is using `sync` or `quorum` writes
- whether Replicas are healthy enough for quorum
- whether startup, catchup, draining, or election has stalled
- whether object storage, replication, or compaction is the bottleneck
Metrics
Naming and instrumentation rules:
- All metrics use the `netsy_` prefix.
- Use base units in metric names, e.g. durations use `_seconds`, sizes use `_bytes`.
- Use counters for cumulative events, gauges for current state, histograms for latency/size distributions.
- Keep labels bounded and low-cardinality. Avoid labels containing keys, revisions, object names, addresses, or error strings.
- Role-specific metrics use `netsy_primary_`, `netsy_elector_`, or `netsy_replica_` prefixes and are exposed via a custom `prometheus.Collector` that only emits them while the Node is currently in that role.
- When a Node leaves a role, role-specific metrics disappear from scrape output. They are not set to `0`.
- Do not rely on gauges for short-lived workflows that may complete between scrapes. Prefer counters, histograms, and structured logs for loading stages, preflight work, elections, and flushes.
- Standard gRPC server interceptor metrics (e.g. `grpc_server_handled_total`, `grpc_server_handling_seconds_bucket`) should be registered via go-grpc-middleware or equivalent for per-RPC observability across both Client and Peer gRPC servers. These complement the Netsy-specific metrics and are not documented individually here.
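The role-gating rule above can be sketched as follows. This is a minimal illustration of the idea behind the custom collector, not Netsy's actual implementation: the `roleCollector` type and `collect` method are hypothetical, and a real implementation would satisfy `prometheus.Collector` from client_golang instead of returning a map.

```go
package main

import "fmt"

// roleCollector is an illustrative sketch of a role-scoped collector:
// its metrics are emitted only while the Node currently holds the role.
type roleCollector struct {
	role   string             // role this collector belongs to, e.g. "primary"
	values map[string]float64 // metric name -> current value
}

// collect emits metrics only when the Node's current role matches.
// Otherwise it emits nothing at all: the metrics are absent from
// scrape output, not set to 0.
func (c *roleCollector) collect(currentRole string) map[string]float64 {
	if currentRole != c.role {
		return nil
	}
	out := make(map[string]float64, len(c.values))
	for name, v := range c.values {
		out[name] = v
	}
	return out
}

func main() {
	c := &roleCollector{
		role:   "primary",
		values: map[string]float64{"netsy_primary_healthy_replicas": 2},
	}
	fmt.Println(len(c.collect("primary"))) // prints 1: emitted while Primary
	fmt.Println(len(c.collect("replica"))) // prints 0: absent after stepping down
}
```

Emitting nothing (rather than `0`) is what makes Prometheus staleness semantics work for alerts scoped to the current Primary or Elector.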
Node State
Each state metric is a gauge vector with a state label. Exactly one label value is 1 at any time and all others are 0.
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_state_health | Gauge | state | Current Health State. Values: loading, healthy, degraded. |
netsy_state_elector | Gauge | state | Current Elector State. Values: follower, leader. |
netsy_state_primary | Gauge | state | Current Primary State. Values: replica, starting, active, draining. |
netsy_process_start_time_seconds | Gauge | | Unix timestamp when this Netsy process started. |
netsy_info | Gauge | version, cluster_id, node_id, quorum_config | Build and configuration info. Always 1. Useful for join-enriching dashboards and filtering by cluster or quorum mode. |
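The "exactly one label value is 1" convention can be sketched as below. `setState` is a hypothetical helper, not Netsy's actual API; with client_golang the same effect would be achieved on a `GaugeVec` by zeroing every known state and setting the active one.

```go
package main

import "fmt"

// setState keeps a state gauge vector consistent: every known state is
// reset to 0, then the active state is set to 1, so exactly one label
// value is 1 at any time. Illustrative sketch only.
func setState(gauge map[string]float64, states []string, active string) {
	for _, s := range states {
		gauge[s] = 0
	}
	gauge[active] = 1
}

func main() {
	health := map[string]float64{}
	states := []string{"loading", "healthy", "degraded"}
	setState(health, states, "loading")
	setState(health, states, "healthy") // transition: loading -> healthy
	fmt.Println(health["loading"], health["healthy"], health["degraded"]) // prints "0 1 0"
}
```

Resetting all states before setting the new one is what prevents two label values from being 1 simultaneously during a transition.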
Revisions
| Metric | Type | Description |
|---|---|---|
netsy_latest_revision | Gauge | Highest revision present in this Node’s local SQLite database. |
netsy_committed_revision | Gauge | Current committed_revision on this Node. |
netsy_compaction_revision | Gauge | Latest accepted compaction revision on this Node. |
netsy_primary_object_storage_revision | Gauge | Highest revision known by the current Primary to be durably written to object storage. |
Startup, Catchup, and Draining
These metrics make long-running phases visible beyond the high-level state gauges.
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_loading_stage_duration_seconds | Histogram | stage, result | Duration of individual loading stages. |
netsy_loading_restarts_total | Counter | reason | Number of times the loading flow restarts. |
netsy_local_db_rebuilds_total | Counter | reason | Number of times a Node discards or rebuilds local database state and starts over from snapshot, chunks, or a fresh schema. |
netsy_primary_preflight_stage_duration_seconds | Histogram | stage, result | Duration of Primary preflight stages. |
netsy_primary_drain_duration_seconds | Histogram | result | Time spent draining before stepping down or exiting. |
netsy_primary_chunk_buffer_flushes_total | Counter | trigger, result | Chunk buffer flush attempts. trigger values: size, age, draining, rollback, manual. |
netsy_primary_chunk_buffer_flush_duration_seconds | Histogram | trigger, result | Chunk buffer flush duration. |
Client API (Read and Write Proxy)
These metrics cover the Client API surface, including range (read) requests served directly by any Node and write requests proxied by Replicas to the Primary.
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_client_requests_total | Counter | kind, result | Client API requests handled by this Node. kind is range, txn, put, delete, compaction, or lease. result is success or error. |
netsy_client_request_duration_seconds | Histogram | kind | Client API request duration. |
netsy_replica_proxy_requests_total | Counter | kind, result | Write requests proxied by this Replica to the Primary. |
netsy_replica_proxy_request_duration_seconds | Histogram | kind | Duration of proxied write requests, measured from the Replica’s perspective (includes network round-trip to Primary). |
Write Path
The Primary chooses between synchronous object storage writes and quorum writes based on Replica health and quorum configuration.
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_primary_write_path | Gauge | path | Current write path. Exactly one label value is 1. Values: sync, quorum. |
netsy_primary_quorum_rollbacks_total | Counter | reason | Number of quorum transaction rollbacks. Reasons include receipt_timeout and insufficient_receipts. |
netsy_primary_write_transactions_total | Counter | path, result | Total write transactions attempted by the Primary. |
netsy_primary_write_duration_seconds | Histogram | path, result | End-to-end write transaction duration. |
netsy_primary_required_receipts | Gauge | | Current Replica Receipt threshold required for quorum writes. |
netsy_primary_healthy_replicas | Gauge | | Number of Replicas currently counted as healthy for quorum by the Primary. |
netsy_primary_receipted_replicas | Gauge | | Number of Replicas that have successfully receipted at least once and are therefore eligible to count toward quorum. |
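The write-path decision described above can be sketched as a simple threshold check. `chooseWritePath` is an illustrative helper, not Netsy's actual function; it encodes only what this document states, namely that the write path falls back to `sync` when healthy Replicas drop below the required receipt threshold.

```go
package main

import "fmt"

// chooseWritePath sketches the Primary's write-path decision: quorum
// writes require enough healthy Replicas to reach the receipt
// threshold; otherwise the Primary falls back to synchronous object
// storage writes. Illustrative only.
func chooseWritePath(healthyReplicas, requiredReceipts int) string {
	if healthyReplicas >= requiredReceipts {
		return "quorum"
	}
	return "sync" // netsy_primary_write_path{path="sync"} becomes 1
}

func main() {
	fmt.Println(chooseWritePath(2, 2)) // prints "quorum"
	fmt.Println(chooseWritePath(1, 2)) // prints "sync"
}
```

On dashboards this corresponds to comparing `netsy_primary_healthy_replicas` against `netsy_primary_required_receipts` and watching `netsy_primary_write_path`.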
Replica Health and Replication
These metrics describe the Primary’s view of Replica health and quorum eligibility for write decisions.
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_replica_receipt_age_seconds | Gauge | | Seconds since this Node last successfully sent a Receipt to the Primary, computed at collect-time. |
netsy_primary_replication_streams | Gauge | | Number of currently connected replication streams. |
Elector Cluster View and Elections
These metrics describe the Elector’s view of registered and healthy Nodes, and the results of Primary election attempts.
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_elector_registered_nodes | Gauge | | Number of currently registered Nodes in the Elector's in-memory map. |
netsy_elector_healthy_nodes | Gauge | | Number of Nodes currently in Healthy Health State according to the Elector. |
netsy_elector_degraded_nodes | Gauge | | Number of Nodes currently in Degraded Health State according to the Elector. |
netsy_elector_primary_elections_total | Counter | result | Primary elections run by this Node as the Elector. Values: success, failure. |
netsy_elector_primary_election_failures_total | Counter | reason | Failed Primary elections by failure reason. |
netsy_elector_primary_election_duration_seconds | Histogram | result | End-to-end Primary election duration. |
netsy_elector_primary_election_contacts_total | Counter | result | Node contact attempts made by the Elector during Primary elections. Values: success, failure. |
Chunk Buffer
| Metric | Type | Description |
|---|---|---|
netsy_primary_chunk_buffer_records | Gauge | Number of records currently in the Chunk Buffer. |
netsy_primary_chunk_buffer_bytes | Gauge | Size in bytes of all records currently in the Chunk Buffer. |
netsy_primary_chunk_buffer_age_seconds | Gauge | Age in seconds of the oldest unflushed record in the Chunk Buffer, computed at collect-time. 0 when buffer is empty. |
Object Storage
Write metrics carry kind (chunk or snapshot) and mode (sync or async) labels so operators can distinguish client-facing sync writes from background buffer flushes. Read metrics carry only result — reads are off the hot path (bootstrap, preflight, discovery). Only explicitly instrumented write paths record metrics; internal metadata writes (discovery, registration) are excluded.
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_object_storage_writes_total | Counter | kind, mode, result | Object storage write attempts. kind is chunk or snapshot. mode is sync or async. |
netsy_object_storage_write_duration_seconds | Histogram | kind, mode, result | Object storage write duration. |
netsy_object_storage_write_bytes | Histogram | kind, mode | Payload size written to object storage. |
netsy_object_storage_reads_total | Counter | result | Object storage read attempts. |
netsy_object_storage_read_duration_seconds | Histogram | result | Object storage read duration. |
Snapshots
Snapshot creation is a Primary-only maintenance operation that compacts Chunk files into a single Snapshot file. These metrics are separate from the general object storage write metrics to give visibility into snapshot scheduling and lifecycle.
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_primary_snapshot_creations_total | Counter | result | Snapshot creation attempts by the Primary. |
netsy_primary_snapshot_creation_duration_seconds | Histogram | result | End-to-end duration of snapshot creation, including reading records from SQLite and uploading to object storage. |
netsy_primary_snapshot_age_seconds | Gauge | | Seconds since the last successful snapshot was created, computed at collect-time. Useful for alerting when snapshots are not being created on schedule. |
Retries
Every retry path has a corresponding counter so operators can see degradation before it becomes a failure.
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_retries_total | Counter | operation | Retry attempts by operation. Values include object_storage_write, heartbeat_send, receipt_send, compaction_confirmation, election_contact, node_registration. |
Service Discovery and Registration
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_node_registration_duration_seconds | Histogram | result | Duration of a Node’s registration attempt with the Elector during loading. |
netsy_elector_auto_deregistrations_total | Counter | | Number of Nodes automatically deregistered by the Elector after exceeding `elector.deregistration_timeout`. |
Watches and Compaction
Note: `netsy_watch_min_revision` is emitted per Node. Detecting compaction-blocking skew requires comparing this metric across Prometheus instance labels (e.g. `min(netsy_watch_min_revision) by (instance)`).
| Metric | Type | Labels | Description |
|---|---|---|---|
netsy_watchers | Gauge | | Number of connected Watchers on this Node. |
netsy_watches | Gauge | | Number of active Watches on this Node. |
netsy_watch_min_revision | Gauge | | Minimum revision across active Watches on this Node. If there are no active Watches, this equals netsy_committed_revision. |
netsy_compaction_duration_seconds | Histogram | result | Duration of local compaction work on an individual Node after a compaction revision has been accepted. |
netsy_primary_compactions_total | Counter | result | Compaction coordination runs initiated by the current Primary. |
netsy_primary_compaction_coordination_duration_seconds | Histogram | result | Duration of a cluster-wide compaction coordination run on the Primary. |
netsy_primary_compaction_confirmation_failures_total | Counter | reason | Failed compaction confirmations by reason. |
Structured Logging
Netsy uses structured logs for all operationally significant events. Each log entry includes at minimum:
- `msg`
- `cluster_id`
- `node_id`
- timestamp
Logs should use stable event names and stable key names. Put detailed error text in `error`. Put short, bounded failure categories in `reason` so logs can be aggregated and alerted on safely.
State and Lifecycle Events
Logged whenever a Node changes state or enters or exits a major lifecycle phase.
| Key | Description |
|---|---|
msg | state_transition, loading_stage_started, loading_stage_completed, loading_restarted, primary_preflight_stage_started, primary_preflight_stage_completed, drain_started, drain_completed |
state_type | health, elector, or primary for state_transition |
previous | Previous state value |
new | New state value |
stage | Current loading, preflight, or drain stage |
reason | Bounded trigger or failure reason |
duration_ms | Stage duration when completed |
error | Optional error text |
When local database state is newly created, discarded, or rebuilt, Netsy should emit dedicated lifecycle logs such as local_db_initialized, local_db_rebuild_started, and local_db_rebuild_completed with a bounded reason field.
Election Events
Logged when an election starts, advances, completes, or fails.
| Key | Description |
|---|---|
msg | election_started, election_stage_completed, election_completed, or election_failed |
role | elector or primary |
stage | Election stage name |
elected_node_id | The Node elected on completion |
reason | Failure reason on failure |
registered_nodes | Number of registered Nodes |
contacted_nodes | Number of Nodes successfully contacted |
healthy_candidates | Number of healthy candidate Replicas considered |
duration_ms | Election duration |
Write Path and Transaction Events
Logged when the Primary switches write mode or when a write fails in an operationally interesting way.
| Key | Description |
|---|---|
msg | write_path_switched, write_transaction (debug level), quorum_rollback, write_failed, chunk_buffer_flush_started, chunk_buffer_flush_completed |
path | Current write path |
from | Previous write path on switch |
to | New write path on switch |
reason | Bounded trigger or failure reason |
required_receipts | Quorum threshold at the time |
received_receipts | Number of Receipts received for the transaction |
healthy_replicas | Primary’s healthy Replica count |
revision | Assigned revision for write_transaction |
trigger | Chunk buffer flush trigger |
duration_ms | Flush or write duration |
error | Optional error text |
The write_transaction message is logged at debug level for every completed write transaction on the Primary. It includes path, revision, healthy_replicas, required_receipts, received_receipts (quorum only), and duration_ms. This is intentionally debug-level to avoid noise during normal operation, but is invaluable for diagnosing why individual transactions chose a particular write path.
Registration Events
Logged by the Elector when Node registration changes.
| Key | Description |
|---|---|
msg | node_registered or node_deregistered |
target_node_id | The Node that registered or deregistered |
member_id | The stable etcd member_id assigned or re-used |
trigger | startup, direct, or auto |
reason | Bounded trigger for deregistration (e.g. timeout, shutdown, manual) |
duration_ms | Registration duration for node_registered |
error | Optional error text on registration failure |
Object Storage and Compaction Events
| Key | Description |
|---|---|
msg | object_storage_write, compaction_started, compaction_notice_failed, compaction_completed |
kind | chunk or snapshot |
mode | sync, async, or maintenance |
revision | Relevant revision for the operation |
compaction_revision | Compaction revision when relevant |
reason | Bounded failure reason |
duration_ms | Operation duration |
error | Optional error text |
Debugging
Key Relationships
When diagnosing issues, the following metric relationships are useful:
- Quorum eligibility: compare `netsy_primary_healthy_replicas` and `netsy_primary_required_receipts`. If healthy Replicas fall below the required threshold, `netsy_primary_write_path{path="sync"}` becomes `1`.
- Replica-specific degradation: compare `netsy_primary_healthy_replicas`, `netsy_primary_required_receipts`, `netsy_primary_replication_streams`, and each Replica's own `netsy_replica_receipt_age_seconds`. Use structured logs to identify which specific Replica is timing out, disconnected, or no longer counted toward quorum.
- Replication lag: compare `netsy_latest_revision` and `netsy_committed_revision` across Nodes to identify Replicas that are behind the current committed point.
- Object storage lag: on the Primary, compare `netsy_latest_revision` and `netsy_primary_object_storage_revision` to measure how far async object storage writes are behind quorum-committed data.
- Buffer pressure: rising `netsy_primary_chunk_buffer_bytes` together with rising `netsy_primary_chunk_buffer_age_seconds` suggests async flushes are not keeping up.
- Retry pressure: rising `netsy_retries_total` for `operation="object_storage_write"`, `operation="heartbeat_send"`, or `operation="receipt_send"` indicates degradation in progress. A sustained increase in object storage write retries may precede the Primary entering Draining.
- Loading stalls: if `netsy_state_health{state="loading"}` remains `1`, inspect recent `loading_stage_started` and `loading_stage_completed` logs together with `netsy_loading_stage_duration_seconds` to see which startup step is slow or failing.
- Local DB rebuild churn: increases in `netsy_local_db_rebuilds_total` indicate repeated local-database resets or rebuilds. Correlate with `local_db_rebuild_started` and `local_db_rebuild_completed` logs to see the trigger.
- Election stalls: inspect `netsy_elector_primary_election_duration_seconds`, `netsy_elector_primary_election_failures_total`, `netsy_elector_primary_election_contacts_total`, and election logs to determine whether the cluster is blocked on prior-Primary contact, node contactability, or candidate validation.
- Compaction stalls: rising `netsy_watch_min_revision` skew across Nodes (compare by Prometheus `instance` label), long `netsy_compaction_duration_seconds`, or repeated increments in `netsy_primary_compaction_confirmation_failures_total` indicate watch-admission, confirmation, or local compaction-work issues.
- Snapshot staleness: a rising `netsy_primary_snapshot_age_seconds` or repeated `netsy_primary_snapshot_creations_total{result="error"}` increments indicate snapshots are not being created, which increases loading/recovery time for new Nodes.
- Registration issues: rising `netsy_retries_total{operation="node_registration"}` or long `netsy_node_registration_duration_seconds` during loading indicate Service Discovery or Elector connectivity problems. Correlate with `node_registered`/`node_deregistered` logs.
- Client API health: rising `netsy_client_requests_total{result="error"}` or elevated `netsy_client_request_duration_seconds` indicates client-facing degradation. Compare `netsy_replica_proxy_requests_total` error rates on Replicas to determine whether failures originate from the Replica's proxy path or the Primary itself.
Alerting Notes
- Alerts should be written with Prometheus staleness semantics in mind. Role-specific metrics disappear when a Node is no longer the Primary or Elector.
- Prefer alerting on sustained conditions using `for:` rather than single scrape failures.
- Prefer counters for rate-based alerts and gauges for current-role or current-health dashboards.
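A sustained-condition alert following these notes might look like the sketch below. The alert names, thresholds, and durations are illustrative assumptions, not shipped defaults; only the metric names come from this document.

```yaml
groups:
  - name: netsy
    rules:
      # Primary has fallen back to synchronous object storage writes.
      # `for:` avoids firing on a single scrape; tune per deployment.
      - alert: NetsyPrimaryOnSyncPath
        expr: netsy_primary_write_path{path="sync"} == 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Primary is on the sync write path (quorum unavailable)"

      # Snapshots are not being created on schedule, increasing
      # loading/recovery time for new Nodes.
      - alert: NetsySnapshotsStale
        expr: netsy_primary_snapshot_age_seconds > 86400
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "No successful snapshot in over 24 hours"
```

Note that both expressions use Primary-only metrics, so Prometheus staleness marks them absent (and resolves the alerts) when the Node steps down, as described above.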