Observability

HiveCFM is instrumented on two sides: the Node web app in hivecfm-core and the Go service in hivecfm-hub. Both emit structured logs and OpenTelemetry signals. This page documents what is actually wired up today.

Logging

Core (Node)

hivecfm-core/packages/logger/src/logger.ts wraps pino. Output is line-delimited JSON in production and pretty-printed in dev:

import Pino, { type Logger, type LoggerOptions, stdSerializers } from "pino";
 
const IS_PRODUCTION = !process.env.NODE_ENV || process.env.NODE_ENV === "production";
const IS_BUILD = process.env.NEXT_PHASE === "phase-production-build";
// productionConfig -> plain JSON to stdout
// developmentConfig -> pino-pretty for readable local logs

Every app imports the logger via @hivecfm/logger. The package exports a singleton that is safe to use from anywhere, and its public API matches the standard levels: debug, info, warn, error.

Two practical rules:

  • Log structured fields, not interpolated strings. logger.info({ userId, surveyId }, "survey archived") — not logger.info(`archived ${id}`). The JSON fields are indexable in downstream tools.
  • Never log secrets. pino’s redact list in logger.ts strips common tokens; if you add a new secret-bearing field, extend that list.

Hub (Go)

The Hub, the Go service that owns background processing, integrations, and the admin API (a sibling to Core), uses Go’s standard-library slog. It is configured in hivecfm-hub/cmd/api/main.go:

func setupLogging(level string) {
    var logLevel slog.Level
    switch strings.ToLower(level) {
    case "debug": logLevel = slog.LevelDebug
    case "info":  logLevel = slog.LevelInfo
    // …
    }
    opts := &slog.HandlerOptions{Level: logLevel}
    handler := slog.NewTextHandler(os.Stdout, opts)
    slog.SetDefault(slog.New(handler))
}

Logs go to stdout in the slog text format today. A JSON handler can be swapped in by changing one line when we wire up a log aggregator.
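The handler swap described above could look roughly like the sketch below. The newHandler helper, its format argument, and the sample field names are hypothetical and not part of the Hub's setupLogging; slog.NewTextHandler and slog.NewJSONHandler are the standard-library constructors. The sample record also applies the structured-field rule from the Core section in Go: key/value pairs rather than an interpolated message.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log/slog"
	"os"
)

// newHandler is a hypothetical helper for this sketch: "json" selects the
// JSON handler, anything else keeps the text handler the Hub uses today.
func newHandler(w io.Writer, format string, level slog.Level) slog.Handler {
	opts := &slog.HandlerOptions{Level: level}
	if format == "json" {
		return slog.NewJSONHandler(w, opts)
	}
	return slog.NewTextHandler(w, opts)
}

// logSample writes one structured record through the JSON handler into a
// buffer and returns the decoded fields, so the output shape can be inspected.
func logSample() (map[string]any, error) {
	var buf bytes.Buffer
	logger := slog.New(newHandler(&buf, "json", slog.LevelInfo))

	// Dropped: Debug is below the configured Info level.
	logger.Debug("not emitted")
	// Structured fields go in as key/value pairs, not into the message.
	logger.Info("webhook dispatched", "webhookId", "wh_123", "attempt", 2)

	fields := map[string]any{}
	err := json.Unmarshal(buf.Bytes(), &fields)
	return fields, err
}

func main() {
	// Text format, as emitted today.
	text := slog.New(newHandler(os.Stdout, "text", slog.LevelInfo))
	text.Info("webhook dispatched", "webhookId", "wh_123", "attempt", 2)

	// JSON format, decoded back to show the indexable fields.
	fields, err := logSample()
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(fields["level"], fields["msg"], fields["webhookId"], fields["attempt"])
}
```

Both handlers accept the same slog.HandlerOptions, so the level handling in setupLogging would not need to change for the swap.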

If you are coming from .NET’s ILogger<T>, both loggers are conceptually equivalent to it; they just ship fewer extension-method helpers. You lose a little syntactic sugar and gain a line-oriented format that every log collector understands.

Metrics — OpenTelemetry on both sides

Core — @opentelemetry/*

hivecfm-core/apps/web/package.json pulls in the full OTel stack:

  • @opentelemetry/sdk-metrics
  • @opentelemetry/exporter-prometheus — exposes a Prometheus scrape endpoint
  • @opentelemetry/host-metrics — process CPU, memory, event loop lag
  • @opentelemetry/instrumentation-http, @opentelemetry/instrumentation-runtime-node — HTTP and Node runtime auto-instrumentation
  • @opentelemetry/sdk-logs — ready for log-signal shipping when enabled

Metrics are served on a Prometheus-scrapable endpoint that the pipeline’s scrape config reads.

Hub — OpenTelemetry SDK

hivecfm-hub/internal/observability/provider.go wires the Go OTel SDK. Metrics go out as OTLP over HTTP to an endpoint configured by OTEL_EXPORTER_OTLP_ENDPOINT:

import (
  "go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
  sdkmetric "go.opentelemetry.io/otel/sdk/metric"
  sdktrace  "go.opentelemetry.io/otel/sdk/trace"
)

Enable via OTEL_METRICS_EXPORTER=otlp. The SDK reads the endpoint and auth headers from the standard OTel env vars (OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS). If OTEL_METRICS_EXPORTER is unset, metrics are a no-op.
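To make the gating concrete, here is a minimal stdlib-only sketch of how the three variables interact. It is an illustration, not the code in provider.go: the otlpSettings type and settingsFromEnv function are hypothetical, and in the real service the OTLP exporter reads the endpoint and headers itself. The only format fact relied on is that OTEL_EXPORTER_OTLP_HEADERS is a comma-separated list of key=value pairs.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// otlpSettings is a hypothetical struct used only in this sketch.
type otlpSettings struct {
	Enabled  bool
	Endpoint string
	Headers  map[string]string
}

// settingsFromEnv reads the standard OTel variables through getenv, which
// lets callers pass os.Getenv in production and a fake lookup in tests.
func settingsFromEnv(getenv func(string) string) otlpSettings {
	s := otlpSettings{Headers: map[string]string{}}
	s.Enabled = strings.TrimSpace(getenv("OTEL_METRICS_EXPORTER")) == "otlp"
	s.Endpoint = strings.TrimSpace(getenv("OTEL_EXPORTER_OTLP_ENDPOINT"))

	// Comma-separated key=value pairs; values are kept as-is in this sketch.
	for _, pair := range strings.Split(getenv("OTEL_EXPORTER_OTLP_HEADERS"), ",") {
		key, value, ok := strings.Cut(pair, "=")
		if !ok {
			continue // skip empty or malformed entries
		}
		if key = strings.TrimSpace(key); key != "" {
			s.Headers[key] = strings.TrimSpace(value)
		}
	}
	return s
}

func main() {
	s := settingsFromEnv(os.Getenv)
	if !s.Enabled {
		fmt.Println("metrics: no-op (OTEL_METRICS_EXPORTER is not set to otlp)")
		return
	}
	fmt.Printf("metrics: OTLP to %q with %d header(s)\n", s.Endpoint, len(s.Headers))
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the sketch testable without mutating the process environment.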

Domain-specific metric types live as interfaces in internal/observability/:

  • EmbeddingMetrics — embedding job duration, success/failure class (internal/observability/embeddings.go).
  • WebhookMetrics — dispatch outcomes, retry counts (internal/observability/webhooks.go).
  • CacheMetrics — search query cache hit/miss/load (internal/observability/cache.go).

Each worker and service takes the relevant interface as a constructor dependency; tests inject no-ops.
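The constructor-injection pattern can be sketched as follows. Only the WebhookMetrics name comes from this page; the method set (RecordDispatch, RecordRetry), the Dispatcher type, and the counting fake are hypothetical stand-ins chosen for illustration and may not match internal/observability/webhooks.go.

```go
package main

import (
	"errors"
	"fmt"
)

// WebhookMetrics: the name is from the docs, the methods are assumptions.
type WebhookMetrics interface {
	RecordDispatch(outcome string)
	RecordRetry()
}

// noopWebhookMetrics discards everything; this is what tests would inject.
type noopWebhookMetrics struct{}

func (noopWebhookMetrics) RecordDispatch(string) {}
func (noopWebhookMetrics) RecordRetry()          {}

// countingWebhookMetrics is an in-memory fake for asserting on metric calls.
type countingWebhookMetrics struct {
	outcomes map[string]int
	retries  int
}

func (c *countingWebhookMetrics) RecordDispatch(outcome string) { c.outcomes[outcome]++ }
func (c *countingWebhookMetrics) RecordRetry()                  { c.retries++ }

var errSendFailed = errors.New("send failed")

// Dispatcher is a hypothetical worker that takes its metrics as a dependency.
type Dispatcher struct {
	metrics     WebhookMetrics
	send        func(url string) error
	maxAttempts int
}

func NewDispatcher(m WebhookMetrics, send func(string) error, maxAttempts int) *Dispatcher {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	return &Dispatcher{metrics: m, send: send, maxAttempts: maxAttempts}
}

// Dispatch tries up to maxAttempts times, recording each retry and one
// final outcome ("success" or "failure").
func (d *Dispatcher) Dispatch(url string) error {
	var err error
	for attempt := 1; attempt <= d.maxAttempts; attempt++ {
		if attempt > 1 {
			d.metrics.RecordRetry()
		}
		if err = d.send(url); err == nil {
			d.metrics.RecordDispatch("success")
			return nil
		}
	}
	d.metrics.RecordDispatch("failure")
	return err
}

func main() {
	calls := 0
	flaky := func(string) error { // fails twice, then succeeds
		calls++
		if calls < 3 {
			return errSendFailed
		}
		return nil
	}
	m := &countingWebhookMetrics{outcomes: map[string]int{}}
	err := NewDispatcher(m, flaky, 5).Dispatch("https://example.test/hook")
	fmt.Println("err:", err, "retries:", m.retries, "outcomes:", m.outcomes)
}
```

Because the worker depends only on the interface, the OTel-backed implementation, the no-op, and a counting fake are interchangeable without touching the dispatch logic.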

Tracing

The Hub exports traces via OTLP when OTEL_TRACES_EXPORTER=otlp. Span attributes are set via helpers in internal/observability/names.go so names stay consistent across services.

Core’s trace wiring is declared (the SDK packages are installed) but is typically enabled only on the managed cloud target — OTEL env vars are set in the Container Apps config rather than the repo.

Dashboards

Dashboards are not in this repo. They live in whichever observability backend your environment points OTEL at:

  • Managed cloud target → Azure Monitor / Application Insights (via the OTLP exporter → ACA agent).
  • Self-host → any OTLP-compatible sink (Grafana + Tempo + Prometheus is the common pairing).

If you are standing up a new environment, start by pointing OTEL_EXPORTER_OTLP_ENDPOINT at your collector and confirming a metric shows up. Dashboards follow once there is data.

Local dev

In local Docker Compose (the YAML file and CLI that spin up Postgres, Redis, MinIO, and MailHog locally with one command), neither OTLP exporter is enabled by default. The env vars are empty, so the SDK runs but is a no-op. You get logs to stdout and nothing else. That is intentional: dev does not need a collector.

To exercise the metrics path locally, set:

export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

…and run an OTel collector on port 4318.
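If no collector is at hand and you only want to confirm that exports leave the process, a throwaway stand-in can listen on the same port. The sketch below is not an OTel collector and not part of the repo: newSink, selfCheck, and the "serve" argument are hypothetical. It relies only on the OTLP/HTTP conventions that exports are POSTed to /v1/metrics and /v1/traces and that a 200 response with an empty body counts as success.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"os"
	"sync/atomic"
)

// newSink returns a handler that accepts OTLP/HTTP exports, logs their size,
// and counts them. Payloads are discarded, not decoded.
func newSink(received *atomic.Int64) http.Handler {
	accept := func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "POST only", http.StatusMethodNotAllowed)
			return
		}
		n, _ := io.Copy(io.Discard, r.Body)
		received.Add(1)
		log.Printf("%s: %d bytes (%s)", r.URL.Path, n, r.Header.Get("Content-Type"))
		// An empty body is a valid encoding of an empty export response.
		w.Header().Set("Content-Type", "application/x-protobuf")
		w.WriteHeader(http.StatusOK)
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/metrics", accept)
	mux.HandleFunc("/v1/traces", accept)
	return mux
}

// selfCheck posts a dummy payload to the handler in-process and returns the
// response status and the number of exports counted.
func selfCheck() (int, int64) {
	var received atomic.Int64
	req := httptest.NewRequest(http.MethodPost, "/v1/metrics", bytes.NewReader([]byte{0x0a, 0x00}))
	rec := httptest.NewRecorder()
	newSink(&received).ServeHTTP(rec, req)
	return rec.Code, received.Load()
}

func main() {
	// "serve" listens on the port used in the exports above; without it the
	// program only runs the in-process self-check and exits.
	if len(os.Args) > 1 && os.Args[1] == "serve" {
		var received atomic.Int64
		log.Fatal(http.ListenAndServe("localhost:4318", newSink(&received)))
	}
	status, count := selfCheck()
	fmt.Println("self-check status:", status, "exports:", count)
}
```

Seeing a log line per export confirms the env vars are taking effect; inspecting the metrics themselves still requires a real collector.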

See also

  • Hub / Workers: the workers whose *Metrics interfaces are summarised above.
  • Hub / Search: the cache metrics in context.