Search

Semantic search is the Hub’s most-used synchronous surface. A user types a phrase; the Hub returns feedback records ordered by meaning-similarity to that phrase. The full implementation lives in hivecfm-hub/internal/service/search_service.go.

The request/response shape

POST /search
{
  "tenant_id": "...",
  "query":     "the login button is confusing",
  "limit":     20,
  "min_score": 0.55,
  "cursor":    null,
  "filters":   { "source_id": "...", "since": "2026-01-01T00:00:00Z" }
}

The response is a list of { feedback_record_id, score } objects, plus a next_cursor when more pages exist.
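For readers wiring up a client, the payload above could be modelled roughly as follows. This is a sketch only: the Go type names, the field types, and the "results" key in the response are assumptions inferred from the example JSON, not taken from the Hub source.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// SearchFilters mirrors the "filters" object. Field types are assumptions.
type SearchFilters struct {
	SourceID string     `json:"source_id,omitempty"`
	Since    *time.Time `json:"since,omitempty"` // RFC 3339 timestamp
}

// SearchRequest mirrors the POST /search body shown above.
type SearchRequest struct {
	TenantID string        `json:"tenant_id"`
	Query    string        `json:"query"`
	Limit    int           `json:"limit"`
	MinScore float64       `json:"min_score"`
	Cursor   *string       `json:"cursor"` // null on the first page
	Filters  SearchFilters `json:"filters"`
}

// SearchHit is one entry of the response list.
type SearchHit struct {
	FeedbackRecordID string  `json:"feedback_record_id"`
	Score            float64 `json:"score"`
}

// SearchResponse wraps the hits. The "results" key is hypothetical.
type SearchResponse struct {
	Results    []SearchHit `json:"results"`
	NextCursor *string     `json:"next_cursor,omitempty"`
}

// decodeSearchRequest parses a raw JSON body into a SearchRequest.
func decodeSearchRequest(body []byte) (SearchRequest, error) {
	var req SearchRequest
	err := json.Unmarshal(body, &req)
	return req, err
}

func main() {
	body := []byte(`{
	  "tenant_id": "t1",
	  "query": "the login button is confusing",
	  "limit": 20,
	  "min_score": 0.55,
	  "cursor": null,
	  "filters": {"source_id": "s1", "since": "2026-01-01T00:00:00Z"}
	}`)
	req, err := decodeSearchRequest(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Query, req.Limit, req.Cursor == nil)
}
```

Using a pointer for the cursor keeps "no cursor" (JSON null) distinct from an empty string.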

Inside `SearchService`

The struct (internal/service/search_service.go):

type SearchService struct {
    embeddingClient     EmbeddingClient
    embeddingsRepo      EmbeddingsRepositoryForSearch
    sentimentClassifier SentimentClassifier
    model               string
    queryCache          *lru.Cache[string, []float32]
    queryLoadGroup      singleflight.Group
    cacheMetrics        observability.CacheMetrics
    logger              *slog.Logger
}

Each dependency maps to a step in the search path.

The search path, step by step

1. Input validation

The request must carry a non-empty tenant_id (otherwise the sentinel ErrMissingTenantID is returned) and a query that is non-empty after trimming (otherwise ErrEmptyQuery). Everything is scoped by tenant: embeddings from tenant A are never visible to tenant B.
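A minimal sketch of this check is shown below. The sentinel names come from the text above; the helper name validateSearchInput, its signature, and the error messages are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Sentinel errors named in the text. The messages are assumptions.
var (
	ErrMissingTenantID = errors.New("missing tenant id")
	ErrEmptyQuery      = errors.New("empty query")
)

// validateSearchInput is a hypothetical helper. It returns the trimmed
// query so later steps (cache key, embedding call) see a normalised string.
func validateSearchInput(tenantID, query string) (string, error) {
	if tenantID == "" {
		return "", ErrMissingTenantID
	}
	trimmed := strings.TrimSpace(query)
	if trimmed == "" {
		return "", ErrEmptyQuery
	}
	return trimmed, nil
}

func main() {
	_, err := validateSearchInput("tenant-a", "   ")
	fmt.Println(errors.Is(err, ErrEmptyQuery))

	q, err := validateSearchInput("tenant-a", "  slow checkout ")
	fmt.Printf("%q %v\n", q, err)
}
```

Sentinel errors let an HTTP handler map each case to a client error with errors.Is instead of matching on message strings.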

2. Embed the query

The query string is turned into a 768-d vector by the embedding client. Two optimisations:

LRU cache. queryCache holds the N most recently used (query, model) → vector pairs. A repeated query skips the embedding round-trip.
Singleflight. If 100 concurrent requests hit a cold cache for the same query, singleflight.Group lets exactly one perform the embedding call; the other 99 block until that call completes and share its result. This prevents a thundering herd on a popular search.

Cache hit/miss/load events emit through cacheMetrics (see DevOps / Observability).
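The interplay of the two optimisations can be sketched with the standard library alone. The code below is not the Hub's implementation: an unbounded map stands in for the LRU cache, a hand-rolled in-flight table stands in for singleflight.Group, and the names cachedEmbedder and embedFunc are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// embedFunc stands in for the embedding client call (hypothetical signature).
type embedFunc func(query string) ([]float32, error)

// call tracks one in-flight embedding request that followers can wait on.
type call struct {
	done chan struct{}
	vec  []float32
	err  error
}

type cachedEmbedder struct {
	mu       sync.Mutex
	cache    map[string][]float32 // stand-in for the bounded LRU cache
	inflight map[string]*call     // stand-in for singleflight.Group
	embed    embedFunc
	model    string
}

func newCachedEmbedder(model string, embed embedFunc) *cachedEmbedder {
	return &cachedEmbedder{
		cache:    make(map[string][]float32),
		inflight: make(map[string]*call),
		embed:    embed,
		model:    model,
	}
}

// Embed returns a cached vector, joins an in-flight load, or becomes the
// single loader for this (model, query) key.
func (c *cachedEmbedder) Embed(query string) ([]float32, error) {
	key := c.model + "\x00" + query
	c.mu.Lock()
	if vec, ok := c.cache[key]; ok { // cache hit
		c.mu.Unlock()
		return vec, nil
	}
	if cl, ok := c.inflight[key]; ok { // follower: wait for the leader
		c.mu.Unlock()
		<-cl.done
		return cl.vec, cl.err
	}
	cl := &call{done: make(chan struct{})} // leader: perform the load
	c.inflight[key] = cl
	c.mu.Unlock()

	cl.vec, cl.err = c.embed(query)

	c.mu.Lock()
	if cl.err == nil {
		c.cache[key] = cl.vec
	}
	delete(c.inflight, key)
	c.mu.Unlock()
	close(cl.done) // release followers after the result is stored
	return cl.vec, cl.err
}

func main() {
	var calls int64
	e := newCachedEmbedder("example-model", func(q string) ([]float32, error) {
		atomic.AddInt64(&calls, 1)
		return make([]float32, 768), nil
	})
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if _, err := e.Embed("the login button is confusing"); err != nil {
				panic(err)
			}
		}()
	}
	wg.Wait()
	fmt.Println("embedding calls:", atomic.LoadInt64(&calls))
}
```

Because the leader writes the cache entry before removing the in-flight entry, a late request sees either the in-flight call or the cached vector, never a gap between the two. Errors are not cached in this sketch, so a failed load is retried by the next request.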

3. Nearest-neighbour query

embeddingsRepo.NearestFeedbackRecordsByEmbedding(...) runs something like:

SELECT fr.id, fr.content, (e.embedding <=> $query_vec) AS distance
FROM embeddings e
JOIN feedback_records fr ON fr.id = e.feedback_record_id
WHERE fr.tenant_id = $tenant_id
  AND e.model      = $model
  AND (e.embedding <=> $query_vec) <= $max_distance
ORDER BY e.embedding <=> $query_vec
LIMIT $limit + 1;

Postgres uses the HNSW index (USING hnsw (embedding halfvec_cosine_ops)) added in hivecfm-hub/migrations/004_add_feedback_records_embedding.sql. The +1 is the “has-more?” sentinel for pagination.
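Two small pieces of arithmetic surround this query and may be easier to follow in code. pgvector's <=> operator returns cosine distance, which is 1 minus cosine similarity. The sketch below assumes the API's score is that similarity, so min_score maps to max_distance as 1 - min_score; the Hub may define score differently. The helper and type names are hypothetical.

```go
package main

import "fmt"

// neighbour is one row from the nearest-neighbour query.
type neighbour struct {
	FeedbackRecordID string
	Distance         float64 // cosine distance from the <=> operator
}

// maxDistanceForMinScore converts a minimum score into a distance cut-off,
// assuming score = 1 - cosine distance (an assumption, see lead-in).
func maxDistanceForMinScore(minScore float64) float64 {
	return 1 - minScore
}

// scoreFromDistance is the inverse mapping, under the same assumption.
func scoreFromDistance(distance float64) float64 {
	return 1 - distance
}

// trimPage receives up to limit+1 rows. The extra row is never returned;
// its presence only signals that another page exists.
func trimPage(rows []neighbour, limit int) (page []neighbour, hasMore bool) {
	if len(rows) > limit {
		return rows[:limit], true
	}
	return rows, false
}

func main() {
	rows := []neighbour{
		{"fr_1", 0.10},
		{"fr_2", 0.22},
		{"fr_3", 0.31}, // the limit+1 sentinel row
	}
	page, hasMore := trimPage(rows, 2)
	for _, n := range page {
		fmt.Printf("%s score=%.2f\n", n.FeedbackRecordID, scoreFromDistance(n.Distance))
	}
	fmt.Println("has more:", hasMore)
	fmt.Printf("max distance for min_score 0.55: %.2f\n", maxDistanceForMinScore(0.55))
}
```

Fetching one extra row avoids a separate COUNT query just to decide whether to emit a cursor.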

4. Optional sentiment re-rank

If the service was constructed with a SentimentClassifier, the user’s query gets classified (positive / negative / neutral) and matching-sentiment results get a score boost (up to +35%); opposing-sentiment results get a penalty (up to -35%). Constants:

const (
    sentimentBoostFactor   = 0.35
    sentimentPenaltyFactor = 0.35
)

This is a heuristic, not a semantic guarantee — it is tuned for “looking for complaints” vs “looking for praise” UX.
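One way such a re-rank could work is sketched below. Only the two constants come from the source. The multiplicative formula, the scaling by classifier confidence (one reading of "up to" 35%), the rule that neutral leaves scores untouched, and every type and function name are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

const (
	sentimentBoostFactor   = 0.35
	sentimentPenaltyFactor = 0.35
)

type sentiment string

const (
	positive sentiment = "positive"
	negative sentiment = "negative"
	neutral  sentiment = "neutral"
)

// hit is a search result annotated with the record's sentiment (hypothetical).
type hit struct {
	FeedbackRecordID string
	Score            float64
	Sentiment        sentiment
}

// adjustScore boosts matching sentiment and penalises opposing sentiment.
// confidence is the classifier's confidence in [0, 1]; at 1.0 the full
// 35% adjustment applies. Neutral on either side leaves the score as is.
func adjustScore(score float64, query, record sentiment, confidence float64) float64 {
	if query == neutral || record == neutral {
		return score
	}
	if query == record {
		return score * (1 + sentimentBoostFactor*confidence)
	}
	return score * (1 - sentimentPenaltyFactor*confidence)
}

// rerank returns a re-scored copy of hits, highest score first.
func rerank(hits []hit, query sentiment, confidence float64) []hit {
	out := make([]hit, len(hits))
	copy(out, hits)
	for i := range out {
		out[i].Score = adjustScore(out[i].Score, query, out[i].Sentiment, confidence)
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	hits := []hit{
		{"fr_praise", 0.80, positive},
		{"fr_complaint", 0.70, negative},
	}
	// A complaint-style query: the negative record overtakes the positive one.
	for _, h := range rerank(hits, negative, 1.0) {
		fmt.Printf("%s %.3f\n", h.FeedbackRecordID, h.Score)
	}
}
```

The sketch reorders only the hits it is given; it does not fetch additional candidates from the index.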

5. Cursor pagination

Instead of OFFSET, the service emits an opaque cursor encoding (last_distance, last_feedback_record_id). The next page uses NearestFeedbackRecordsByEmbeddingAfterCursor(...) which filters with (distance, id) > (cursor) — stable even as new embeddings are added between page loads.
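The cursor mechanics can be illustrated as follows. The encoding here (JSON wrapped in URL-safe base64) and the names cursor, encodeCursor, decodeCursor, and after are assumptions; the source only states that the cursor is opaque and carries (last_distance, last_feedback_record_id).

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// cursor holds the sort key of the last row on the previous page.
type cursor struct {
	LastDistance float64 `json:"d"`
	LastID       string  `json:"id"`
}

// encodeCursor serialises the cursor into an opaque, URL-safe token.
func encodeCursor(c cursor) (string, error) {
	raw, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// decodeCursor reverses encodeCursor and rejects malformed tokens.
func decodeCursor(token string) (cursor, error) {
	var c cursor
	raw, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil {
		return c, err
	}
	err = json.Unmarshal(raw, &c)
	return c, err
}

// after reports whether a row with (distance, id) sorts strictly after the
// cursor, mirroring the SQL row comparison (distance, id) > (cursor).
func (c cursor) after(distance float64, id string) bool {
	if distance != c.LastDistance {
		return distance > c.LastDistance
	}
	return id > c.LastID
}

func main() {
	token, err := encodeCursor(cursor{LastDistance: 0.25, LastID: "fr_42"})
	if err != nil {
		panic(err)
	}
	c, err := decodeCursor(token)
	if err != nil {
		panic(err)
	}
	fmt.Println(c.after(0.25, "fr_43")) // same distance, larger id
	fmt.Println(c.after(0.20, "fr_99")) // closer than the cursor row
}
```

The comparison only yields a stable order if the query also sorts by (distance, id), so that rows with equal distances have a deterministic tie-breaker.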

Failure modes and retries

Embedding provider is down. The call returns an error; the caller gets a 5xx. The Hub does not retry at the HTTP boundary — retry is the caller’s responsibility. Background workers (not search) have River-level retries.
Empty result set. Returns [] and no cursor. Not an error.
Missing embedding for a record. If the nearest-neighbour query finds a record but the join yields no row (e.g. the embedding was deleted mid-query), the record is dropped from the results rather than failing the request. The skip is logged at warn level.
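Since retrying a failed search is left to the caller, a client might wrap the request in a small backoff loop. The sketch below is one possible shape, not part of the Hub: doSearch stands in for a single POST /search call, and the retry policy (retry on transport errors and 5xx, double the delay each time) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errExhausted = errors.New("search failed after all retry attempts")

// doSearch stands in for one POST /search call and returns the HTTP status.
type doSearch func() (status int, err error)

// searchWithRetry retries on transport errors and 5xx responses, doubling
// the delay after each failed attempt. 2xx and 4xx responses are returned
// immediately, since retrying a client error cannot succeed.
func searchWithRetry(call doSearch, maxAttempts int, baseDelay time.Duration) (int, error) {
	delay := baseDelay
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		status, err := call()
		if err == nil && status < 500 {
			return status, nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return 0, errExhausted
}

func main() {
	attempts := 0
	// Simulated provider outage: two 503 responses, then success.
	flaky := func() (int, error) {
		attempts++
		if attempts < 3 {
			return 503, nil
		}
		return 200, nil
	}
	status, err := searchWithRetry(flaky, 5, 10*time.Millisecond)
	fmt.Println(status, err, attempts)
}
```

A production client would likely add jitter and an overall deadline; both are omitted here to keep the loop readable.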

The similar-feedback lookup reuses the same repository: it loads one record’s stored vector via GetEmbeddingByFeedbackRecordAndModelAndTenant and runs the same nearest-neighbour query with it, so no query embedding call is needed.

Observability

Every search emits cache events (hit/miss/load) through observability.CacheMetrics and records its duration in the service’s generic histogram for SemanticSearch. See the dashboards section in DevOps / Observability.