pgvector and Search
HiveCFM ships semantic search over feedback records: a respondent types “the login button is confusing” and the app surfaces every related response, even ones that never contain the literal word “login”. The plumbing is built on pgvector plus the SearchService in the Hub (the Go service that owns background processing, integrations, and the admin API; sibling to Core).
What pgvector is
pgvector is a Postgres extension that adds a vector(N) column type and distance operators (<=> cosine distance, <-> L2 distance, <#> negative inner product). It lets Postgres act as a vector database without introducing a second data store.
The schema registers the extension in hivecfm-core/packages/database/schema.prisma:
extensions = [pgvector(map: "vector")]
Embeddings, briefly
An embedding is a fixed-length array of floats that captures a piece of text’s meaning. Two pieces of text whose meanings are close will have embedding vectors that are close in the vector space (cosine similarity near 1.0). We use an embedding model — configurable per tenant — to produce one vector per feedback record.
Cosine similarity (the measure we use) = dot product of two vectors divided by the product of their magnitudes. Ranges from -1 to 1; 1.0 means identical direction.
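The formula above can be written out directly. The following is a minimal sketch for illustration only; CosineSimilarity is a hypothetical helper, not a function from the Hub codebase, and returning 0 for mismatched or zero-length input is a convention chosen for this example.

```go
package main

import (
	"fmt"
	"math"
)

// CosineSimilarity returns dot(a, b) / (|a| * |b|).
// For this sketch it returns 0 when the lengths differ, the input is empty,
// or either vector has zero magnitude.
func CosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	// Same direction, orthogonal, and opposite vectors.
	fmt.Printf("%.3f\n", CosineSimilarity([]float64{1, 2, 3}, []float64{2, 4, 6}))
	fmt.Printf("%.3f\n", CosineSimilarity([]float64{1, 0}, []float64{0, 1}))
	fmt.Printf("%.3f\n", CosineSimilarity([]float64{1, 0}, []float64{-1, 0}))
}
```

Note that pgvector's <=> operator returns cosine distance, which is 1 minus this value, so a smaller distance means a closer match.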
Where embeddings live
The canonical embedding storage is in the Hub’s database, not Core’s. From hivecfm-hub/migrations/004_add_feedback_records_embedding.sql:
CREATE TABLE embeddings (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    feedback_record_id UUID NOT NULL REFERENCES feedback_records(id) ON DELETE CASCADE,
    embedding halfvec(768) NOT NULL,
    model TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE (feedback_record_id, model)
);
CREATE INDEX idx_embeddings ON embeddings USING hnsw (embedding halfvec_cosine_ops);
Worth noting:
- halfvec(768): half-precision floats (2 bytes per dimension instead of 4). Saves about 50% of storage with under 1% impact on recall.
- model column: lets us A/B test multiple embedding models by storing one row per (feedback_record, model) pair.
- HNSW index: Hierarchical Navigable Small World, an approximate-nearest-neighbour index tuned for fast top-K lookups. Built on halfvec_cosine_ops, i.e. cosine distance.
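As a rough check on the storage figure, the sketch below applies the per-value sizes given in the pgvector documentation (4 bytes per dimension plus 8 bytes of overhead for vector, 2 bytes per dimension plus 8 for halfvec). The function names are hypothetical, and row headers and index size are not included.

```go
package main

import "fmt"

// vectorBytes is the documented storage size of a pgvector `vector` value.
func vectorBytes(dims int) int { return 4*dims + 8 }

// halfvecBytes is the documented storage size of a pgvector `halfvec` value.
func halfvecBytes(dims int) int { return 2*dims + 8 }

func main() {
	const dims = 768
	full, half := vectorBytes(dims), halfvecBytes(dims)
	fmt.Printf("vector(%d):  %d bytes\n", dims, full)
	fmt.Printf("halfvec(%d): %d bytes\n", dims, half)
	fmt.Printf("ratio: %.2f\n", float64(half)/float64(full))
}
```

For 768 dimensions this works out to 3080 bytes versus 1544 bytes per embedding, which is where the "about 50%" figure comes from.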
How embeddings get written
Every new feedback record enqueues a FeedbackEmbeddingArgs River job (River is the Go background-job queue the Hub uses; jobs are rows in Postgres, so there is no separate broker to run). The worker in hivecfm-hub/internal/workers/feedback_embedding.go:
- Reads the feedback record by id.
- Calls the embedding client to turn the text into a 768-d vector.
- Writes it to embeddings with (feedback_record_id, model) as the upsert key.
If the embedding call fails, River retries with backoff.
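The write step can be sketched as follows. This is an illustration, not the worker's actual code: upsertEmbeddingSQL and vectorLiteral are hypothetical names, and the sketch assumes the vector is passed as a text literal and cast to halfvec in SQL. Only the table and column names come from the migration shown earlier.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// upsertEmbeddingSQL is a hypothetical statement matching the
// UNIQUE (feedback_record_id, model) constraint: a second write for the
// same record and model overwrites the stored vector instead of failing.
const upsertEmbeddingSQL = `
INSERT INTO embeddings (feedback_record_id, embedding, model)
VALUES ($1, $2::halfvec, $3)
ON CONFLICT (feedback_record_id, model)
DO UPDATE SET embedding = EXCLUDED.embedding, updated_at = NOW()`

// vectorLiteral renders floats in pgvector's text input format, e.g. "[0.25,-1,3.5]".
func vectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, f := range v {
		parts[i] = strconv.FormatFloat(float64(f), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

func main() {
	// A real embedding would have 768 entries; three are enough to show the format.
	fmt.Println(vectorLiteral([]float32{0.25, -1, 3.5}))
	fmt.Println(upsertEmbeddingSQL)
}
```

Because the statement is an upsert, a job that River retries after a partial failure can safely run again without producing duplicate rows.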
How search is served
Semantic search goes through hivecfm-hub/internal/service/search_service.go. The entry point is SearchService.SemanticSearch:
func (s *SearchService) SemanticSearch(
	ctx context.Context, query, tenantID string,
	limit int, minScore float64, cursor string,
	filters *models.SearchFilters,
) (SearchResult, error)
The flow:
- Embed the query. Call the embedding client on the user’s query string. Cache by (query, model) in an LRU so repeat searches skip the embedding call while the entry is cached.
- Run nearest-neighbour. SELECT ... WHERE tenant_id = $tenant ORDER BY embedding <=> $query_vec LIMIT $k. Postgres uses the HNSW index; the scan is roughly O(log N) in practice.
- Optional sentiment re-rank. If the service is constructed with a SentimentClassifier, queries classified as positive/negative get a score boost or penalty on results with matching sentiment (capped at ±35%).
- Cursor pagination. The last-distance + last-id pair is encoded as an opaque cursor so the next page can resume in the same index scan.
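To make the cursor step concrete, here is a minimal sketch of an opaque cursor carrying the last distance and last id. The wire format (JSON wrapped in URL-safe base64) and the names pageCursor, encodeCursor, and decodeCursor are assumptions for illustration; the Hub's real encoding is not shown in this page.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// pageCursor holds the sort key of the last row on the previous page.
type pageCursor struct {
	Distance float64 `json:"d"`  // cosine distance of the last result
	ID       string  `json:"id"` // tie-breaker when distances are equal
}

// encodeCursor turns the pair into a token clients treat as opaque.
func encodeCursor(c pageCursor) (string, error) {
	raw, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// decodeCursor reverses encodeCursor and rejects malformed tokens.
func decodeCursor(s string) (pageCursor, error) {
	var c pageCursor
	raw, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return c, fmt.Errorf("invalid cursor: %w", err)
	}
	if err := json.Unmarshal(raw, &c); err != nil {
		return c, fmt.Errorf("invalid cursor: %w", err)
	}
	return c, nil
}

func main() {
	token, err := encodeCursor(pageCursor{Distance: 0.125, ID: "fb-42"})
	if err != nil {
		panic(err)
	}
	next, err := decodeCursor(token)
	if err != nil {
		panic(err)
	}
	fmt.Println(token)
	fmt.Printf("%+v\n", next)
}
```

With a cursor like this, the next page would typically add a keyset predicate such as (distance, id) > ($last_distance, $last_id) to the query rather than using OFFSET.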
Singleflight ensures that if 100 requests hit the same cold cache at once, only one embedding round-trip happens — the other 99 wait on the same future. This matters for “heavy load on a popular query” moments.
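The behaviour described above can be demonstrated with a small stdlib-only sketch. This page does not say which singleflight implementation the Hub uses (golang.org/x/sync/singleflight is a common choice), so embedGroup, Do, and waiting below are hypothetical stand-ins written for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// call tracks one in-flight embedding request and its eventual result.
type call struct {
	wg      sync.WaitGroup
	val     []float32
	err     error
	waiters int // callers blocked on this call, excluding the leader
}

// embedGroup deduplicates concurrent calls that share a key.
type embedGroup struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs fn once per key at a time; concurrent callers share the result.
func (g *embedGroup) Do(key string, fn func() ([]float32, error)) ([]float32, error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		c.waiters++
		g.mu.Unlock()
		c.wg.Wait() // follower: wait for the leader's result
		return c.val, c.err
	}
	c := &call{}
	c.wg.Add(1)
	g.calls[key] = c
	g.mu.Unlock()

	c.val, c.err = fn() // leader: the only caller that hits the embedding API
	c.wg.Done()

	g.mu.Lock()
	delete(g.calls, key) // completed calls are not cached here
	g.mu.Unlock()
	return c.val, c.err
}

// waiting reports how many followers are blocked on key's in-flight call.
func (g *embedGroup) waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.calls[key]; ok {
		return c.waiters
	}
	return 0
}

func main() {
	var g embedGroup
	var upstream int32
	release := make(chan struct{})
	embed := func() ([]float32, error) {
		atomic.AddInt32(&upstream, 1)
		<-release // simulate a slow embedding round-trip
		return []float32{0.1, 0.2}, nil
	}

	const key = "the login button is confusing"
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do(key, embed)
		}()
	}
	for g.waiting(key) < 99 { // wait until all followers have joined
		runtime.Gosched()
	}
	close(release)
	wg.Wait()
	fmt.Println("upstream calls:", atomic.LoadInt32(&upstream)) // prints "upstream calls: 1"
}
```

Singleflight only collapses calls that overlap in time; it is the LRU cache that serves later repeats of the same query.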
Tuning knobs
- minScore: consumer-side cosine threshold. Below this, results are dropped before pagination.
- hnsw.ef_search: a Postgres session GUC that controls the HNSW traversal budget. Higher = better recall, slower query. Set at connection time when needed.
- limit: page size. The service also enforces an internal max.
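The minScore and limit knobs can be sketched like this. The types hit and scored, the helpers filterByMinScore and clampLimit, and the values of defaultLimit and maxLimit are all hypothetical; the sketch assumes the score is cosine similarity, computed as 1 minus the distance returned by <=>.

```go
package main

import "fmt"

// hit is a raw row from the nearest-neighbour query.
type hit struct {
	ID       string
	Distance float64 // cosine distance as returned by <=>
}

// scored is a result after converting distance to similarity.
type scored struct {
	ID    string
	Score float64
}

// Hypothetical bounds; the real internal maximum is not documented here.
const (
	defaultLimit = 20
	maxLimit     = 100
)

// clampLimit applies a default for non-positive values and caps large ones.
func clampLimit(limit int) int {
	if limit <= 0 {
		return defaultLimit
	}
	if limit > maxLimit {
		return maxLimit
	}
	return limit
}

// filterByMinScore drops results whose similarity falls below minScore,
// preserving the nearest-first order of the input.
func filterByMinScore(hits []hit, minScore float64) []scored {
	out := make([]scored, 0, len(hits))
	for _, h := range hits {
		score := 1 - h.Distance
		if score < minScore {
			continue
		}
		out = append(out, scored{ID: h.ID, Score: score})
	}
	return out
}

func main() {
	hits := []hit{{"a", 0.1}, {"b", 0.4}, {"c", 0.75}}
	for _, r := range filterByMinScore(hits, 0.5) {
		fmt.Printf("%s %.2f\n", r.ID, r.Score)
	}
	fmt.Println(clampLimit(0), clampLimit(30), clampLimit(500))
}
```

The hnsw.ef_search knob, by contrast, is set in SQL on the session (for example with a SET statement) before the query runs.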
Extending the search surface
Search is exposed at the Hub’s HTTP API, not Prisma. If you need it from the web app, call the Hub via the internal Hub client; do not add a new SQL query on the web side. This preserves the “Hub owns embeddings” boundary and keeps the LRU cache effective.
Read next
- Hub / Search: the Hub side of the same story, from the service-layout perspective.
- Hub / Workers: where the FeedbackEmbeddingWorker lives.