How to Build a Search System for Your Web App (Full-Text, Filters, and Facets)

Search is one of those features that seems simple until you actually build it. A text input, a query, some results. But the gap between a basic LIKE '%term%' query and a search system that actually helps users find things is enormous. We have built search into several of our products, and the approach varies wildly depending on scale, data shape, and user expectations.

This post covers how we think about search — starting with PostgreSQL full-text search (which handles more than most people realize), adding faceted filtering, implementing autocomplete, and knowing when to reach for a dedicated search engine like Meilisearch or Algolia.

Start with PostgreSQL Full-Text Search

If your data lives in PostgreSQL and you are not dealing with millions of documents, you probably do not need a separate search service. PostgreSQL full-text search is remarkably capable, and it ships with your database. No extra infrastructure, no sync jobs, no consistency headaches.

The core concepts are tsvector (a processed document) and tsquery (a processed search query). PostgreSQL tokenizes text, removes stop words, and reduces words to their stems (lexemes). This means searching for “running” also matches “run” and “runs.”

Here is a basic setup for a locations table, similar to what we built for Spots Mexico:

-- Add a generated tsvector column
ALTER TABLE locations ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(description, '')), 'B') ||
    setweight(to_tsvector('english', coalesce(city, '')), 'C') ||
    setweight(to_tsvector('english', coalesce(tags_text, '')), 'D')
  ) STORED;

-- Create a GIN index for fast lookups
CREATE INDEX idx_locations_search ON locations USING GIN (search_vector);

The setweight function is critical. It lets you assign importance levels (A through D) to different fields. A match in the location name ranks higher than a match in the description, which ranks higher than a match in the city name. This is how you get relevant results without building a custom ranking algorithm.

Querying is straightforward:

SELECT
  name,
  city,
  ts_rank(search_vector, query) AS rank
FROM
  locations,
  plainto_tsquery('english', 'beach sunset photography') AS query
WHERE
  search_vector @@ query
ORDER BY
  rank DESC
LIMIT 20;

plainto_tsquery converts a plain string into a tsquery, joining terms with AND. If you want OR behavior or phrase matching, use to_tsquery or phraseto_tsquery instead.

For Spots Mexico, this approach handled thousands of photography locations with sub-50ms query times. No external service needed.

Handling Faceted Filters

Search without filters is just a text box. Real search systems let users narrow results by category, price range, location, date, or any other attribute. These are faceted filters, and they require a different strategy than full-text search.

The key challenge with facets is computing counts. Users expect to see “Beaches (47)” next to each filter option, and those counts need to update as other filters are applied. This means your query has to return both the filtered results and the facet counts in a single round trip (or at least appear to).

Here is how we structure faceted queries in PostgreSQL:

WITH filtered AS (
  SELECT *
  FROM locations
  WHERE
    search_vector @@ plainto_tsquery('english', :search_term)
    AND (:city IS NULL OR city = :city)
    AND (:min_rating IS NULL OR rating >= :min_rating)
)
SELECT
  'results' AS type,
  jsonb_build_object(
    'id', id,
    'name', name,
    'city', city,
    'rating', rating
  ) AS data
FROM filtered
ORDER BY ts_rank(search_vector, plainto_tsquery('english', :search_term)) DESC
LIMIT 20

UNION ALL

SELECT
  'facet_city' AS type,
  jsonb_build_object('value', city, 'count', count(*)) AS data
FROM filtered
GROUP BY city

UNION ALL

SELECT
  'facet_rating' AS type,
  jsonb_build_object('value', floor(rating), 'count', count(*)) AS data
FROM filtered
GROUP BY floor(rating);

The CTE (WITH filtered) applies the text search and any active filters once, and then multiple UNION ALL branches compute facet counts from the same filtered set. This avoids running the base query multiple times.

On the TypeScript side, we parse these results into a structured response:

type FacetValue = { value: string; count: number };

type SearchResponse<T> = {
  results: T[];
  facets: Record<string, FacetValue[]>;
};

function parseSearchResults<T>(rows: { type: string; data: T | FacetValue }[]): SearchResponse<T> {
  const results: T[] = [];
  const facets: Record<string, FacetValue[]> = {};

  for (const row of rows) {
    if (row.type === "results") {
      results.push(row.data as T);
    } else if (row.type.startsWith("facet_")) {
      const facetName = row.type.replace("facet_", "");
      if (!facets[facetName]) facets[facetName] = [];
      facets[facetName].push(row.data as FacetValue);
    }
  }

  return { results, facets };
}

Building Autocomplete That Does Not Lag

Autocomplete is the feature users notice most. It has to be fast (under 100ms perceived latency) and forgiving of typos. PostgreSQL can handle this, but you need a different approach than full-text search.

For prefix matching (the most common autocomplete pattern), use trigram similarity with the pg_trgm extension:

-- Enable the extension
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Create a trigram index
CREATE INDEX idx_locations_name_trgm ON locations USING GIN (name gin_trgm_ops);

-- Query with similarity scoring
SELECT
  name,
  similarity(name, :input) AS sim
FROM locations
WHERE name % :input  -- % is the similarity operator
ORDER BY sim DESC
LIMIT 10;

Trigram matching is typo-tolerant. Searching for “cancn” still matches “Cancun” because they share enough character trigrams. You can control the similarity threshold with SET pg_trgm.similarity_threshold = 0.3; (lower values are more permissive).

For autocomplete specifically, we often combine prefix matching with trigram similarity:

SELECT name, city
FROM locations
WHERE
  name ILIKE :input || '%'
  OR name % :input
ORDER BY
  CASE WHEN name ILIKE :input || '%' THEN 0 ELSE 1 END,
  similarity(name, :input) DESC
LIMIT 10;

Prefix matches come first (the user typed the beginning of the word correctly), and fuzzy matches fill in the rest.

On the frontend, debounce the input to 200-300ms and cancel in-flight requests when the user keeps typing:

let controller: AbortController | null = null;

async function handleSearchInput(input: string) {
  controller?.abort();
  controller = new AbortController();

  if (input.length < 2) return;

  try {
    const res = await fetch(`/api/search/autocomplete?q=${encodeURIComponent(input)}`, {
      signal: controller.signal,
    });
    const suggestions = await res.json();
    renderSuggestions(suggestions);
  } catch (err) {
    if (err instanceof DOMException && err.name === "AbortError") return;
    throw err;
  }
}

Relevance Ranking Beyond the Basics

PostgreSQL’s ts_rank function is decent but limited. For better relevance, you often need to combine text rank with other signals. We typically build a composite score:

SELECT
  id,
  name,
  (
    ts_rank(search_vector, query) * 0.5
    + (rating / 5.0) * 0.2
    + (CASE WHEN featured THEN 0.2 ELSE 0 END)
    + (1.0 / (1 + EXTRACT(EPOCH FROM now() - created_at) / 86400)) * 0.1
  ) AS composite_score
FROM
  locations,
  plainto_tsquery('english', :search_term) AS query
WHERE
  search_vector @@ query
ORDER BY
  composite_score DESC
LIMIT 20;

This blends text relevance (50%), user rating (20%), featured status (20%), and recency (10%). The weights are tunable and should be adjusted based on what your users actually find helpful.

For Spots Mexico, we weighted location popularity and photo quality alongside text relevance. A location with stunning photos and high engagement should rank above a perfectly keyword-matched but rarely visited spot.

When to Reach for Meilisearch or Algolia

PostgreSQL full-text search breaks down in a few scenarios:

You need sub-10ms response times at scale. Once you hit hundreds of thousands of documents with complex queries, PostgreSQL starts to struggle. Dedicated search engines use inverted indices optimized for this exact workload.

You need advanced typo tolerance and language processing. Meilisearch and Algolia handle typos, synonyms, stop words, and language-specific stemming far better out of the box.

You need geo-search combined with text search. While PostgreSQL has PostGIS, combining spatial queries with full-text search in a single performant query is tricky. Meilisearch handles this natively.

Your search index differs significantly from your primary data model. When you need to denormalize data for search (combining data from multiple tables into a single searchable document), maintaining a separate search index makes the architecture cleaner.

We recommend Meilisearch for most projects. It is open source, self-hostable, and has an excellent developer experience. The API is clean, filtering and faceting work out of the box, and it handles typos beautifully. Deployment is a single binary.

import { MeiliSearch } from "meilisearch";

const client = new MeiliSearch({ host: "http://localhost:7700", apiKey: "your-key" });

// Index documents
await client.index("locations").addDocuments(locations);

// Configure searchable attributes and facets
await client.index("locations").updateSettings({
  searchableAttributes: ["name", "description", "city"],
  filterableAttributes: ["city", "category", "rating"],
  sortableAttributes: ["rating", "created_at"],
});

// Search with filters and facets
const results = await client.index("locations").search("beach photography", {
  filter: ["city = 'Tulum'", "rating >= 4"],
  facets: ["city", "category"],
  limit: 20,
});

Algolia is the right choice if you need a managed service with global CDN distribution and zero ops overhead. It is significantly more expensive, but the search quality and infrastructure reliability are excellent. For enterprise clients or products where search is the core feature, the cost is justified.

The Sync Problem

If you use an external search engine, you need to keep it in sync with your database. This is the hardest part of the architecture, and there are three common approaches:

Dual writes — write to both the database and search index in your application code. Simple but fragile. If one write fails, they are out of sync.

Change Data Capture (CDC) — use database triggers or a tool like Debezium to capture changes and push them to the search index. Reliable but complex to set up.

Periodic re-indexing — rebuild the search index on a schedule (every 5 minutes, every hour). Simple and reliable, but introduces lag. For many applications, a few minutes of delay is perfectly acceptable.

We usually start with periodic re-indexing and only move to CDC when the delay becomes a user-facing problem. For Trackelio, where feedback search needs to reflect new submissions quickly, we use a hybrid: periodic full re-indexes plus immediate pushes for new documents.

Search UX Matters More Than Search Tech

The best search engine in the world is useless if the search interface is confusing. A few principles we follow:

Show results as the user types. Do not wait for them to press Enter. Autocomplete and instant results set expectations for modern search.

Preserve filter state in the URL. Users expect to share search results or use the back button. Encode filters as query parameters.

Show empty states with suggestions. “No results for X” is not helpful. Suggest alternative searches, show popular items, or prompt the user to broaden their filters.

Highlight matched terms in results. This gives users confidence that the search understood their intent.

These are not technical problems — they are design decisions that make the difference between search that works and search that people actually use.

Wrapping Up

Start with PostgreSQL full-text search. It handles more than you think, the operational complexity is zero (it is your database), and upgrading to a dedicated search engine later is a well-understood migration path. Add faceted filters with CTEs, build autocomplete with pg_trgm, and only reach for Meilisearch or Algolia when you hit the limits of what PostgreSQL can do.

If you are building a product that needs search and want to get the architecture right from the start, reach out at hello@threshline.com.