Setting Up AI Search

This guide walks you through enabling semantic search for your project. By the end, you’ll have a working search endpoint that understands natural language.

Prerequisites

A Foir project with content (records in one or more models)
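An API key with the appropriate scopes (Step 4 lists them all):
- search:read for cross-model semantic search
- search:read:<model> for per-model search
- search:semantic:write to generate or write embeddings
At least one model with embeddings generated for its records (see Steps 1 and 2)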

What You’ll Build

A search experience that:

Finds content by meaning, not just keywords
Works across all your models
Returns results ranked by semantic similarity

Step 1: How a Model Becomes Searchable

There is no separate switch to flip — a model becomes searchable as soon as its records have embeddings. You create those embeddings by generating them for a record (generateEmbedding, covered in Step 2) or by pushing a precomputed vector with writeEmbeddings. Each model’s embeddings are queried through its typed search<Model>s field, plus the generic searchRecords.

To see coverage in the Foir admin dashboard:

Navigate to Settings > Models
Select the model you care about
Open the Embeddings tab

The Embeddings tab is a read-only overview: embedded records, records-with-embeddings vs. total, coverage percentage, and how many records are still pending. It only appears once the model has at least one embedding — until then there is nothing to show.

Tip: Coverage and pending counts are the fastest way to confirm a model is ready before you wire search into a live experience.

Step 2: Generate Embeddings

Embeddings are not produced automatically when you save a record — you trigger generation explicitly. Use the generateEmbedding mutation (or the foir embeddings CLI) to embed a record, then use the coverage probes (embeddingCoverage / recordsMissingEmbedding, shown under Auditing Coverage) to find any records still missing an embedding.

Generate for a Single Record

generateEmbedding enqueues an embedding job and returns a Boolean. It requires the search:semantic:write scope.


mutation {
  generateEmbedding(
    recordId: "rec_abc123"
    modelKey: "product"
  )
}

Verify Embeddings Exist

Records carry per-record embedding signals you can read directly (these need search:semantic:read):


query {
  product(id: "rec_abc123") {
    _id
    _hasEmbedding(key: "default")
    _embeddingContentHash(key: "default")
  }
}

Note: Embedding generation happens asynchronously. After you trigger generation, there may be a brief delay before the embedding is available for search.
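Because generation is asynchronous, a caller that needs the embedding right away can poll the _hasEmbedding signal. The following is a minimal sketch of that idea, not part of the Foir API: waitForEmbedding, its options, and the injected check callback are illustrative names. The check callback is expected to run the verification query above and resolve to the value of _hasEmbedding.

```typescript
// Hypothetical helper: polls until an embedding is reported or attempts run out.
// `check` is supplied by the caller and should resolve to the record's
// _hasEmbedding value (for example by running the query shown above).
type PollOptions = {
  attempts?: number;   // maximum number of checks (assumed default: 10)
  intervalMs?: number; // delay between checks (assumed default: 1000 ms)
};

async function waitForEmbedding(
  check: () => Promise<boolean>,
  { attempts = 10, intervalMs = 1000 }: PollOptions = {},
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if (await check()) return true;
    // Wait before the next check, but not after the final one.
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  return false;
}
```

A false result only means the embedding was not visible within the polling window; the job may still complete later.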

Step 3: Search Your Content

Per-Model Search

Each model that has embeddings gets a typed search<Model>s query. It returns { score, record } hits where record is the model’s normal type, so you can select any of its fields. Requires the search:read:<model> scope.


query {
  searchProducts(query: "comfortable shoes for long walks", first: 5) {
    score
    record {
      _id
      title
      price
    }
  }
}

Cross-Model Search

To search across model types, use the generic searchRecords query (scope search:read). Pass an optional modelKey to narrow it.


query {
  searchRecords(query: "return and refund information", first: 3) {
    recordId
    modelKey
    naturalKey
    score
  }
}
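Cross-model hits arrive as one mixed list, so a UI that shows results per content type needs to bucket them first. The sketch below groups hits by modelKey; the RecordHit type and groupByModel helper are illustrative, and the field types (including a nullable naturalKey) are assumptions based on the selection above.

```typescript
// Shape of one searchRecords hit, limited to the fields selected above.
// Field types are assumed; check the API reference for the real schema.
type RecordHit = {
  recordId: string;
  modelKey: string;
  naturalKey: string | null;
  score: number;
};

// Hypothetical helper: buckets hits by modelKey, preserving the order in
// which the API returned them within each bucket.
function groupByModel(hits: RecordHit[]): Map<string, RecordHit[]> {
  const groups = new Map<string, RecordHit[]>();
  for (const hit of hits) {
    const bucket = groups.get(hit.modelKey) ?? [];
    bucket.push(hit);
    groups.set(hit.modelKey, bucket);
  }
  return groups;
}
```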

Using the Results

Here’s a complete example in TypeScript:


const SEARCH_QUERY = `
  query SearchProducts($query: String!, $first: Int) {
    searchProducts(query: $query, first: $first) {
      score
      record {
        _id
        title
      }
    }
  }
`;
 
async function searchProducts(userQuery: string) {
  const response = await fetch('https://api.foir.dev/graphql', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-api-key': process.env.FOIR_API_KEY!,
    },
    body: JSON.stringify({
      query: SEARCH_QUERY,
      variables: { query: userQuery, first: 10 },
    }),
  });
 
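  if (!response.ok) {
    throw new Error(`Search request failed with HTTP ${response.status}`);
  }

  // GraphQL can report failures in `errors` even on an HTTP 200 response.
  const { data, errors } = await response.json();
  if (errors?.length) {
    throw new Error(`Search failed: ${errors[0].message}`);
  }
  return data.searchProducts;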
}
 
// Usage
const results = await searchProducts("waterproof hiking boots");
// Returns products matching the intent, even if they don't
// contain the exact words "waterproof hiking boots"

Step 4: Set Up Your API Key

Make sure your API key has the right scopes for the features you need:

Feature	Required Scope
Per-model search (`search<Model>s`)	`search:read:<model>`
Cross-model search (`searchRecords`)	`search:read`
Coverage / digest queries, `_hasEmbedding`	`search:semantic:read`
Generate / write embeddings	`search:semantic:write`

To update scopes:

Go to Settings > API Keys in the admin dashboard
Edit your API key
Add the required scopes
Save

Tips and Best Practices

Choosing Fields to Embed

Do include: title, description, body content, tags, category names
Don’t include: IDs, timestamps, internal slugs, boolean flags
More text generally produces better search results, but overly long content dilutes relevance
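If you compute vectors yourself and push them with writeEmbeddings, these guidelines translate into a small text-assembly step. The sketch below is one possible approach under that assumption; buildEmbeddingText, the field names, and the 2000-character cap are all illustrative choices, not Foir behavior.

```typescript
// Hypothetical helper for callers who compute their own vectors.
// Joins the chosen text fields and caps the length so that very long
// bodies do not drown out the title and description.
type TextSource = Record<string, unknown>;

function buildEmbeddingText(
  record: TextSource,
  fields: string[],
  maxChars = 2000, // assumed cap; tune to your content and embedding model
): string {
  const parts: string[] = [];
  for (const field of fields) {
    const value = record[field];
    if (typeof value === "string" && value.trim() !== "") {
      parts.push(value.trim());
    } else if (Array.isArray(value)) {
      // Tag-like arrays: keep only string entries.
      const tags = value.filter((v): v is string => typeof v === "string");
      if (tags.length > 0) parts.push(tags.join(", "));
    }
    // Numbers, booleans, and other non-text values are skipped.
  }
  return parts.join("\n").slice(0, maxChars);
}
```

Listing fields explicitly keeps IDs, timestamps, and flags out of the text even if they exist on the record.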

Interpreting Scores

Each hit carries a score. Use it to filter low-quality matches on the client side:

Score Range	Quality
0.9 – 1.0	Near-exact semantic match
0.7 – 0.9	Strong relevance
0.5 – 0.7	Moderate relevance
Below 0.5	Weak match — usually noise

Set the first argument to the number of results you’ll actually show, and drop hits below a score cutoff that you tune to your content (around 0.6 is a reasonable starting point).
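The cap and cutoff described above can be applied in one small client-side step. This is a sketch rather than a prescribed approach: filterHits and ScoredHit are illustrative names, and the 0.6 default is only the starting point suggested here.

```typescript
// Generic hit shape matching the { score, record } results of search<Model>s.
type ScoredHit<T> = { score: number; record: T };

// Hypothetical helper: drops hits below minScore, orders the rest by
// descending score, and keeps at most `limit` of them.
function filterHits<T>(
  hits: ScoredHit<T>[],
  limit: number,
  minScore = 0.6, // starting cutoff; tune against your own content
): ScoredHit<T>[] {
  return hits
    .filter((hit) => hit.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```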

Narrowing the Search Space

When you know which model the user is looking for, query that model’s typed search<Model>s field directly, as in the example below; it searches only that model and returns its full typed record. For broader queries, searchRecords accepts an optional modelKey to restrict the generic search.


query {
  searchProducts(query: "eco-friendly packaging", first: 5) {
    score
    record {
      _id
      title
    }
  }
}
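The generic path can be narrowed in a similar way. The sketch below builds a searchRecords request body with an optional modelKey. The variable types in the operation (String!, Int, String) and the helper name buildSearchRecordsBody are assumptions, so check them against the API reference before relying on them.

```typescript
// Assumption: modelKey is passed as a nullable String variable. The selected
// fields are the ones used in the cross-model example earlier in this guide.
const SEARCH_RECORDS = `
  query SearchRecords($query: String!, $first: Int, $modelKey: String) {
    searchRecords(query: $query, first: $first, modelKey: $modelKey) {
      recordId
      modelKey
      naturalKey
      score
    }
  }
`;

type SearchRecordsVariables = { query: string; first: number; modelKey?: string };

// Hypothetical helper: builds the JSON body for a searchRecords request,
// including modelKey only when the caller wants to narrow the search.
function buildSearchRecordsBody(query: string, first: number, modelKey?: string): string {
  const variables: SearchRecordsVariables = { query, first };
  if (modelKey !== undefined) variables.modelKey = modelKey;
  return JSON.stringify({ query: SEARCH_RECORDS, variables });
}
```

The returned string can be sent as the body of the same POST request used in the TypeScript example in Step 3.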

Keeping Embeddings Fresh

Re-run generateEmbedding (or foir embeddings) after a record’s content changes to refresh its embedding — saving or publishing a record does not re-embed it on its own
If a record’s content hasn’t changed, re-embedding is skipped (deduplication is automatic)
Use the coverage queries below to confirm every record is embedded before going live with search

Auditing Coverage and Running a Sweep

If you generate embeddings from your own service (for example a background worker), you can audit coverage and repair gaps without re-embedding your whole corpus. All three queries are scoped to your own records.

Get a per-model summary of how many records are embedded versus pending:


query {
  embeddingCoverage(modelKey: "note") {
    modelKey
    totalRecords
    embeddedRecords
    pendingRecords
    lastEmbeddedAt
  }
}
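The summary can be turned into a simple readiness check before going live. The following is a sketch under the assumption that the count fields are plain numbers; coveragePercent and isReadyForSearch are illustrative helpers, not part of the API.

```typescript
// Mirrors the count fields selected in the embeddingCoverage query above.
// Numeric field types are an assumption.
type CoverageSummary = {
  modelKey: string;
  totalRecords: number;
  embeddedRecords: number;
  pendingRecords: number;
};

// Hypothetical helper: coverage as a percentage, rounded to one decimal.
// An empty model is reported as 0 rather than dividing by zero.
function coveragePercent(summary: CoverageSummary): number {
  if (summary.totalRecords === 0) return 0;
  return Math.round((summary.embeddedRecords / summary.totalRecords) * 1000) / 10;
}

// Treats a model as ready when it has records, nothing is pending,
// and every record is embedded.
function isReadyForSearch(summary: CoverageSummary): boolean {
  return (
    summary.totalRecords > 0 &&
    summary.pendingRecords === 0 &&
    summary.embeddedRecords === summary.totalRecords
  );
}
```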

List exactly which records are missing an embedding for a key, paginated, so you can embed only the gaps:


query {
  recordsMissingEmbedding(modelKey: "note", key: "default", first: 100) {
    edges {
      node {
        recordId
        naturalKey
      }
    }
    pageInfo {
      hasNextPage
      endCursor
    }
    totalCount
  }
}
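Since the result is paginated, a worker typically walks every page before embedding. The sketch below shows one way to do that with an injected page fetcher. Passing the cursor as an after argument is an assumption drawn from the Relay-style pageInfo shape; it is not shown in the query above, and collectMissing and its types are illustrative.

```typescript
type MissingRecord = { recordId: string; naturalKey: string | null };
type MissingPage = {
  edges: { node: MissingRecord }[];
  pageInfo: { hasNextPage: boolean; endCursor: string | null };
};

// fetchPage is supplied by the caller and should run recordsMissingEmbedding
// for one page. How the cursor is passed (assumed here: an `after` argument)
// must be confirmed against the API reference.
async function collectMissing(
  fetchPage: (after: string | null) => Promise<MissingPage>,
  maxPages = 100, // safety cap so a bad cursor cannot loop forever
): Promise<MissingRecord[]> {
  const missing: MissingRecord[] = [];
  let cursor: string | null = null;
  for (let page = 0; page < maxPages; page++) {
    const result: MissingPage = await fetchPage(cursor);
    for (const edge of result.edges) missing.push(edge.node);
    // Stop when the server reports no further pages or gives no cursor.
    if (!result.pageInfo.hasNextPage || result.pageInfo.endCursor === null) break;
    cursor = result.pageInfo.endCursor;
  }
  return missing;
}
```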

Before re-embedding, read back the content hash you stored alongside each embedding. If it matches the current content’s hash, skip it:


query {
  embeddingDigests(
    modelKey: "note"
    key: "default"
    recordIds: ["rec_abc123", "rec_def456"]
  ) {
    recordId
    contentHash
  }
}

When you write an embedding, pass the contentHash you computed so a later sweep can read it back:


mutation {
  writeEmbeddings(input: {
    entries: [{
      recordId: "rec_abc123"
      key: "default"
      embedding: [0.0123, -0.0456, 0.0789]
      contentHash: "sha256:9f86d0…"
    }]
  })
}
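How you compute the contentHash is up to you, as long as the same content always yields the same value. The sketch below is one convention, assuming Node.js and a "sha256:" prefix followed by a hex digest to match the style of the example above; computeContentHash and buildEntry are illustrative helpers.

```typescript
import { createHash } from "node:crypto";

// Assumed format: "sha256:" followed by the hex digest of the UTF-8 text.
// This is one convention, not a format required by the source.
function computeContentHash(content: string): string {
  return "sha256:" + createHash("sha256").update(content, "utf8").digest("hex");
}

// Matches the entry fields used in the writeEmbeddings mutation above.
type EmbeddingEntry = {
  recordId: string;
  key: string;
  embedding: number[];
  contentHash: string;
};

// Hypothetical helper: pairs a precomputed vector with the hash of the
// text it was computed from, ready to send as one writeEmbeddings entry.
function buildEntry(
  recordId: string,
  content: string,
  embedding: number[],
  key = "default",
): EmbeddingEntry {
  return { recordId, key, embedding, contentHash: computeContentHash(content) };
}
```

Hash the same text you fed to the embedding model, so that a matching hash really does mean the stored vector is still current.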

Records carry the same signals as fields, handy for quick per-record checks:


query {
  note(id: "rec_abc123") {
    _id
    _hasEmbedding(key: "default")
    _embeddingContentHash(key: "default")
  }
}

A typical sweep: call recordsMissingEmbedding to find gaps, embeddingDigests to skip records whose contentHash is unchanged, embed the rest, and write them back with writeEmbeddings (including the new contentHash).
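The sweep described above can be sketched as a single function with its I/O injected. Everything here is illustrative: runSweep and the five dependency functions are hypothetical wrappers that you would implement around the queries and mutation in this section and around your own embedding model.

```typescript
type SweepEntry = { recordId: string; key: string; embedding: number[]; contentHash: string };

// Each function is supplied by the caller and wraps one operation.
type SweepDeps = {
  listMissing: () => Promise<string[]>;                         // recordsMissingEmbedding, as record IDs
  loadDigests: (ids: string[]) => Promise<Map<string, string>>; // embeddingDigests, recordId to contentHash
  currentHash: (id: string) => Promise<string>;                 // hash of the record's current content
  embed: (id: string) => Promise<number[]>;                     // your embedding model
  write: (entries: SweepEntry[]) => Promise<void>;              // writeEmbeddings
};

async function runSweep(
  deps: SweepDeps,
  key = "default",
): Promise<{ written: string[]; skipped: string[] }> {
  const ids = await deps.listMissing();
  const digests = await deps.loadDigests(ids);
  const entries: SweepEntry[] = [];
  const skipped: string[] = [];
  for (const id of ids) {
    const hash = await deps.currentHash(id);
    // Skip records whose stored hash already matches the current content.
    if (digests.get(id) === hash) {
      skipped.push(id);
      continue;
    }
    entries.push({ recordId: id, key, embedding: await deps.embed(id), contentHash: hash });
  }
  // Avoid an empty mutation when there is nothing to write.
  if (entries.length > 0) await deps.write(entries);
  return { written: entries.map((e) => e.recordId), skipped };
}
```

For large models you would likely batch the write call and add retry handling, both of which are left out to keep the sketch short.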

Next Steps

Search & Embeddings API — Full API reference
Semantic Search — Understand the concepts
Model Capabilities — Configure models for search