Setting Up AI Search
This guide walks you through enabling semantic search for your project. By the end, you’ll have a working search endpoint that understands natural language.
Prerequisites
- A Foir project with content (records in one or more models)
- An API key with the appropriate scopes:
  - search:read for semantic search
- Search features enabled for your project (check with the isAIEnabled query)
What You’ll Build
A search experience that:
- Finds content by meaning, not just keywords
- Works across all your models
- Returns results ranked by semantic similarity
Step 1: Enable Embeddings on Your Models
Embeddings need to be enabled per model. In the Foir admin dashboard:
- Navigate to Settings > Models
- Select the model you want to make searchable
- Open the AI & Search settings tab
- Toggle Enable Embeddings
- Select which fields to include in the embedding (e.g., title, description, content)
- Save the model
Tip: Include the fields that best describe the record’s content. A product might use title + description, while a blog post might use title + content.
Step 2: Generate Embeddings
Once embeddings are enabled, new and updated records will be embedded automatically. For existing content, you can trigger embedding generation:
Generate for a Single Record
mutation {
generateEmbedding(
recordId: "rec_abc123"
modelKey: "product"
source: RECORD
) {
success
tokenCount
dimensions
}
}
Verify Embeddings Exist
query {
embeddingStatus(
recordIds: ["rec_abc123", "rec_def456"]
source: RECORD
) {
recordId
hasEmbedding
lastUpdated
}
}
Note: Embedding generation happens asynchronously. After creating or updating content, there may be a brief delay before the embedding is available for search.
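Because generation is asynchronous, code that searches right after a write may want to wait until the embedding exists. The following is a minimal sketch of one way to do that, not a Foir-provided helper. It assumes a hypothetical `fetchStatus` callback that runs the embeddingStatus query above and returns its rows. The name `waitForEmbeddings` and the retry count and delay defaults are illustrative choices.

```typescript
// Shape of one row returned by the embeddingStatus query shown above.
interface EmbeddingStatus {
  recordId: string;
  hasEmbedding: boolean;
}

// Hypothetical callback: runs the embeddingStatus query for the given IDs.
type StatusFetcher = (recordIds: string[]) => Promise<EmbeddingStatus[]>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until every record reports hasEmbedding, or the attempts run out.
// Returns the IDs still missing an embedding (empty when all are ready).
async function waitForEmbeddings(
  recordIds: string[],
  fetchStatus: StatusFetcher,
  maxAttempts = 5, // illustrative default, not a Foir recommendation
  delayMs = 1000, // illustrative default
): Promise<string[]> {
  let pending = [...recordIds];
  if (pending.length === 0) return pending;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const statuses = await fetchStatus(pending);
    const ready = new Set(
      statuses.filter((s) => s.hasEmbedding).map((s) => s.recordId),
    );
    // Only keep asking about records that are not embedded yet.
    pending = pending.filter((id) => !ready.has(id));
    if (pending.length === 0) break;
    if (attempt < maxAttempts) await sleep(delayMs);
  }
  return pending;
}
```

If the returned array is non-empty, the caller can retry later or trigger generateEmbedding for those records.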
Step 3: Search Your Content
Basic Semantic Search
query {
semanticSearch(input: {
query: "comfortable shoes for long walks"
limit: 5
}) {
recordId
modelKey
naturalKey
similarity
highlights
}
}
Filtered Search
Narrow results to specific models:
query {
semanticSearch(input: {
query: "return and refund information"
modelKeys: ["faq", "policy"]
limit: 3
threshold: 0.7
}) {
recordId
naturalKey
similarity
highlights
}
}
Using the Results
Here’s a complete example in TypeScript:
const SEARCH_QUERY = `
query SemanticSearch($query: String!, $limit: Int) {
semanticSearch(input: {
query: $query
modelKeys: ["product"]
limit: $limit
threshold: 0.6
}) {
recordId
naturalKey
similarity
highlights
}
}
`;
async function searchProducts(userQuery: string) {
const response = await fetch('https://api.foir.io/graphql', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-api-key': process.env.FOIR_API_KEY!,
},
body: JSON.stringify({
query: SEARCH_QUERY,
variables: { query: userQuery, limit: 10 },
}),
});
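if (!response.ok) {
throw new Error(`Search request failed with HTTP ${response.status}`);
}
const { data, errors } = await response.json();
// GraphQL errors can arrive with a 200 status, so check the payload too
if (errors?.length) {
throw new Error(errors[0].message);
}
return data.semanticSearch;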
}
// Usage
const results = await searchProducts("waterproof hiking boots");
// Returns products matching the intent, even if they don't
// contain the exact words "waterproof hiking boots"
Step 4: Set Up Your API Key
Make sure your API key has the right scopes for the features you need:
| Feature | Required Scope |
|---|---|
| Semantic search | search:read |
| Generate embeddings | records:write |
To update scopes:
- Go to Settings > API Keys in the admin dashboard
- Edit your API key
- Add the required scopes
- Save
Tips and Best Practices
Choosing Fields to Embed
- Do include: title, description, body content, tags, category names
- Don’t include: IDs, timestamps, internal slugs, boolean flags
- More text generally produces better search results, but overly long content dilutes relevance
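If you generate embeddings in your own service rather than through the dashboard toggle, the same guidance applies to the text you send to your embedding model. The sketch below shows one way to assemble that text. The function name `buildEmbeddingText`, the newline separator, and the `maxChars` cap are all assumptions made for illustration, not part of the Foir API.

```typescript
type FieldValue = string | string[] | number | boolean | null | undefined;

// Joins the chosen descriptive fields into one string for embedding.
// Non-text values (numbers, booleans, null) are skipped even when listed.
function buildEmbeddingText(
  record: Record<string, FieldValue>,
  fields: string[],
  maxChars = 4000, // illustrative cap to avoid overly long input
): string {
  const parts: string[] = [];
  for (const field of fields) {
    const value = record[field];
    let text = "";
    if (Array.isArray(value)) {
      text = value.join(", "); // e.g. tags or category names
    } else if (typeof value === "string") {
      text = value;
    }
    const trimmed = text.trim();
    if (trimmed !== "") parts.push(trimmed);
  }
  return parts.join("\n").slice(0, maxChars);
}
```

For a product, calling this with the fields title, description, and tags yields those values on separate lines, while IDs and flags stay out of the embedded text.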
Setting Similarity Threshold
The threshold parameter filters out low-quality matches:
| Score Range | Quality |
|---|---|
| 0.9 – 1.0 | Near-exact semantic match |
| 0.7 – 0.9 | Strong relevance |
| 0.5 – 0.7 | Moderate relevance |
| Below 0.5 | Weak match — usually noise |
Start with 0.6 and adjust based on your content and use case.
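When tuning the threshold, it can help to label results by band and to try stricter cutoffs on a result set you already have. The sketch below mirrors the table above. The function names are hypothetical. Assigning boundary scores (0.9, 0.7, 0.5) to the higher band and treating the cutoff as inclusive are both assumptions, since the table does not specify either.

```typescript
type MatchQuality = "near-exact" | "strong" | "moderate" | "weak";

// Maps a similarity score to the quality bands in the table above.
// Boundary values go to the higher band (an assumption).
function classifySimilarity(similarity: number): MatchQuality {
  if (similarity >= 0.9) return "near-exact";
  if (similarity >= 0.7) return "strong";
  if (similarity >= 0.5) return "moderate";
  return "weak";
}

interface SearchHit {
  recordId: string;
  similarity: number;
}

// Client-side filter for experimenting with stricter cutoffs
// without issuing a new semanticSearch query.
function applyThreshold<T extends SearchHit>(hits: T[], threshold: number): T[] {
  return hits.filter((hit) => hit.similarity >= threshold);
}
```

One way to use this is to query once with a low threshold, review which bands the useful results fall into, and then set the threshold parameter accordingly.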
Combining with Model Filters
Semantic search works best when combined with model filters. Use modelKeys to narrow the search space, then let vector similarity handle the ranking:
query {
semanticSearch(input: {
query: "eco-friendly packaging"
modelKeys: ["product"]
limit: 5
threshold: 0.65
}) {
recordId
naturalKey
similarity
}
}
Keeping Embeddings Fresh
- Embeddings update automatically when records are saved or published
- If a record’s content hasn’t changed, re-embedding is skipped (deduplication is automatic)
- Use the coverage queries below to confirm every record is embedded before going live with search
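A go-live check can be built on the fields that the embeddingCoverage query in the next section returns. The sketch below assumes you have already fetched one coverage object per model. The helper names are hypothetical, and treating an empty model as fully covered is a design choice rather than documented behavior.

```typescript
// Fields returned by the embeddingCoverage query.
interface EmbeddingCoverage {
  modelKey: string;
  totalRecords: number;
  embeddedRecords: number;
  pendingRecords: number;
}

// Fraction of records embedded, from 0 to 1.
// A model with no records counts as fully covered (an assumption).
function coverageRatio(c: EmbeddingCoverage): number {
  return c.totalRecords === 0 ? 1 : c.embeddedRecords / c.totalRecords;
}

// Returns the keys of models that still have records without embeddings.
function modelsNotReady(coverage: EmbeddingCoverage[]): string[] {
  return coverage
    .filter((c) => c.embeddedRecords < c.totalRecords)
    .map((c) => c.modelKey);
}
```

An empty result from `modelsNotReady` indicates that every listed model is fully embedded.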
Auditing Coverage and Running a Sweep
If you generate embeddings from your own service (for example a background worker), you can audit coverage and repair gaps without re-embedding your whole corpus. All three queries are scoped to your own records.
Get a per-model summary of how many records are embedded versus pending:
query {
embeddingCoverage(modelKey: "note") {
modelKey
totalRecords
embeddedRecords
pendingRecords
lastEmbeddedAt
}
}
List exactly which records are missing an embedding for a key, paginated, so you can embed only the gaps:
query {
recordsMissingEmbedding(modelKey: "note", key: "default", first: 100) {
edges {
node {
recordId
naturalKey
}
}
pageInfo {
hasNextPage
endCursor
}
totalCount
}
}
Before re-embedding, read back the content hash you stored alongside each embedding. If it matches the current content’s hash, skip it:
query {
embeddingDigests(
modelKey: "note"
key: "default"
recordIds: ["rec_abc123", "rec_def456"]
) {
recordId
contentHash
}
}
When you write an embedding, pass the contentHash you computed so a later sweep can read it back:
mutation {
writeEmbeddings(input: {
entries: [{
recordId: "rec_abc123"
key: "default"
embedding: [0.0123, -0.0456, 0.0789]
contentHash: "sha256:9f86d0…"
}]
})
}
Records carry the same signals as fields, handy for quick per-record checks:
query {
note(id: "rec_abc123") {
_id
_hasEmbedding(key: "default")
_embeddingContentHash(key: "default")
}
}
A typical sweep: call recordsMissingEmbedding to find gaps, embeddingDigests to skip records whose contentHash is unchanged, embed the rest, and write them back with writeEmbeddings (including the new contentHash).
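The decision step of that sweep can be sketched as a pure function. This is an illustration under stated assumptions, not Foir's implementation. `planSweep` and its types are hypothetical names, and your service is assumed to compute `currentHash` with the same scheme it used when writing the embedding. A null contentHash is assumed to mean that no hash was stored.

```typescript
// A record your service could embed, with the hash of its current content.
interface SweepCandidate {
  recordId: string;
  currentHash: string; // computed by your service, same scheme as at write time
}

// One row from the embeddingDigests query shown above.
interface EmbeddingDigest {
  recordId: string;
  contentHash: string | null; // null assumed to mean "no hash stored"
}

// Splits candidates into records to embed and records to skip.
// A record is skipped only when its stored hash equals its current hash.
function planSweep(candidates: SweepCandidate[], digests: EmbeddingDigest[]) {
  const stored = new Map<string, string | null>();
  for (const digest of digests) {
    stored.set(digest.recordId, digest.contentHash);
  }

  const toEmbed: SweepCandidate[] = [];
  const skipped: string[] = [];
  for (const candidate of candidates) {
    if (stored.get(candidate.recordId) === candidate.currentHash) {
      skipped.push(candidate.recordId);
    } else {
      // No digest, a null hash, or a changed hash: embed again.
      toEmbed.push(candidate);
    }
  }
  return { toEmbed, skipped };
}
```

Records returned by recordsMissingEmbedding have no digest, so they always land in `toEmbed`. After embedding, each entry's `currentHash` would be passed as contentHash in writeEmbeddings.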
Next Steps
- Search & Embeddings API — Full API reference
- Semantic Search — Understand the concepts
- Model Capabilities — Configure models for search