Training Data

When you publish a shared agent to the marketplace, you can attach a curated knowledge base — training data — that every installer receives. This page covers how to create, manage, and version that training data.

Each agent template has two indexes:

  • Working index — Your mutable workspace where you add, edit, and delete training data entries. Only you (the publisher) can modify this.
  • Version snapshots — Immutable copies of the working index, created automatically each time you publish a new version. Installers read from these snapshots.

This separation means you can freely update your working index without affecting any existing installations. Changes only reach installers when you publish a new version.

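There are three ways to populate your agent’s training data: through the portal, through an MCP-capable AI tool, or through the API. To add entries in the portal: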

  1. Navigate to Marketplace and find your published agent template
  2. Click the agent, then click Training Data
  3. Click Add Object to create entries one at a time, or Import JSON for bulk uploads
  4. Give each entry a unique Object ID (required), plus a Title, Content, and optional Keywords for semantic search

If you use Claude Code or another MCP-capable AI tool, you can pipe training data directly into your agent template’s index using the Sprigr MCP server:

Use the sprigr_search tool to import objects into the agent-template-{your-slug} index.

This is particularly useful for curating large knowledge bases conversationally — ask your AI assistant to research a topic and save the findings directly into the training index.

For programmatic workflows, use the training data API endpoints:

  • POST /api/apps/{slug}/training — Import up to 1000 objects per call
  • GET /api/apps/{slug}/training?q=search+term — Search existing training data
  • DELETE /api/apps/{slug}/training/{objectId} — Remove specific entries
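
Because the import endpoint accepts at most 1000 objects per call, larger knowledge bases have to be sent in batches. The following is a minimal Python sketch of that batching, not an official client. The base URL, the bearer-token header, and the assumption that the request body is a plain JSON array are all hypothetical; check the API reference for the real authentication scheme and payload shape.

```python
import json
import urllib.request
from typing import Iterator

MAX_OBJECTS_PER_CALL = 1000  # limit stated for POST /api/apps/{slug}/training


def batches(objects: list[dict], size: int = MAX_OBJECTS_PER_CALL) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` objects."""
    for start in range(0, len(objects), size):
        yield objects[start:start + size]


def build_import_request(base_url: str, slug: str, token: str,
                         batch: list[dict]) -> urllib.request.Request:
    """Build (but do not send) one import request.

    Assumptions: the body is a JSON array of objects and the API accepts
    a bearer token. Neither is confirmed by this page.
    """
    return urllib.request.Request(
        url=f"{base_url}/api/apps/{slug}/training",
        data=json.dumps(batch).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # hypothetical auth header
        },
        method="POST",
    )


def import_training_data(base_url: str, slug: str, token: str,
                         objects: list[dict]) -> int:
    """Send all objects in batches and return the number of calls made."""
    calls = 0
    for batch in batches(objects):
        request = build_import_request(base_url, slug, token, batch)
        with urllib.request.urlopen(request) as response:
            response.read()  # urlopen raises HTTPError on 4xx/5xx responses
        calls += 1
    return calls
```

With this split, 2500 objects would go out as three calls of 1000, 1000, and 500 objects.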

Each training data object is a JSON document with the following fields:

Field       Required      Description
objectID    Yes           Unique identifier (e.g., ecom-metrics-001)
title       Recommended   Short, descriptive title for the entry
content     Recommended   The main knowledge content: paragraphs, lists, or structured data
_keywords   Recommended   Comma-separated semantic tags and synonyms for better search recall

For example, two entries in a JSON array:

[
  {
    "objectID": "ecom-conversion-rates",
    "title": "Ecommerce Conversion Rate Benchmarks",
    "content": "A good ecommerce conversion rate is typically between 2-5%. Top performers achieve 10%+. Mobile conversion rates are generally 1-2% lower than desktop. Key factors include page load speed, checkout friction, trust signals, and product photography quality.",
    "_keywords": "conversion, rate, CVR, benchmark, ecommerce, mobile, desktop, checkout, funnel"
  },
  {
    "objectID": "ecom-clv-calculation",
    "title": "Customer Lifetime Value (CLV)",
    "content": "CLV measures total revenue expected from a single customer relationship. Basic formula: Average Purchase Value x Purchase Frequency x Customer Lifespan. For subscription businesses, use: Monthly Revenue per Customer x Average Customer Lifespan in Months. Improving CLV by just 10% often has a larger impact than acquiring 10% more customers.",
    "_keywords": "CLV, lifetime value, customer value, retention, revenue, LTV, churn, subscription"
  }
]
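
Before importing, it can help to check a batch against the field rules above. The sketch below is a local pre-flight check written for illustration; `validate_objects` is a hypothetical helper, not part of any Sprigr tooling, and the server may apply different or additional rules.

```python
def validate_objects(objects: list[dict]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    seen = set()
    for position, obj in enumerate(objects):
        object_id = obj.get("objectID")
        # objectID is the only required field and must be unique.
        if not isinstance(object_id, str) or not object_id.strip():
            problems.append(f"entry {position}: missing objectID")
            continue
        if object_id in seen:
            problems.append(f"{object_id}: duplicate objectID")
        seen.add(object_id)
        # The remaining fields are recommended, so these are warnings.
        for field in ("title", "content", "_keywords"):
            if not obj.get(field):
                problems.append(f"{object_id}: missing recommended field '{field}'")
    return problems
```

Running it on the two example entries above should return an empty list, since both have a unique objectID and all three recommended fields.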

When you publish a new version of your agent template, the system automatically:

  1. Exports all objects from your working index
  2. Creates an immutable snapshot index named agent-template-{slug}-v{version}
  3. Records the snapshot in the version metadata

Installers’ agents are pinned to the snapshot from the version they installed. When they update to a new version, their agent switches to the new snapshot.

Each snapshot contains everything in your working index at the time of publishing. The snapshot preserves:

  • All objects with their full content
  • Search index settings (searchable attributes, faceting, semantic search)
  • The exact state of the data, with no partial copies

After the snapshot is created:

  • Your working index remains editable
  • Changes to the working index do not affect the snapshot
  • The snapshot is immutable and cannot be modified
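
The relationship between the working index and its snapshots can be pictured with a toy in-memory model. This is only an illustration of the behavior described above, not how the platform is implemented; the `AgentTemplate` class and the `ecom-helper` slug are made up for the example, and the real version string format may differ.

```python
import copy


class AgentTemplate:
    """Toy model of one working index plus its version snapshots."""

    def __init__(self, slug: str):
        self.slug = slug
        self.working_index: dict[str, dict] = {}          # objectID -> object
        self.snapshots: dict[str, dict[str, dict]] = {}   # index name -> frozen copy

    def publish(self, version: str) -> str:
        """Copy the working index into a new snapshot and return its name."""
        name = f"agent-template-{self.slug}-v{version}"
        if name in self.snapshots:
            raise ValueError(f"{name} already exists; snapshots are immutable")
        # A deep copy means later edits to the working index cannot leak in.
        self.snapshots[name] = copy.deepcopy(self.working_index)
        return name


template = AgentTemplate("ecom-helper")  # hypothetical slug
template.working_index["ecom-clv-calculation"] = {"title": "Customer Lifetime Value (CLV)"}
pinned = template.publish("1")

# Edits made after publishing stay in the working index only.
template.working_index["ecom-conversion-rates"] = {"title": "Conversion Rate Benchmarks"}
print(sorted(template.snapshots[pinned]))
```

An installer pinned to the first snapshot keeps seeing only the entries that existed when it was published, until they update to a later version.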

Use descriptive IDs

Object IDs like ecom-metrics-001 are easier to manage than auto-generated UUIDs. Group related entries with a common prefix.

Keep entries focused

Each entry should cover one concept or topic. Agents search and retrieve individual entries, so focused content gets better results than long documents.

Include keywords liberally

The _keywords field powers semantic search. Include synonyms, abbreviations, related terms, and alternative phrasings that someone might use when searching.
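
If you keep synonyms and abbreviations in separate lists, a small helper can merge them into the comma-separated string that _keywords expects. This is a sketch; `build_keywords` is a hypothetical helper, and whether the search backend treats keyword case as significant is not stated here, so duplicates are dropped case-insensitively as a conservative choice.

```python
def build_keywords(*groups: list[str]) -> str:
    """Merge term lists into one comma-separated string.

    Blank terms and case-insensitive duplicates are dropped;
    the first spelling seen is kept, in first-seen order.
    """
    seen = set()
    merged = []
    for group in groups:
        for term in group:
            cleaned = term.strip()
            key = cleaned.lower()
            if cleaned and key not in seen:
                seen.add(key)
                merged.append(cleaned)
    return ", ".join(merged)


keywords = build_keywords(
    ["CLV", "lifetime value"],      # primary terms
    ["LTV", "clv", " churn "],      # synonyms and related terms
)
```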

Test before publishing

Search your training data from the portal to verify entries are findable. Try the queries your agents would use and check that relevant results appear.
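
The same check can be scripted against the search endpoint. The sketch below builds the search URL and reports which queries fail to return the entry you expect. The base URL is a placeholder, and the response format of the search endpoint is not documented on this page, so the function that fetches results and extracts object IDs is left to the caller as the `search` argument.

```python
import urllib.parse
from typing import Callable


def search_url(base_url: str, slug: str, query: str) -> str:
    """Build the GET /api/apps/{slug}/training?q=... URL for a query."""
    return f"{base_url}/api/apps/{slug}/training?" + urllib.parse.urlencode({"q": query})


def missing_results(expectations: dict[str, str],
                    search: Callable[[str], list[str]]) -> dict[str, str]:
    """Return {query: expected objectID} for each query whose results
    do not include the expected entry.

    `search` takes a query string and returns the objectIDs found.
    """
    return {
        query: object_id
        for query, object_id in expectations.items()
        if object_id not in search(query)
    }


# Stand-in search results for illustration; replace with a real call.
fake_results = {
    "good conversion rate": ["ecom-conversion-rates"],
    "LTV formula": [],
}
gaps = missing_results(
    {
        "good conversion rate": "ecom-conversion-rates",
        "LTV formula": "ecom-clv-calculation",
    },
    search=lambda query: fake_results[query],
)
```

Any query left in `gaps` points to an entry that probably needs more keywords or clearer content before you publish.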

For an overview of the shared agent system, see Shared Agents.

To learn about the marketplace in general, see the Marketplace Overview.