Improve your bot.
Most quality problems have a simple cause: the answer is not in the sources, or the source content is structured in a way that makes retrieval unreliable. This guide walks through the levers you have — before and after launch.
What goes wrong and why.
Bot says “I don't have that information” for questions you expect it to answer
This means retrieval found no chunks above the similarity threshold. Either the information is not in any source, or the source page was not ingested correctly. Check: (1) Is the relevant URL in your sources and in Ready status? (2) Is the answer actually on that page? (3) Is the page publicly accessible — not behind a login or rendered only by JavaScript?
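Checks (2) and (3) can be approximated offline. The sketch below assumes you have already fetched the page's raw HTML without a browser, and tests whether a phrase appears in text a non-JavaScript crawler could read. The class and function names are illustrative, and the product's own extractor may behave differently.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def phrase_in_static_html(html: str, phrase: str) -> bool:
    """Return True if `phrase` appears in the page's static, visible text."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse whitespace and ignore case so line breaks do not hide a match.
    text = " ".join(" ".join(parser.parts).split()).lower()
    return " ".join(phrase.split()).lower() in text
```

To try it on a live page, fetch the HTML with a plain HTTP client (for example `urllib.request.urlopen`) and pass in the decoded body. A False result on a page that shows the phrase in your browser suggests the content is rendered by JavaScript or sits behind a login.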
Bot gives outdated answers
The source was ingested at a specific point in time. If the content changed, delete the source and re-add the URL — this triggers a fresh crawl and re-embedding of the new content.
Bot mixes up answers from different topics
This is a retrieval problem — the wrong chunks are being selected. Common cause: a single very long page mixing multiple topics. Split it into multiple focused pages and add each separately. The chunker works best with pages dedicated to one topic.
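If the long page exists as a Markdown or plain-text draft, splitting it can be partly automated. This is a minimal sketch that assumes topics are introduced by `#` or `##` heading lines. The function name and the "Introduction" default title are illustrative, and each resulting section would still need to be published or added as its own source.

```python
import re


def split_by_heading(text: str) -> dict[str, str]:
    """Split a document into sections keyed by their '#' or '##' heading."""
    sections = {}
    title = "Introduction"  # used for any text before the first heading
    lines = []
    for line in text.splitlines():
        match = re.match(r"^#{1,2}\s+(.*\S)\s*$", line)
        if match:
            # A new heading closes the previous section, if it had content.
            if any(part.strip() for part in lines):
                sections[title] = "\n".join(lines).strip()
            title, lines = match.group(1), []
        else:
            lines.append(line)
    if any(part.strip() for part in lines):
        sections[title] = "\n".join(lines).strip()
    return sections
```

Sections that share a heading overwrite each other in this sketch, so give headings distinct names before splitting.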
Bot answers questions it should not
The default system prompt instructs the model to answer only from the retrieved context. If the bot still answers off-topic questions, your sources probably contain content close enough to the question to pass retrieval. Customize the system prompt (Dashboard → Bot Settings) to state the scope explicitly.
Better sources, better answers.
Add dedicated pages, not homepages
Homepages and navigation-heavy pages are poor sources. The chunker strips navigation and boilerplate, leaving little content. Instead, add your actual content pages: help articles, product detail pages, policy pages, FAQ pages.
A page with 500–3000 words of focused prose makes a better source than a 10,000-word page covering 20 different topics.
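A quick way to apply this guideline before adding a page is to count its words. The sketch below uses the 500–3000 range from the text as defaults. The function name and the verdict labels are illustrative, and a word count says nothing about whether the page stays on one topic.

```python
def size_verdict(text: str, low: int = 500, high: int = 3000) -> str:
    """Classify a page by word count against the suggested range."""
    words = len(text.split())
    if words < low:
        return "thin"      # likely too little content to answer questions
    if words > high:
        return "too long"  # consider splitting into focused pages
    return "ok"
```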
Use text sources for dynamic content
If the content is maintained in a database or rendered by JavaScript (e.g., a React help center), the URL crawler will not see it — it only fetches static server-rendered HTML.
For these cases, copy the text and add it as a text source directly. Update it manually when the content changes.
Use PDFs for structured documents
Product manuals, specification sheets, and terms documents are often better as PDF uploads than URL sources. The PDF parser handles structured layouts more reliably than the HTML extractor on dense documents.
PDFs must be text-based (not scanned). Scanned PDFs produce no extractable text — the source will show an error.
Re-ingest after updates
Sources are not automatically re-crawled when the underlying page changes. Delete a source and re-add it to get fresh content. You can also click Retry on a source in an error state: this re-runs ingestion on the original URL, so it picks up any changes.
Customizing bot behavior.
Go to Dashboard → Bot Settings to edit the system prompt. The default prompt tells the bot to answer only from context and use a helpful tone. You can extend it with:
Scope restriction
Add: “Only answer questions about [topic]. For anything else, say: ‘That is outside my scope — contact us at [email].’” This makes the fallback more specific and reduces off-topic answers.
Tone and persona
Add: “Be concise. Answer in 2–3 sentences unless a longer explanation is essential.” Or: “Use a friendly, informal tone. Avoid jargon.”
Escalation hint
Add: “If the visitor seems frustrated or asks to speak to a human, tell them to email [email] or use [contact URL].”
Fallback message
The fallback message (shown when no context is retrieved) is separate from the system prompt. Set it in Bot Settings → Fallback message. Make it actionable: include a contact method.
How retrieval works (and how to help it).
The retrieval threshold is currently fixed at 0.3 cosine similarity. Chunks below this score are discarded before generation. If the bot frequently falls back on questions it should be able to answer, the issue is source quality — not the threshold.
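To make the threshold concrete, here is a minimal sketch of cosine-similarity filtering. Only the 0.3 cutoff comes from the text. The tiny two-dimensional vectors and the function names are illustrative, and real embeddings have hundreds of dimensions.

```python
import math

THRESHOLD = 0.3  # fixed retrieval threshold stated above


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, threshold=THRESHOLD):
    """Score (text, vector) chunks and keep those at or above the threshold."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    kept = [(score, text) for score, text in scored if score >= threshold]
    return sorted(kept, reverse=True)  # best match first
```

An empty result corresponds to the fallback case: no chunk was similar enough to the question, so there is nothing to generate an answer from.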
The most effective way to improve retrieval is to ensure the language in your sources closely matches the language visitors use in questions. If your docs say “return merchandise authorization” but visitors ask “how do I return something,” add a text source that explicitly maps the two phrasings together.
Test before you launch.
Use the built-in test chat
In the dashboard, go to your bot and click Test chat. Send the 10 questions you expect visitors to ask most. Check the answers and the cited sources. Every good answer should cite a source you recognise.
Watch conversation logs after launch
Go to Dashboard → Conversations. Review the first week of real conversations. Look for patterns: questions that always fall back, topics visitors ask about that are not in your sources, citations pointing to unexpected pages.
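If you copy the conversations into a list, the first pattern (questions that always fall back) is easy to tally. The record shape below, with `question` and `fell_back` keys, is purely hypothetical and not a documented export format, so adapt it to however you collect the data.

```python
from collections import Counter


def top_fallback_questions(conversations, limit=5):
    """Count fallback questions, ignoring case and spacing differences."""
    counts = Counter(
        " ".join(record["question"].lower().split())
        for record in conversations
        if record["fell_back"]  # hypothetical flag: True when the fallback was shown
    )
    return counts.most_common(limit)
```

Questions that appear at the top of this list are the best candidates for a new or rewritten source.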
Iterate on sources, not just prompts
Most quality problems are fixed by improving sources (adding, removing, or rewriting content) rather than by adjusting the system prompt. Start with sources.
Last updated 2026-05-13