By Saurav | Founder of saavos | Building in public toward $10k MRR
[!TLDR] Most buyers evaluate AI chatbot vendors on price and integration count. The right buyers ask 12 specific questions that separate platforms built to last from platforms built to impress on a demo. These questions reveal how the bot behaves when it does not know the answer, who actually owns your data, what the real cost looks like at your conversation volume, and which channels you will never get support for. Below: the 12 questions, what good answers look like, the red flags that should stop a procurement, and exactly how saavos answers — honestly, including the two questions we do not win on.
Buyer regret on AI chatbots follows a pattern. At day one, the demo looked clean and the pricing felt reasonable. At day 30, the bot is confident on questions your docs do not cover and you are not sure how to stop it. At day 60, you receive an invoice for overage charges you did not know were in the contract. At day 90, the 4,000 conversations you paid for are locked in the vendor's dashboard and you cannot get them out.
These are not random failures. They are predictable consequences of questions nobody asked during evaluation.
The 12 questions below are not gotchas. They are the minimal information you need to make a defensible buying decision. Each one has a pattern of good answers and a pattern of red flags. Run them in writing before signing anything.
Why it matters. The underlying model is the primary determinant of answer quality. Vendors who name the model specifically (Claude Sonnet 4.6, GPT-4o mini) are giving you verifiable information. Vendors who say "powered by AI" or "our proprietary engine" are withholding information about a core product decision. The difference in answer quality between a frontier model and a fine-tuned smaller model is measurable, and it matters most on the ambiguous questions your visitors actually ask.
What good looks like. The vendor publishes a clear model-per-tier table: free tier uses Model X, starter tier uses Model Y, professional tier uses Model Z. No vague language. The model names match publicly available models you can look up.
Red flag. "Our AI" with no further specification. Or "a blend of models" without naming them. Or a model that was frontier two years ago and has been superseded.
How saavos answers. Claude Haiku 4.5 on the free tier, Claude Sonnet 4.6 on paid tiers. Both are current Anthropic models. We publish this on the pricing page and it matches the actual API calls the platform makes.
Why it matters. Every chatbot fails on some percentage of queries. The question is not whether it will fail; it is how it handles failure. A bot that confidently makes up an answer is worse than a bot that says nothing. A bot that dead-ends the visitor with "I cannot help with that" leaves them more frustrated than if they had found your contact page themselves. The fallback message is the single best predictor of whether customers are still using the platform in six months.
What good looks like. The fallback is fully configurable by the owner: custom text, a specific email address or booking link, and an honest "I do not have that in our documentation" framing. The bot does not hallucinate an answer when retrieval returns nothing useful.
Red flag. The fallback is hardcoded and cannot be changed. Or the bot will attempt to answer anyway using general knowledge even when no relevant source exists. Or there is no way to distinguish "I found sources but the answer was ambiguous" from "I found nothing."
How saavos answers. Fallback message is fully configurable per bot. You write the message, include any link or email you want, and the bot uses it whenever retrieval confidence is below threshold. We do not let the bot generate answers from general knowledge when no relevant source is retrieved.
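To make the "below threshold" behavior concrete, here is a minimal sketch of a threshold-gated fallback. It is an illustration under stated assumptions, not saavos's actual implementation: the `RetrievedChunk` type, the `choose_response` function, the 0 to 1 similarity score, and the 0.75 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    url: str
    text: str
    score: float  # hypothetical similarity score in [0, 1]; higher is more relevant


def choose_response(chunks, fallback_message, threshold=0.75):
    """Return ("answer", relevant_chunks) or ("fallback", fallback_message).

    The threshold value is an assumption for illustration only.
    """
    # Keep only chunks that clear the confidence threshold.
    relevant = [c for c in chunks if c.score >= threshold]
    if not relevant:
        # Nothing useful retrieved: use the owner's message, never general knowledge.
        return ("fallback", fallback_message)
    return ("answer", relevant)
```

The point of the design is that the model is never asked to generate when `relevant` is empty, so there is no opportunity for it to improvise an answer. A vendor should be able to describe an equivalent gate in their own pipeline.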
Why it matters. Visitors do not stay inside the boundaries of your documentation. Someone reading your pricing page will ask "how does this compare to Intercom?" Someone in your dashboard will ask "what is the best way to write a help article?" If the bot answers those questions from general knowledge — without disclosing it is no longer drawing on your sources — it is presenting its own reasoning as your product's answer.
What good looks like. The bot either (a) stays strictly in-scope and routes out-of-scope questions to your fallback, or (b) clearly signals to the visitor that it is leaving your documented sources and drawing on general model knowledge.
Red flag. The bot answers confidently on out-of-scope topics with no signal to the visitor that it is not sourced from your documentation. This is the category that generates the most public embarrassment stories — bots committing to pricing they do not have, policies that do not exist, or competitor comparisons the owner never approved.
How saavos answers. saavos does not do web search or out-of-scope general answering. The bot draws only from sources you provide. If a visitor asks something outside your docs, it hits the fallback message you configured. This is a deliberate trade-off: it means saavos will not answer "what is the weather in London" (Intercom Fin's Copilot does things like that), but it also means the bot will never make up an answer outside your documentation. For most SMB support and FAQ use cases, staying in-scope is the safer product decision.
Why it matters. Most AI chatbot vendors have short refund windows or none at all, because API costs are incurred at the moment the conversation happens. Knowing the refund policy before you pay tells you how much risk you carry if the platform does not work for your use case, and it reveals something about how the company treats customers who are unhappy.
What good looks like. A clear policy, in writing, published on the pricing page. Even "no refunds on monthly plans after first use, full refund within 7 days of no usage" is better than nothing because it is explicit.
Red flag. "Contact support" as the only answer. No mention of refunds anywhere in the pricing or terms. Or a refund policy buried in a 40-page terms of service document with no summary. These patterns indicate the company has not thought about customer trust at the policy level.
How saavos answers. We offer a full refund within 7 days of first payment if you have not exceeded the free tier quota. This is published on the pricing page. We are a pre-revenue product and we want you to be able to test with low risk.
Why it matters. Per-resolution pricing (Intercom Fin's model: roughly $0.99 per resolved conversation) sounds reasonable until you have a busy month. A flat subscription at $49/month is predictable. A per-resolution model at $0.99 × 500 conversations is $495 plus the platform seat fee. At 1,000 conversations it is nearly $1,000, and a viral Reddit post can send that bill into four figures with no warning. Flat pricing trades coverage flexibility for cost predictability; per-resolution pricing trades cost predictability for scale alignment.
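The arithmetic above is easy to rerun against your own volume. The sketch below uses the $0.99 per resolution and $49 flat figures from this section; the function names are made up for illustration, and `seat_fee` defaults to zero only because the seat fee is not stated here, so plug in the vendor's real number.

```python
import math


def per_resolution_cost(resolved, price_per_resolution=0.99, seat_fee=0.0):
    """Monthly bill under per-resolution pricing (seat_fee is a placeholder)."""
    return round(resolved * price_per_resolution + seat_fee, 2)


def breakeven_conversations(flat_monthly=49.0, price_per_resolution=0.99):
    """Smallest resolved-conversation count whose usage cost reaches the flat fee."""
    return math.ceil(flat_monthly / price_per_resolution)


for volume in (500, 1000):
    print(volume, per_resolution_cost(volume))
print("breakeven:", breakeven_conversations())
```

With these inputs the breakeven lands at 50 resolved conversations a month, before any seat fee. Above that volume the flat plan is cheaper, provided its message quota covers your traffic.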
What good looks like. A published price per tier with a clear message or conversation quota. Overages are explicitly described — does the bot pause at the quota, or does it keep running and bill you? Either answer is acceptable; opaque behavior is not.
Red flag. "Contact us for pricing at scale" on the primary paid tier. Per-resolution pricing without a cap. A free trial that auto-upgrades to a paid tier with no email confirmation. Or message quotas that are not published, which you discover only by reading the fine print.
How saavos answers. Flat monthly pricing. The message quota is published on the pricing page. When you hit the quota, the bot pauses — it does not keep running and bill you for overages. Upgrade in-product; no sales call required.
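The two overage behaviors a vendor can choose between, pausing at the quota or continuing and billing, can be compared with a small sketch. Everything numeric in the usage note (the 1,000-message quota and the $0.05 overage price) is invented for illustration, and `month_end_bill` is a hypothetical helper, not a real API.

```python
def month_end_bill(messages, quota, flat_fee, policy, overage_price=0.0):
    """Return (messages_served, total_bill) for one month.

    policy="pause": the bot stops at the quota and the bill stays flat.
    policy="bill":  the bot keeps running and every extra message is charged.
    """
    if policy == "pause":
        return min(messages, quota), flat_fee
    if policy == "bill":
        overage = max(0, messages - quota)
        return messages, round(flat_fee + overage * overage_price, 2)
    raise ValueError(f"unknown policy: {policy!r}")
```

For a month with 1,500 messages on a hypothetical 1,000-message, $49 plan, `"pause"` serves 1,000 messages for $49, while `"bill"` at an assumed $0.05 per extra message serves all 1,500 for $74. Neither is wrong; what matters is that you know which one you signed up for.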
Why it matters. Conversation data contains questions your customers asked, context about their problems, and sometimes personally identifiable information if visitors voluntarily share it. Where that data lives determines which privacy regulations apply to you, what your legal exposure is in a breach, and what you are committing your customers to when you deploy the bot on their behalf.
What good looks like. A specific answer: "All conversation data is stored in Postgres on AWS us-east-1 with encryption at rest." Most buyers do not need more than a region and a major cloud provider name. EU-based customers additionally need to know whether data crosses EU borders and whether the vendor has a Data Processing Agreement.
Red flag. "In the cloud" with no further detail. No Data Processing Agreement available. Or a DPA that exists but is only available on enterprise plans, meaning smaller customers operate without one.
How saavos answers. Conversation data lives in Supabase Postgres (hosted on AWS). We are a US-based service. A Data Processing Agreement is available on request. We do not have a specific EU data residency option today — EU-based buyers should factor that into their decision.
Why it matters. Your conversation logs contain your customers' real questions, your product specifics, and potentially competitive intelligence. If the vendor uses that data to train shared models, your customers' conversations are improving a product that your competitors also use. Most enterprise buyers would terminate a contract over this clause; most SMB buyers never ask.
What good looks like. An explicit no, backed by the Data Processing Agreement. "We do not use your conversation data to train any shared or public model. Your data is used only to serve your chatbot." Bonus: the vendor can point to the specific API calls (Anthropic API, OpenAI API) and confirm they are using the paid API, since neither provider trains on paid API submissions by default.
Red flag. Vague language like "we may use data to improve our services." Or no mention of training practices in the terms at all. Or a data retention policy that stores conversations indefinitely with no user-controlled deletion.
How saavos answers. We use the Anthropic API and OpenAI API for generation and embeddings respectively. Neither Anthropic nor OpenAI trains on paid API submissions. We do not train any shared model on your conversation data. Conversation history is stored in your dedicated database tables and is not shared with other tenants.
Why it matters. An AI chatbot without citations is asking visitors to trust it on faith. In 2026 that is a significant ask — AI hallucination is a known problem and informed visitors know it. Citations are not just a UX nicety; they are the mechanism by which a visitor can verify a claim in two clicks, and they are the mechanism by which you as the owner can audit which source pages are generating which answers.
What good looks like. Inline citation markers next to factual claims in the answer (not just a "Sources:" footer), a collapsible source list showing the page title and URL, and clickable links so the visitor can verify anything in the answer. Owner-side: the same source attribution visible in the conversation log so you can diagnose retrieval quality.
Red flag. No citations at all. Or a "Sources" section at the bottom of the conversation with no per-claim connection. Or citations that link to the top of the source page rather than the specific chunk — this is better than nothing but means visitors still cannot quickly verify individual claims.
How saavos answers. Inline citation markers with a collapsible source list. Each citation links to the source URL. The conversation log visible to you as the owner shows the source URLs retrieved per answer. This is on by default; it does not require configuration.
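As a rough picture of what "inline markers plus a source list" means as data, the sketch below renders an answer containing `[n]` markers together with its numbered sources, and refuses to render if a marker points at a source that does not exist. The `Source` type, the `render_answer` function, and the `[n]` marker syntax are assumptions for illustration, not a description of how saavos stores citations.

```python
import re
from dataclasses import dataclass


@dataclass
class Source:
    title: str
    url: str


def render_answer(answer_text, sources):
    """Append a numbered source list; fail if any [n] marker has no source."""
    markers = {int(m) for m in re.findall(r"\[(\d+)\]", answer_text)}
    missing = sorted(n for n in markers if not 1 <= n <= len(sources))
    if missing:
        raise ValueError(f"citation markers without a source: {missing}")
    lines = [answer_text, "", "Sources:"]
    lines += [f"[{i}] {s.title} - {s.url}" for i, s in enumerate(sources, start=1)]
    return "\n".join(lines)
```

The validation step is the part worth asking a vendor about: a citation that cannot be traced back to a stored source is worse than no citation.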
Why it matters. Conversation logs are your most valuable data asset from a support and product standpoint. They tell you what your customers are confused about, what your documentation is missing, and how your support patterns shift over time. If you cannot export that data, you are locked into the vendor's analytics tooling forever, and you cannot take historical data with you if you switch.
What good looks like. CSV or JSON export available in the dashboard, covering all conversations in a date range, with visitor messages, bot replies, source citations, and timestamps. No per-export fee. Export is available on all paid tiers, not just enterprise.
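If you want to test a vendor's export against this checklist, it helps to know what a complete row looks like. The sketch below writes a date-filtered CSV with the four fields named above. The field names, the `export_csv` function, and the choice to join multiple source URLs with a space are all assumptions, and the date filter relies on timestamps being ISO-8601 strings in one consistent UTC format so that string comparison matches time order.

```python
import csv
import io

# Hypothetical column names covering the fields a useful export should include.
FIELDS = ["timestamp", "visitor_message", "bot_reply", "source_urls"]


def export_csv(turns, start, end):
    """Return CSV text for turns whose timestamp falls within [start, end]."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for turn in turns:
        if start <= turn["timestamp"] <= end:
            writer.writerow({
                "timestamp": turn["timestamp"],
                "visitor_message": turn["visitor_message"],
                "bot_reply": turn["bot_reply"],
                # Keep source attribution; without it the export cannot
                # be used to audit retrieval quality.
                "source_urls": " ".join(turn["source_urls"]),
            })
    return buf.getvalue()
```

An export missing the `source_urls` column (or its equivalent) fails the checklist even if everything else is present.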
Red flag. Export gated on enterprise plans. No export at all — "use our analytics dashboard" as the only option. Or export that includes bot replies but strips out source attribution, making it useless for retrieval quality analysis.
How saavos answers. Conversation export is on the product roadmap. It is not shipped yet as of May 2026. Conversation data is stored in your Postgres tables and we will provide export tooling before any paid tier reaches full capacity. If this is a dealbreaker for you now, factor it in — it is fair.
Why it matters. Every vendor claims their product is fast to set up. "Minutes" is a marketing word. The real number includes account creation, URL submission, crawl time (which can vary from instant to 24 hours depending on site size and platform queue depth), widget configuration, customization to match your brand, and script installation on your site. The difference between a 5-minute deploy and a 3-day deploy changes your rollout plan entirely.
What good looks like. A specific set of steps with realistic time estimates: crawl completes in under 5 minutes for a 50-page site, widget can be on your site within 10 minutes of signup, no waiting for a sales call or human review before going live.
Red flag. A "book a call" required before you can see the product. Crawl queues that take 24 hours. A setup wizard that ends at "our team will reach out to complete your onboarding." Or a demo that is available instantly but production access requires a contract.
How saavos answers. Sign up, paste your URL, wait for the crawl (it typically completes in under 2 minutes for sites under 100 pages), copy one script tag, and paste it in your site footer. First live chat within 5 minutes of signing up, including on the free tier with no credit card required. The product is self-serve from first visit to embed.
Why it matters. Vendor integration lists are marketing. They emphasize what is supported, not what is missing. The right question is what channels the bot will never reach. If you need Slack, WhatsApp, or live-agent handoff, you need to know that before you sign, not after you have embedded the widget and a customer asks why they cannot message you on Instagram.
What good looks like. An honest, specific list of what the platform does not support. "We are a website widget only — no live chat, no Slack, no WhatsApp, no mobile SDK" is a useful answer. "We support 1,000+ integrations" is not.
Red flag. No explicit mention anywhere of what the platform does not do. An integrations page that lists connectors for everything but is short on detail. Or a sales rep who deflects the question with "what channels are you looking for?" — that is a stall, not an answer.
How saavos answers. We are a website embed widget only. We do not support live chat, Slack, WhatsApp, Facebook Messenger, SMS, mobile SDK, or any channel outside an embeddable web widget. If you need those channels, Intercom, Freshdesk, or Zendesk are better choices. We are optimized for one thing: a knowledgeable, citation-backed chatbot on your website. That is the scope and it is intentional.
Why it matters. Enterprise chatbot platforms are built for enterprise workflows: ticketing systems, SLA tracking, multi-agent inboxes, role-based access control, compliance audit logs. Deploying an enterprise platform for a 2-person team means paying for infrastructure you will never use and fighting configuration complexity that assumes a team of 10 admins and a dedicated IT department. Conversely, a product optimized for solopreneurs may hit real walls the moment you have a second person who needs dashboard access.
What good looks like. The vendor says, plainly, what team size they are built for. "We are optimized for solo founders and 1-5 person teams" is a better answer than "we scale from SMB to enterprise" because it tells you where the product attention actually lives.
Red flag. "We scale to any size" with no specific floor. An onboarding experience that routes you to a human sales rep if you have fewer than 50 seats. Pricing that starts at $99/seat/month with a 3-seat minimum: that is a $3,564/year floor, which is not an SMB product regardless of the marketing copy.
How saavos answers. We are explicitly built for solopreneurs and teams of 1 to 5. The free tier is designed to be genuinely useful, not just a trial. The paid tiers are priced assuming you are pre-revenue or early-revenue and care about cost predictability. We do not have a sales team, a required onboarding call, or a minimum seat count.
Print this before your next vendor evaluation call. For each vendor, write their answer next to each row. A red flag in any row should pause the conversation.
| # | Question | Red flag to watch for | What a good answer looks like |
|---|---|---|---|
| 1 | What model powers the bot, per tier? | "Our AI" with no model name | Named model (Claude, GPT-4o) per tier |
| 2 | What happens when the bot does not know? | Hardcoded "I cannot help" | Fully configurable fallback with CTA |
| 3 | Does it answer outside my docs? | Answers confidently with no source signal | In-scope only, or clear "general knowledge" signal |
| 4 | What is your refund policy? | "Contact support" / no published policy | Clear window, in writing, on pricing page |
| 5 | Flat or per-resolution pricing? | Per-resolution with no cap | Flat monthly, quota published, pauses at limit |
| 6 | Where is customer data stored? | "In the cloud" with no region | Specific region, cloud provider, DPA available |
| 7 | Do you train models on my data? | "May use data to improve services" | Explicit no, backed by DPA |
| 8 | How do citations work? | No citations at all | Inline markers, collapsible source list, clickable URLs |
| 9 | Can I export conversations? | Export gated on enterprise | CSV/JSON export on all paid tiers |
| 10 | Time from signup to first live chat? | Requires sales call or human review | Self-serve, under 10 minutes, no card for free tier |
| 11 | What do you NOT support? | "1,000+ integrations" with no specifics | Explicit list of unsupported channels |
| 12 | Smallest team you are built for? | No floor, "scales to any size" | Named minimum — "1-5 person teams" |
If saavos's answers on Q1, Q2, Q4, Q5, Q7, Q8, Q10, and Q12 match what you need, the free tier is worth 10 minutes of your time. No credit card. No onboarding call. Paste your URL, embed the widget, and see how it answers your visitors' real questions.
Start free at saavos — no card required
The free tier does not expire. If it works for you, the paid tier is the same product with a higher message quota.
Ask what happens when the bot does not know the answer. The fallback handling is the single best predictor of 90-day customer satisfaction. A bot with a fully configurable fallback that routes visitors to a real human or email keeps customers; a bot with a hardcoded "I cannot help" message drives them straight to your inbox more frustrated than if the chatbot did not exist. Every other capability is secondary to getting this right.
Ask directly and ask for it in writing via a Data Processing Agreement. Vague language like "we may use data to improve our services" is a red flag — it means yes. Specific language like "we do not use customer conversation data to train any shared or public model" backed by a DPA is what good looks like. Platforms using the Anthropic API or OpenAI API at the paid tier inherit those providers' no-training commitment by default; self-hosted or fine-tuned model vendors need to be asked explicitly.
Flat pricing charges a fixed monthly fee regardless of how many conversations happen, up to the plan's quota. Per-resolution pricing (Intercom Fin's model) charges per successfully resolved conversation, roughly $0.99 each at Intercom's current rates. For a predictable 200 conversations per month, per-resolution costs $198 plus the platform seat fee. For a viral month at 2,000 conversations, it costs $1,980. Flat pricing caps how many conversations you can serve but also caps your bill; per-resolution pricing aligns vendor incentives with your success but creates unpredictable invoices. For most pre-revenue or early-revenue SMBs, flat is the right choice.
For a self-serve platform on a standard site, under 10 minutes: account creation (1 minute), submit your URL and wait for crawl completion (1–5 minutes for sites under 100 pages), configure fallback message and widget colors (2 minutes), copy script tag and paste into site footer (1 minute). Any step that requires a sales call, human review, or a waiting period of more than a few minutes is an enterprise onboarding pattern, not an SMB product. Free tiers should be accessible without a credit card — if a card is required for the free tier, the "free" claim is misleading.