20 questions to ask before hiring an AI agents services company
The questions whose answers actually predict whether the engagement will work — with the right answer to listen for, organised by sales-call stage.
This is the long form of the question list in our pillar guide on how to choose an AI agents services company. The pillar had 15; this article has 20 with notes on the right answer to listen for and the dodge to watch out for.
Group them by stage. The first eight are for the discovery call. Six more for the technical or operator call. Six more for the procurement and contract stage.
Discovery call (questions 1–8)
1. What workflows have you shipped to production in the last 12 months that look like ours? Listen for specifics: industry, workflow type, scale. Watch out for: "we work across many verticals" — code for "we have not shipped yours."
2. Can you show one of your agents running on a slice of our real data within 2 weeks of an NDA? Listen for: yes, here is the paid pilot structure, here is what we will need from you. Watch out for: a 12-week discovery phase before any working artefact.
3. Who specifically would be the operator on our account, what is their background, and how many other accounts do they hold? Listen for: a name, a calendar, a cap of 4–8 accounts per operator. Watch out for: "the team."
4. What is your published pricing range for a workflow like this? Listen for: a range, here is what shifts it. Watch out for: "let me put together a custom proposal" without indicative numbers.
5. What are three specific failure modes your agents have in this kind of workflow? Listen for: hallucinated entities, edge cases on input X, rate-limit cascades, a specific story from a real engagement. Watch out for: "our agents are very reliable."
6. Where is our data stored and processed? Who are your sub-processors? Listen for: 60 seconds of named regions and sub-processors. Watch out for: "let me check with our team and get back to you."
7. What is the minimum contract length and what does termination look like? Listen for: 3-month initial term, 30-day rolling after, symmetric notice. Watch out for: 12+ month fixed lock without a clear build deliverable.
8. Tell me about a client you fired or who fired you, and why. Listen for: a real story with a real reason. Watch out for: "we have never had that happen." Vendors who have never lost a client either are very young or are not honest.
Technical / operator call (questions 9–14)
9. Walk me through your evaluation methodology with a redacted example. Listen for: test sets, ground truth, pass-rate history, regression handling. Watch out for: "we test things before we ship" with no artefact. The depth of the answer is the single best proxy for the quality of the eventual workflow. See AI agent evaluations explained.
10. Show me your observability portal as you would set it up for our workflow. Listen for: a live activity feed, per-run trace, cost line, replay. Watch out for: "we can pull a report for you."
11. Which actions does the agent take without a human in the loop? Where is the approval gate? Listen for: a specific list of tier 1 vs tier 2 vs tier 3 actions. Watch out for: vagueness, or "fully autonomous." See 12 red flags.
12. How do you handle a model deprecation or significant model change? Listen for: regression evals, written notification cadence, a specific historical story (Claude 3 → 3.5, GPT-4 sunset). Watch out for: "we will figure it out when it happens."
13. What does week 1, 2, and 4 of onboarding look like in writing? Listen for: a documented plan with named deliverables, including the first working version on real data by week 2. Watch out for: "kickoff and discovery, then we build."
14. How do we exit cleanly? What do we receive, in what format, and on what timeline? Listen for: prompts, workflow graph, evals, integration code, logs, in machine-readable format, within 10 business days. Watch out for: "we will work with you on the transition."
Procurement / contract stage (questions 15–20)
15. Will you assign IP in the prompts, workflow, and evals you build for us? Listen for: yes, with the vendor retaining rights to underlying platform and generic tooling. Watch out for: "we license it to you perpetually." That is not assignment.
16. Will you sign a DPA, name sub-processors in writing, and notify us before changing any of them? Listen for: yes, here is the DPA template, 14-day notification on changes. Watch out for: "we will look into it."
17. What KPIs will we agree to in writing, and how do they map to a business outcome? Listen for: one or two business outcomes (revenue, DSO, CSAT, first-response time) plus efficiency outcomes. Watch out for: only activity metrics (runs executed, uptime).
18. What is your SLA — first-response time on exceptions, throughput floors, remedies? Listen for: specific numbers, with credits or termination right tied to breach. Watch out for: "99.9% uptime" with no workflow-level commitment.
19. What is your security posture — SOC 2, pen test, breach notification window? Listen for: SOC 2 Type II at maturity, annual pen test, 72-hour breach notification. Watch out for: "we take security very seriously" with no artefacts. See AI agents security checklist.
20. Three references in our size range and vertical we can speak with, including at least one who has been on the service 12+ months. Listen for: names with email introductions inside a week. Watch out for: references who only started 3–6 months ago, or "we have to ask the clients first" delays beyond a week.
Questions about your own company you should answer first
The 20 questions above are for the vendor. Before any of them are useful, answer these five about your own company — vendors will ask, and your answers shape which vendor is right.
What is the specific workflow and what is the success metric? "Our outbound is not working" is not a brief. "Our outbound produces 12 qualified opportunities per month and we want to triple that with the same headcount, measured as pipeline-qualified opportunities in HubSpot tagged source=outbound" is a brief.
Who is the internal owner and what is their bandwidth? A name and a number of hours per week. If you cannot answer, fix that before any vendor call. The engagement will fail without it.
What is the data we will share, where does it live, and what are the constraints? CRM, help desk, document store, accounting. PII, financial data, customer data, contracts. The vendor's data posture has to match yours; the conversation is shorter when you know your constraints in advance.
What is the budget envelope? Not "what does it cost" — what can you actually spend without re-approval. Vendors will adapt scope; they cannot do so unless you share the envelope.
What is the timeline to first value, and what does "value" mean to the person who signs the contract? A founder's "value" is often different from a CFO's. Align internally before the sales call.
How to use the answers
Score each vendor 0–2 per question: 0 for a dodge, 1 for a partial answer, 2 for a specific answer with evidence. Maximum 40.
Above 32: strong fit, the vendor has shipped this before and will be a fair partner. 24–32: probably workable but interview at least one alternative. Below 24: keep looking. The threshold for walking is rarely a single bad answer; it is the pattern of vagueness across multiple questions.
Pair this with the red-flag scan in 12 red flags when evaluating an AI agents services company and the decision checklist in the pillar guide.
A note on follow-up questions
The 20 questions surface the headline answers. The real evaluation lives in the follow-ups. Three patterns of follow-up are particularly useful.
"Walk me through a specific instance." Any abstract answer ("we monitor closely," "we have strong evals") is easier to give than a specific story. Following up with "tell me about the last time that failed in a client engagement and what you did" separates rehearsed answers from lived ones.
"What would the operator do if X happened next week?" Replace X with a realistic operational scenario: a model deprecation, a 50% volume spike, a regulatory inquiry, a key integration breaking. Vendors who have run real workflows have plausible answers; vendors who are improvising do not.
"What is the answer your previous client was unhappy with?" Every vendor has at least one client who left or escalated. The vendor's account of why — and what they changed — is a strong signal of self-awareness. Vendors who claim no such client exists are either very young or not honest.
Where Logitelia fits
If you want to run these 20 questions on Logitelia, book a 30-minute call. We will answer all 20 in plain language and tell you where our answers are weaker than the ideal — every vendor has weaker answers somewhere, and the honest ones tell you which.
Want to run these 20 questions on Logitelia? We will answer all of them — including where our answers are weaker than the ideal.
Book intro call