AI Knowledge Base for Customer Support: Cutting Resolution Time by 60%
Every customer support team has the same problem: the answers to 60–70% of incoming tickets already exist somewhere in the organisation. In a Confluence page, a SharePoint document, a Zendesk article, a training manual, an old email thread. The support agent just cannot find them quickly enough — so tickets queue, customers wait, and agents burn time on the same questions answered a dozen times before.
An AI knowledge base built on RAG architecture solves exactly this problem. It makes the organisation's collective knowledge instantly searchable and synthesises answers in seconds. The support agent stops searching and starts resolving.
Here are the real numbers, the technical architecture, and what this actually costs to build and run.
What the System Does
The AI knowledge base is not a chatbot and it is not a keyword search upgrade. It is a retrieval-augmented generation system that:
- Ingests your existing support knowledge (Confluence, SharePoint, Zendesk articles, manuals, SOPs)
- Indexes that knowledge using semantic vector embeddings
- When a support ticket arrives, retrieves the most relevant knowledge chunks
- Synthesises a precise, cited answer using a large language model
- Applies confidence scoring to decide whether to auto-answer, suggest to the agent, or escalate
The output is an answer — with source citations — delivered to the agent in the ticket interface within 2–4 seconds of ticket creation.
Technical Architecture
Knowledge Ingestion Sources
The system ingests from wherever your knowledge actually lives:
Confluence: The Confluence REST API (v2) returns pages as structured JSON. The ingestion pipeline extracts body content, page title, space, labels, and last-modified date. Pages marked as archived or draft are excluded. For a typical 500-page Confluence space, full ingestion takes 15–25 minutes on the initial run; incremental sync runs every 4 hours to pick up edits.
SharePoint: The Microsoft Graph API provides programmatic access to SharePoint libraries. The pipeline extracts Word documents (.docx), PDFs, and OneNote pages. File types not suitable for text extraction (images, videos, spreadsheets) are excluded from the knowledge base but logged for manual review.
Zendesk Help Centre: Zendesk's Help Centre API exports published articles with their associated section hierarchy. Articles flagged as "needs review" or unpublished are excluded. The section hierarchy is preserved in metadata. This matters for retrieval, because an article titled "Reset Password" means different things in different product sections.
Ticket history: Resolved Zendesk tickets with positive CSAT scores (rated 4–5 by the customer) are extracted via the Zendesk Search API and ingested as question-answer pairs. This is the highest-value knowledge source because it captures institutional knowledge that has never been formally documented. A typical mid-market support team with 3+ years of history yields 8,000–25,000 high-quality QA pairs.
Custom uploads: PDF manuals, onboarding guides, product specifications, and compliance documents are uploaded manually to the system's document library.
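The exclusion rules described for each source can be sketched as a single filter applied before indexing. This is a minimal illustration, assuming every source has already been normalised into one record shape; the `SourceRecord` class, its field names, and the status strings are hypothetical and do not mirror the actual payloads of the Confluence, Microsoft Graph, or Zendesk APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceRecord:
    """Hypothetical normalised record; all field names are assumptions."""
    source: str                      # "confluence", "sharepoint", "zendesk_article", "zendesk_ticket", "upload"
    title: str
    status: str = "published"        # e.g. "archived", "draft", "needs_review", "unpublished"
    file_type: Optional[str] = None  # SharePoint only, e.g. "docx", "pdf", "xlsx"
    csat: Optional[int] = None       # resolved tickets only, 1-5


EXCLUDED_STATUSES = {"archived", "draft", "needs_review", "unpublished"}
SHAREPOINT_TEXT_TYPES = {"docx", "pdf", "onenote"}
MIN_CSAT = 4  # tickets rated 4-5 are kept as QA pairs


def should_ingest(rec: SourceRecord) -> bool:
    """Apply the per-source exclusion rules from the text."""
    if rec.status in EXCLUDED_STATUSES:
        return False
    if rec.source == "sharepoint":
        return rec.file_type in SHAREPOINT_TEXT_TYPES
    if rec.source == "zendesk_ticket":
        return rec.csat is not None and rec.csat >= MIN_CSAT
    return True


def partition(records):
    """Split into (ingest, skipped) so skipped items can be logged for manual review."""
    ingest, skipped = [], []
    for rec in records:
        (ingest if should_ingest(rec) else skipped).append(rec)
    return ingest, skipped
```

Returning the skipped records rather than silently dropping them keeps the "logged for manual review" behaviour visible in one place.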
Chunking and Indexing
Support knowledge has different structural characteristics from legal contracts. Confluence pages and help articles tend to be shorter (500–2,000 words) and more self-contained, while manuals and SOPs can be 50+ pages.
Chunking strategy for support knowledge:
- Confluence pages under 1,500 words: indexed as single chunks with full content
- Longer pages and documents: semantic section-boundary chunking with 512-token maximum chunk size and 15% overlap
- Zendesk ticket QA pairs: indexed as complete pairs (question + resolution), never split
- Product manuals: fixed 256-token chunks with heading context prepended to each chunk
All chunks are tagged with metadata: source system, document title, section, last-modified date, and product area tags (derived from Confluence labels or Zendesk ticket tags).
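A simplified sketch of the chunking rules listed above follows. It makes two loud assumptions: whitespace-separated words stand in for model tokens, and a plain sliding window replaces the semantic section-boundary detection, which is not shown here. The function names and the `kind` labels are hypothetical, and metadata tagging is omitted for brevity.

```python
def sliding_windows(tokens, size, overlap_ratio=0.0):
    """Return windows of at most `size` tokens; each overlaps the previous by `overlap_ratio`."""
    step = max(1, int(size * (1 - overlap_ratio)))
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
    return windows


def chunk_document(kind, text, heading=""):
    """Pick a chunking strategy by source kind (labels are illustrative)."""
    tokens = text.split()  # assumption: words approximate tokens
    if kind == "ticket_qa":
        return [text]  # QA pairs are never split
    if kind == "confluence" and len(tokens) < 1500:
        return [text]  # short pages are indexed whole
    if kind == "manual":
        # fixed 256-token chunks, heading context prepended to each
        return [f"{heading}\n{' '.join(w)}" for w in sliding_windows(tokens, 256)]
    # longer pages and documents: 512-token maximum with 15% overlap
    return [" ".join(w) for w in sliding_windows(tokens, 512, 0.15)]
```

With a 512-token window and 15% overlap the window advances 435 tokens at a time, so consecutive chunks share 77 tokens.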
Embedding and Vector Store
We use text-embedding-3-large for support knowledge embedding, consistent with our other RAG deployments. The additional retrieval accuracy on support queries (vs text-embedding-3-small) is particularly important because support queries are often short, imprecise, and grammatically informal — "how do I add a user to the portal" — where larger embedding dimensions capture semantic intent more reliably.
Vector store: Azure AI Search with hybrid search (vector + BM25 keyword). Hybrid search is essential for support use cases because customers frequently use product-specific terminology, version numbers, or error codes that benefit from exact keyword matching alongside semantic similarity.
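To illustrate why combining the two rankings helps, here is a small sketch of reciprocal rank fusion (RRF), the merging method that Azure AI Search documents for hybrid queries. This is a standalone illustration rather than the service's own code; the document IDs and the constant `k=60` are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results for a query containing an error code.
vector_hits = ["kb-17", "kb-04", "kb-22"]   # semantic similarity order
keyword_hits = ["kb-22", "kb-17", "kb-31"]  # BM25 order (exact error-code match)

print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# ['kb-17', 'kb-22', 'kb-04', 'kb-31']
```

Documents that appear in both lists rise to the top, which is the behaviour that makes exact matches on version numbers or error codes reinforce semantically similar results.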
Confidence Scoring and Routing Logic
This is the architectural decision that most directly determines business outcomes.
The system generates a confidence score for each retrieved and synthesised answer. The score is derived from:
- Vector similarity of the top retrieved chunks to the query (weighted 40%)
- Number of retrieved chunks supporting the same answer (weighted 30%)
- Recency of source documents — answers from documents modified more than 18 months ago receive a freshness penalty (weighted 20%)
- Answer completeness check — does the synthesised answer actually address the query type identified by a classifier? (weighted 10%)
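The weighting above can be expressed as a simple weighted sum. The sketch below assumes each of the four signals has already been normalised to the range 0 to 1; how each signal is normalised (for example, how large the freshness penalty is) is not specified in this article, so the signal names and the clamping step are illustrative assumptions.

```python
# Weights taken from the list above; signal names are hypothetical.
WEIGHTS = {
    "similarity": 0.40,    # vector similarity of top retrieved chunks
    "agreement": 0.30,     # share of retrieved chunks supporting the same answer
    "freshness": 0.20,     # lowered for sources older than 18 months
    "completeness": 0.10,  # classifier check that the answer addresses the query type
}


def confidence_score(signals):
    """Combine normalised signals (each 0..1) into a single 0..1 confidence score."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, signals[name]))  # clamp defensively
        total += weight * value
    return total
```

For example, a strong match (similarity 0.9, full agreement and completeness) drawn from a stale source with freshness 0.5 scores 0.36 + 0.30 + 0.10 + 0.10 = 0.86, which keeps it out of any auto-answer band set at 0.92.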
Routing thresholds (tuned from production data):
| Confidence score | Action |
| --- | --- |
| ≥ 92% | Auto-answer: response sent to customer with "Answered by our support system" attribution. Ticket flagged for spot-check review. |
| 75–91% | Suggest to agent: answer draft pre-populated in Zendesk reply field, agent reviews and sends (or edits) |
| 50–74% | Partial assist: relevant knowledge articles surfaced to agent, no draft generated |
| < 50% | Escalate: ticket routed to senior agent or specialist queue with retrieved context attached |
The 92% auto-answer threshold is deliberately conservative. We calibrate it during the first four weeks of deployment by reviewing a sample of auto-answered tickets against customer satisfaction scores. In most deployments, the threshold can be raised to 94–95% after calibration without accuracy degradation.
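The routing thresholds can be sketched as a small function. It assumes the confidence score is a fraction between 0 and 1 and treats the bands as contiguous, so a score of 0.915 falls into the "suggest" band; the action labels and parameter names are hypothetical. The auto-answer threshold is a parameter because it is the value adjusted during calibration.

```python
def route(confidence, auto_threshold=0.92, suggest_threshold=0.75, assist_threshold=0.50):
    """Map a 0..1 confidence score to one of the four routing actions."""
    if confidence >= auto_threshold:
        return "auto_answer"        # reply sent to customer, ticket flagged for spot-check
    if confidence >= suggest_threshold:
        return "suggest_to_agent"   # draft pre-populated for agent review
    if confidence >= assist_threshold:
        return "partial_assist"     # relevant articles surfaced, no draft
    return "escalate"               # senior or specialist queue, context attached


for score in (0.95, 0.80, 0.60, 0.30):
    print(score, route(score))
```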
Zendesk Integration
The system integrates with Zendesk via the Zendesk Apps Framework (ZAF) and Zendesk Triggers API:
- New ticket trigger: fires immediately on ticket creation, sends ticket content to the RAG pipeline
- ZAF sidebar app: displays the retrieved answer, confidence score, and source citations directly in the Zendesk agent interface
- Auto-answer path: uses the Zendesk Ticket Update API to post a public reply and close the ticket (or move to "Pending" depending on configuration)
- Feedback loop: agent "thumbs up/down" on suggested answers feeds back into the training pipeline
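As an illustration of the auto-answer path, the sketch below builds the JSON body for a Zendesk ticket update (sent as a `PUT` to `/api/v2/tickets/{ticket_id}`), which posts a public comment and changes the ticket status. It only constructs the payload and makes no network call. The function name, the citation layout, and the choice between `pending` and `solved` as the final status are assumptions for the example.

```python
import json


def build_auto_answer_update(answer, sources, final_status="pending"):
    """Build a ticket-update payload that posts a public reply with citations."""
    if final_status not in {"pending", "solved"}:
        raise ValueError("final_status must be 'pending' or 'solved'")
    citations = "\n".join(f"- {source}" for source in sources)
    body = f"{answer}\n\nSources:\n{citations}\n\nAnswered by our support system"
    return {
        "ticket": {
            "status": final_status,
            "comment": {"body": body, "public": True},
        }
    }


payload = build_auto_answer_update(
    "Go to Settings > Users and choose Add user.",
    ["Help Centre: Adding a user to the portal"],
)
print(json.dumps(payload, indent=2))
```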
Intercom and ServiceNow Integration
For Intercom: the system integrates via Intercom's Conversation API. Incoming messages trigger retrieval; confident answers are posted as bot replies before human handoff.
For ServiceNow: integration uses the ServiceNow REST API (Table API). The RAG system is exposed as a scripted REST API endpoint that ServiceNow calls on incident creation. Results are written back to the work notes field.
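The write-back to work notes can be sketched with the ServiceNow Table API, which accepts a `PATCH` on `/api/now/table/incident/{sys_id}`. The example below only builds the request object and does not send it. It assumes the RAG service performs the write-back itself and authenticates with a bearer token; the instance name, `sys_id`, and token shown are placeholders, and the function name is hypothetical.

```python
import json
from urllib.request import Request


def build_work_note_request(instance, sys_id, note, token):
    """Build (but do not send) a Table API request that appends a work note to an incident."""
    url = f"https://{instance}.service-now.com/api/now/table/incident/{sys_id}"
    payload = json.dumps({"work_notes": note}).encode("utf-8")
    return Request(
        url,
        data=payload,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: OAuth bearer token
        },
    )


req = build_work_note_request("example", "abc123", "Suggested answer (confidence 0.81): ...", "TOKEN")
print(req.get_method(), req.full_url)
```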
Real Performance Metrics
These figures are from a Techseria deployment for a 60-person UK SaaS company supporting approximately 800 customers, ticket volume ~1,200/month:
Before AI knowledge base:
- Average first-response time: 4.2 hours
- First-contact resolution rate: 34%
- Average handle time per ticket: 22 minutes
- Tickets escalated to senior support: 31%
- Support team headcount: 6 agents
After AI knowledge base (measured at 90 days post-deployment):
- Average first-response time: 23 minutes (91% reduction)
- First-contact resolution rate: 61% (79% relative improvement)
- Average handle time per ticket: 9 minutes (59% reduction)
- Tickets escalated to senior support: 14% (55% reduction)
- Support team headcount: 6 agents (unchanged — see below)
Auto-answer rate at 90 days: 34% of tickets fully resolved without agent involvement, with 96.1% customer satisfaction on auto-answered tickets (vs 91.4% on agent-handled tickets at baseline).
What Happens to Support Headcount
The common fear is that AI-powered support means redundancies. In every deployment Techseria has delivered, this has not been the outcome — and the reason is straightforward.
The AI knowledge base reduces handle time per ticket from 22 minutes to 9 minutes. For a team handling 1,200 tickets per month, this releases approximately 260 agent-hours per month. Those hours are not surplus — they are redeployed:
- Proactive customer success outreach (reaching out to customers at risk of churn, based on ticket patterns)
- Onboarding support for new customers (calls and walkthroughs that were previously neglected due to queue pressure)
- Knowledge base curation (reviewing and improving articles, ensuring the AI knowledge base stays accurate)
- Complex case handling (the 14% of tickets that require specialist knowledge now get faster, better attention)
The outcome is better customer outcomes with the same headcount — not the same customer outcomes with less headcount. If the business is growing and ticket volume is increasing, the AI knowledge base means you can scale without proportionally scaling the support team.
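The 260 agent-hours figure quoted above follows directly from the handle-time numbers. A short worked calculation, using only the figures from this deployment (the function name is illustrative):

```python
def hours_released(tickets_per_month, handle_before_min, handle_after_min):
    """Agent-hours freed per month by a lower average handle time."""
    saved_minutes = tickets_per_month * (handle_before_min - handle_after_min)
    return saved_minutes / 60


print(hours_released(1200, 22, 9))  # 260.0
```

The same function can be reused with your own ticket volume and handle times to estimate the capacity released before committing to a build.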
Build Cost and Timeline
Build cost range: £18,000–£28,000
The range reflects integration complexity:
- £18,000: Single source (Zendesk only) + Zendesk integration
- £22,000: Multi-source (Confluence + Zendesk + SharePoint) + Zendesk integration
- £28,000: Multi-source + multi-platform integration (Zendesk + Intercom or ServiceNow) + custom confidence calibration + feedback loop pipeline
What is included:
- Knowledge ingestion pipeline (all configured sources)
- Chunking, embedding, and Azure AI Search indexing
- RAG retrieval and synthesis layer
- Confidence scoring and routing logic
- Integration with ticketing platform(s)
- Agent interface (ZAF app or equivalent)
- Auto-answer pipeline with approval workflow
- 4-week calibration period post-deployment
- Documentation and handover
Ongoing maintenance cost: minimal
Once deployed, the system requires:
- Scheduled knowledge re-sync (automated, runs every 4 hours): zero manual effort
- Monthly confidence threshold review (30 minutes of analyst time)
- Quarterly knowledge audit to identify outdated articles being retrieved (2 hours)
- Azure infrastructure: £150–£250/month
The system itself does not degrade without active maintenance, although the content it indexes can go stale, and it does not improve without maintenance either. The feedback loop (agent ratings) provides a continuous signal for tuning retrieval and prompts if the client invests in quarterly optimisation cycles.
Is Your Support Operation Ready?
The AI knowledge base delivers strongest results when:
- You have at least 200 documented knowledge articles, resolved tickets, or SOPs to seed the system
- Your ticket volume is 400+ per month (below this, the payback period extends)
- Your support team uses a structured ticketing platform (Zendesk, Intercom, ServiceNow, Freshdesk)
- A meaningful proportion of your tickets are repetitive (rule of thumb: if your agents frequently copy-paste the same reply, the AI will handle it)
It delivers weaker results when:
- Your knowledge is primarily tacit (experienced agents with no documentation)
- Tickets are predominantly unique, complex, or require access to live account data not connected to the system
- Your documentation is significantly outdated (more than 30% of articles have not been reviewed in 2+ years)
Before building, Techseria conducts a knowledge audit to assess your starting position — this is a free 2-hour engagement that tells you whether the investment will pay back in under 12 months.
Talk to us at techseria.com. We will assess your support knowledge base and give you a realistic projection of first-contact resolution improvement before you commit.
Questions Support Leaders Ask Before Building
What if our knowledge base is out of date — will the AI give wrong answers? Yes, and this is the most important maintenance consideration. An AI knowledge base is only as accurate as the knowledge it indexes. The system includes two mechanisms to manage this. First, freshness scoring penalises answers sourced from documents not updated in 18+ months, lowering the confidence score and reducing the chance of auto-answering from stale content. Second, the quarterly knowledge audit identifies the 20–30 articles most frequently retrieved in low-confidence answers; these are the articles most in need of review. Keeping those articles current is more valuable than maintaining hundreds of rarely accessed articles.
Can the system learn from our agents' corrections over time? Yes, via the feedback loop. When an agent edits a suggested answer before sending, that edit is logged. When an agent rates a suggestion with a thumbs-down, that signal is captured. These signals feed a monthly fine-tuning review where the chunking, retrieval parameters, and prompt templates are adjusted based on accumulated feedback. Businesses that invest in this monthly review cycle typically see another 8–15% improvement in first-contact resolution rates between months 3 and 12.
What about tickets that require access to live account data — like subscription status or order history? The knowledge base RAG system answers policy and process questions — it does not have access to transactional account data by default. For tickets that require "what is customer X's subscription status," a separate integration is needed: connecting the RAG pipeline to your CRM or billing system API so account context can be retrieved alongside knowledge base answers. This is a separate integration layer we build as an extension, typically adding £5,000–£8,000 to the base build cost. With this extension, the AI can answer "your subscription renews on 14 March and includes access to the Enterprise tier features" with the same speed as a policy answer.
How do we handle tickets in multiple languages? GPT-4o handles multilingual input natively. A support ticket submitted in French, German, Spanish, or Arabic is processed, matched to relevant knowledge, and synthesised in the same language as the incoming ticket. For organisations with significant multilingual support volume, we recommend separate language-specific knowledge bases rather than a single multilingual index — retrieval precision is higher when the indexed content language matches the query language. Separate language indexes add approximately £3,000–£5,000 to the build cost per additional language.