AI Service Architecture for Self-Help System

Overview

The Help Desk’s AI-powered self-help system is designed to provide instant, intelligent IT support while minimizing costs through a sophisticated multi-tier response strategy. This document explains the technical architecture, implementation details, and design decisions behind the AI service.

Architecture Components

1. Core AI Service (`aiService.ts`)

The main AI orchestrator that manages the entire self-help workflow:

class AIService {
  private geminiModel: GenerativeModel;
  private contextCache: Map<string, CachedContext>;
  private knowledgeBase: Map<string, KnowledgeArticle>;
}

Key Responsibilities:

Query categorization and routing
Context management for conversations
Integration with Google Gemini AI
Dynamic content personalization
Token usage optimization

2. FAQ Service (`faqService.ts`)

A high-performance FAQ matching system that serves as the first line of response:

class FAQService {
  private fuse: Fuse<FAQ>;  // Fuzzy search engine
  private exactMatchCache: Map<string, FAQ>;
  private faqs: FAQ[];  // Loaded from Firestore
}

Key Features:

Firestore Integration: Dynamic FAQ management with real-time updates
Multi-tier Matching:
1. Exact match (100% confidence)
2. Keyword match (80-95% confidence)
3. Fuzzy match (60-80% confidence)
Usage Analytics: Tracks FAQ effectiveness
Category Organization: IT-specific categorization

3. Cache Service (`cacheService.ts`)

Intelligent caching system for AI context optimization:

class CacheService {
  private cache: Map<string, CachedEntry>;
  private readonly maxSize = 100;
  private readonly ttl = 3600000; // 1 hour
}

Benefits:

Reduces redundant AI API calls
Maintains conversation context
Implements LRU eviction strategy
Saves ~70% on repeat queries

Query Processing Flow

Step 1: Initial Query Analysis

graph TD
    A[User Query] --> B{FAQ Match?}
    B -->|Yes| C[Return FAQ Answer]
    B -->|No| D[Categorize Query]
    D --> E[Load Context]
    E --> F[Generate AI Response]

Normalize Query: Clean and standardize user input
FAQ Search: Check against Firestore-backed FAQ database
Confidence Check: If match confidence > 70%, return FAQ
Fallback to AI: Process with Gemini for complex queries

Step 2: AI Processing Pipeline

async processWithAI(query: string, config: Config): Promise<AIResponse> {
  // 1. Categorize the query
  const category = this.categorizeQuery(query);

  // 2. Load relevant knowledge base
  const context = await this.loadContextForCategory(category);

  // 3. Check cache for similar queries
  const cached = this.cacheService.get(query);
  if (cached) return cached;

  // 4. Generate AI response
  const response = await this.generateResponse(query, context);

  // 5. Cache and return
  this.cacheService.set(query, response);
  return response;
}

Step 3: Response Enhancement

All responses undergo post-processing:

Phone Number Replacement: {SUPPORT_PHONE} → actual support number
Markdown Formatting: Convert to user-friendly format
Safety Checks: Filter potentially harmful content
Context Preservation: Store for follow-up questions

Cost Optimization Strategies

1. FAQ-First Approach

Impact: 70-80% queries handled without AI
Cost: $0 (no API calls)
Speed: < 10ms response time

2. Gemini 1.5 Flash Model

Cost: $0.075 per 1M input tokens
Speed: Fast inference times
Quality: Excellent for IT support tasks

3. Smart Caching

Cache Hit Rate: ~30% for AI queries
Savings: Reduces API calls significantly
TTL: 1-hour expiration for freshness

4. Context Pruning

private pruneContext(context: string, maxTokens: number = 4000): string {
  // Remove redundant information
  // Prioritize most relevant content
  // Maintain coherence
}

Knowledge Base Structure

Categories and Context

const KNOWLEDGE_BASE = {
  'password': {
    context: 'Password policies, reset procedures...',
    commonIssues: ['locked accounts', 'expired passwords'],
    solutions: [...]
  },
  'email': {
    context: 'Email setup, troubleshooting...',
    commonIssues: ['sync issues', 'quota exceeded'],
    solutions: [...]
  }
  // ... more categories
}

Dynamic Knowledge Loading

The system intelligently loads only relevant knowledge based on query categorization:

Primary Category: Full context (1000-2000 tokens)
Related Categories: Summary only (200-500 tokens)
General Policies: Always included (500 tokens)

Integration with Ticket System

Seamless Escalation

When users need human support:

interface TicketContext {
  originalQuestion: string;
  aiSuggestions: string[];
  conversationHistory: Message[];
  category: string;
  attemptedSolutions: string[];
}

This context is automatically included in new tickets, giving support staff complete background.

Real-time FAQ Management

Firestore Integration

// FAQs are now stored in Firestore
/faqs/{faqId}
  - category: string
  - questions: string[]
  - answer: string
  - keywords: string[]
  - priority: number
  - usage_count: number

Admin Features

CRUD Operations: Full FAQ management UI
Usage Analytics: Track effectiveness
A/B Testing: Compare different answers
Bulk Import: Migrate legacy FAQs

Security and Privacy

Data Handling

No PII Storage: Conversations are ephemeral
Sanitization: User inputs are cleaned
Access Control: Firestore rules enforce permissions
Audit Trail: Admin actions are logged

Safety Measures

private sanitizeInput(input: string): string {
  // Remove potential XSS
  // Filter profanity
  // Validate length
  // Check for patterns
}

Performance Metrics

Response Times

FAQ Match: < 10ms
Cached AI: < 50ms
Fresh AI Query: 1-3 seconds

Accuracy Metrics

FAQ Hit Rate: 70-80%
AI Satisfaction: 85%+
Escalation Rate: < 15%

Configuration

Environment Variables

VITE_GEMINI_API_KEY=your_api_key
VITE_ENABLE_AI_SELF_HELP=true
VITE_ENABLE_TOKEN_CACHE=true
VITE_MAX_CONTEXT_SIZE=4000

Dynamic Settings

Support phone number
Company-specific knowledge
Response templates
Escalation thresholds

Monitoring and Analytics

Key Metrics Tracked

Query Volume: Requests per hour/day
Response Types: FAQ vs AI vs Escalation
Token Usage: Cost monitoring
User Satisfaction: Implicit feedback
Category Distribution: Common problem areas

Dashboard Integration

interface AIAnalytics {
  totalQueries: number;
  faqHitRate: number;
  aiUsageRate: number;
  escalationRate: number;
  tokenCost: number;
  categoryCounts: Record<string, number>;
}

Future Enhancements

Planned Features

Multi-language Support: Serve global teams
Voice Integration: Speech-to-text queries
Proactive Suggestions: Predict issues
Learning System: Improve from interactions
Integration APIs: Connect with other tools

Machine Learning Pipeline

Analyze successful resolutions
Identify new FAQ candidates
Improve categorization accuracy
Personalize responses per user

Best Practices

For Administrators

Regular FAQ Updates: Keep content current
Monitor Analytics: Identify gaps
Test Responses: Ensure accuracy
Gather Feedback: Improve continuously

For Developers

Token Efficiency: Minimize context size
Error Handling: Graceful fallbacks
Performance: Cache aggressively
Security: Validate all inputs

Troubleshooting

Common Issues

High Token Usage
- Check context size
- Review caching effectiveness
- Analyze query patterns
Slow Responses
- Verify Gemini API status
- Check network latency
- Review context loading
Poor Match Quality
- Update FAQ keywords
- Adjust confidence thresholds
- Enhance categorization

Conclusion

The AI-powered self-help system represents a sophisticated approach to IT support automation. By combining intelligent FAQ matching, advanced AI capabilities, and seamless escalation paths, it provides users with instant, accurate help while maintaining cost efficiency and system performance.

The architecture is designed to scale with organizational needs while maintaining flexibility for future enhancements and integrations.