Building NESO's AI-First Future: How the AI Workbench is Transforming the Way Teams Work

PISR: Problem, Impact, Solution, Result

Problem: NESO, Britain's National Energy System Operator managing the electricity grid for 60+ million people, wanted to become an AI-first organisation but had no AI platform capabilities. Teams navigated thousands of policy documents, regional energy pathways, and EV demand forecasts daily. Traditional search couldn't handle nuanced queries like "What are the EV charging infrastructure requirements for the North West region?"
Business Impact: Analysts spent hours manually searching PDFs and spreadsheets to answer stakeholder questions. Data was locked away, accessible only to technical specialists. New starters had no intelligent onboarding support. The energy transition depends on rapid, evidence-based decision-making - slow information access directly impacts grid planning and policy development.
Our Solution: ClearRoute designed and built an agentic AI platform - a multi-agent system where specialised AI agents collaborate to answer complex queries. Navi handles policy questions, news, and content generation. TReSP specialises in structured data analysis and visualisation. The platform generates podcasts from documents, creates images and videos, and renders interactive heat maps on geospatial tiles.
Tangible Result: We delivered 7 backend services and 2 AI agents in 6 months, integrating 20+ Azure services across 2 regions. The platform now serves users in alpha with instant policy answers, AI-generated podcasts, and self-service data visualisation - establishing NESO's foundation for AI-first operations.

The Challenge

The AI-First Ambition

NESO recognised that AI would fundamentally change how organisations operate. As Britain's energy system operator - responsible for keeping the lights on - they needed to be at the forefront of this transformation. The ambition: become an AI-first organisation where employees naturally turn to AI assistants to work more effectively.

But they were starting from zero. No AI platform. No ML capabilities. No unified way to access organisational knowledge.

How Work Happened Before

Challenge	Business Cost
Manual document search	Hours per query, inconsistent answers
Data locked in spreadsheets	Analysts as bottleneck for insights
No self-service capability	Stakeholders waiting for reports
Knowledge silos	Critical information hard to surface
Accessibility gap	Complex data inaccessible to non-technical users

Traditional search tools couldn't handle the nuanced queries NESO teams needed: "What are the EV charging infrastructure requirements for the North West region under the leading pathway scenario?" or "Summarise recent Ofgem announcements about grid connection reform."

Solution Overview

Meet Navi: Your Intelligent Work Companion

Navi (the Navigator Agent) is the centrepiece of the AI Workbench - an intelligent assistant that fundamentally changes how NESO employees work.

It Knows Who You Are

When you log into the Workbench, Navi already knows your role from Microsoft Graph. A new starter sees different prompts than a policy analyst. Your experience is personalised from day one.

It Answers Your Questions Instantly

Ask Navi "What are the policies if I want to take my laptop to France?" and it searches across all indexed policy documents using hybrid RAG (vector embeddings + BM25 keyword matching + semantic reranking), finds the relevant information, and gives you a clear answer with source citations.

It Keeps You Informed

Every morning, Navi surfaces the latest energy sector news - gathered from Ofgem, GOV.UK, and industry sources via Bing Custom Search. You start your day knowing what's happening in your industry.

It Makes Content Accessible

Got a 50-page technical document you need to understand but don't have time to read? Upload it to Navi and it generates an engaging podcast - not just narration, but a proper dual-voice conversation that makes dense content digestible. Architecture diagrams? Navi uses Document Intelligence (OCR) and GPT-4o to understand and explain them.

It Creates Visual Content

Need an image for a presentation? Describe it and Navi generates it via gpt-image-1. Need a video summary? Sora-2 creates 12-second clips from text prompts. Per-user quotas (10 videos/day, 500/month) ensure fair usage.
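The quota check can be sketched as a pair of counters per user. This is a minimal illustration, assuming in-memory state and calendar-day / calendar-month windows; the class name `VideoQuota` and the method `try_consume` are hypothetical, and only the two limits come from the text above.

```python
from collections import defaultdict
from datetime import date

DAILY_LIMIT = 10     # videos per user per day (from the text)
MONTHLY_LIMIT = 500  # videos per user per month (from the text)


class VideoQuota:
    """Hypothetical in-memory counter; a real service would persist the counts."""

    def __init__(self) -> None:
        self._daily = defaultdict(int)    # (user_id, date) -> requests so far
        self._monthly = defaultdict(int)  # (user_id, year, month) -> requests so far

    def try_consume(self, user_id: str, today: date) -> bool:
        """Record one video request, or return False if a limit is exhausted."""
        day_key = (user_id, today)
        month_key = (user_id, today.year, today.month)
        if self._daily[day_key] >= DAILY_LIMIT:
            return False
        if self._monthly[month_key] >= MONTHLY_LIMIT:
            return False
        self._daily[day_key] += 1
        self._monthly[month_key] += 1
        return True


quota = VideoQuota()
first_day = date(2025, 1, 1)
accepted = [quota.try_consume("alice", first_day) for _ in range(11)]
print(accepted.count(True))  # 10: the eleventh request on the same day is refused
```

Checking both limits before incrementing either counter keeps the two counts in step when a request is refused.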

The TReSP Agent: Self-Service Data Analysis

For specialist grid data needs, the TReSP agent provides self-service analytics:

Natural Language to SQL: Ask "What are the EV demand statistics for Birmingham?" and TReSP generates SQL for DuckDB, runs it against 37MB+ of EV demand projections, and explains the results
Heat Map Generation: Regional data visualised on interactive maps using MapTiler TileServer GL with 1.5GB of OpenStreetMap vector tiles for Great Britain
Geospatial Queries: GSP region lookup by coordinates for precise grid planning
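One way to picture the natural-language-to-SQL step is as "generate, validate, execute". The sketch below is an illustration under stated assumptions, not the TReSP implementation: it uses the standard-library `sqlite3` module as a stand-in for DuckDB so it runs anywhere, `fake_generate_sql` replaces the LLM call, and the `ev_demand` table, its columns, and its rows are made-up placeholders rather than NESO data.

```python
import sqlite3


def is_read_only_select(sql: str) -> bool:
    """Accept exactly one SELECT statement and reject everything else.

    Deliberately conservative: any inner semicolon is refused, even inside
    a string literal, and WITH queries are refused too.
    """
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False
    return statement.lower().startswith("select")


def run_generated_sql(conn: sqlite3.Connection, sql: str) -> list:
    """Execute model-generated SQL only after it passes the read-only check."""
    if not is_read_only_select(sql):
        raise ValueError("only single SELECT statements are allowed")
    return conn.execute(sql).fetchall()


def fake_generate_sql(question: str) -> str:
    """Stand-in for the LLM call that turns a question into SQL (hypothetical)."""
    return (
        "SELECT year, chargers FROM ev_demand "
        "WHERE city = 'Birmingham' ORDER BY year"
    )


# Placeholder table and rows, invented purely so the example executes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ev_demand (city TEXT, year INTEGER, chargers INTEGER)")
conn.executemany(
    "INSERT INTO ev_demand VALUES (?, ?, ?)",
    [("Birmingham", 2030, 100), ("Birmingham", 2035, 250), ("Leeds", 2030, 80)],
)

rows = run_generated_sql(conn, fake_generate_sql("EV demand statistics for Birmingham?"))
print(rows)  # [(2030, 100), (2035, 250)]
```

Validating before execution matters because the SQL text comes from a model; a read-only database connection would be a sensible second layer of defence.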

Multi-Agent Architecture

User Query
    ↓
┌─────────────────┐
│ Intent Classifier│  ← Determines query type & routes
└────────┬────────┘
         ↓
┌────────┴────────┐
↓                 ↓
┌──────────────┐  ┌──────────────┐
│  NAVI AGENT  │  │  TReSP AGENT │
│              │  │              │
│ • Policy RAG │  │ • EV Demand  │
│ • Bing News  │  │ • Grid Data  │
│ • Images     │  │ • SQL Gen    │
│ • Video      │  │ • Heatmaps   │
│ • Podcasts   │  │ • Maps       │
└──────────────┘  └──────────────┘

LangGraph orchestrates query classification and routing. Agents hand off to one another based on query intent, so users don't need to know which agent handles their request.
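The classify-then-route flow in the diagram can be sketched in plain Python. This is not LangGraph code and not the production classifier: the keyword list, `classify_intent`, and the two agent functions are hypothetical stand-ins, and in the real platform an LLM-based classifier inside a LangGraph graph would presumably play the role of `classify_intent`.

```python
from typing import Callable, Dict

# Hypothetical keywords; a real classifier would not rely on substring matching.
TRESP_KEYWORDS = ("ev demand", "heat map", "heatmap", "gsp", "grid data")


def classify_intent(query: str) -> str:
    """Return the name of the agent that should handle the query."""
    text = query.lower()
    if any(keyword in text for keyword in TRESP_KEYWORDS):
        return "tresp"
    return "navi"  # default: policy, news, and content requests


def navi_agent(query: str) -> str:
    """Placeholder for the Navi agent."""
    return f"[navi] {query}"


def tresp_agent(query: str) -> str:
    """Placeholder for the TReSP agent."""
    return f"[tresp] {query}"


AGENTS: Dict[str, Callable[[str], str]] = {"navi": navi_agent, "tresp": tresp_agent}


def route(query: str) -> str:
    """Classify the query, then dispatch it to the matching agent."""
    return AGENTS[classify_intent(query)](query)


print(route("What are the EV demand statistics for Birmingham?"))
print(route("What are the policies if I want to take my laptop to France?"))
```

Keeping the agent registry as a dictionary means a new agent needs one new entry and one new classification label, which mirrors the "add new agents following established patterns" goal.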

Platform Services

Service	Purpose
navi-agent	Policy Q&A via RAG, Bing news, image generation (gpt-image-1), video generation (Sora-2), real-time notifications
tresp-api	EV demand queries, regional energy pathways, SQL generation, heatmap visualisation
podcast-api	PDF upload, script generation, dual-voice audio synthesis
nesoai-blob-gateway	Secure media proxy for authenticated streaming
nesoai-tileserver	OpenStreetMap vector tiles for Great Britain (1.5GB MBTiles)
nesoai-analytics	Platform telemetry, user events, metrics
nesoai-infrastructure	Terraform IaC across dev/qa/prod

Engagement Approach

Phase 1 (Months 1-2): Foundation
Established Azure infrastructure, implemented vector database indexing, delivered initial Navi capability for document chat. Critically, we developed against ClearRoute's Azure environment first - enabling rapid iteration before tackling NESO's private endpoint complexity.

Phase 2 (Months 3-4): Expansion
Added TReSP agent for grid data, Bing search integration for news feeds, and podcast generation. Each capability expanded what Navi could help employees with.

Phase 3 (Months 5-6): Scale
Expanded team to 10+ engineers, onboarded dedicated QA, implemented PromptFoo testing framework, and prepared for broader rollout.

First Value: October 6th (Week 14)
Stakeholders could interact with policy documents via natural language and see grid data visualised as heat maps.

Technical Implementation

RAG Pipeline

Documents chunked and embedded (text-embedding-ada-002, 1536 dimensions)
Indexed in Azure AI Search with vector + semantic capabilities
Query-time: hybrid retrieval (vector + BM25 + semantic reranking) with configurable thresholds
Results passed to GPT-4o with domain-specific system prompts
Source citations returned with every answer
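To make the hybrid retrieval step concrete: Azure AI Search documents Reciprocal Rank Fusion (RRF) as the way it merges the vector and BM25 result lists before semantic reranking. The sketch below shows RRF on its own, in plain Python; the document IDs are placeholders, and `k = 60` is the constant commonly used with RRF, assumed here rather than taken from the platform's configuration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with ranks starting at 1; higher total scores rank first.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc_a", "doc_b", "doc_c"]  # placeholder vector-search ranking
bm25_hits = ["doc_b", "doc_d", "doc_a"]    # placeholder keyword ranking

print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
# ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that appear in both lists (`doc_a`, `doc_b`) rise above those found by only one retriever, which is the intuition behind combining embeddings with keyword matching.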

Podcast Generation Pipeline

PDF Upload → Document Intelligence (OCR) → Script Generation (GPT-4o)
    → SSML Markup → Azure Speech (dual voice) → Blob Storage
    → Web PubSub notification → Client playback
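The SSML markup step is where the dual-voice effect comes from: each turn of the script is wrapped in its own `<voice>` element. The sketch below is a minimal illustration of that step only; the speaker labels, the function name `script_to_ssml`, and the choice of `en-GB-SoniaNeural` and `en-GB-RyanNeural` are assumptions, not the voices the platform necessarily uses.

```python
from typing import List, Tuple
from xml.sax.saxutils import escape

# Assumed voice choices; any two Azure neural voices would work the same way.
HOST_VOICE = "en-GB-SoniaNeural"
GUEST_VOICE = "en-GB-RyanNeural"


def script_to_ssml(turns: List[Tuple[str, str]]) -> str:
    """Convert (speaker, text) turns into a single SSML document.

    Speakers labelled "host" use HOST_VOICE; everyone else uses GUEST_VOICE.
    """
    parts = [
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-GB">'
    ]
    for speaker, text in turns:
        voice = HOST_VOICE if speaker == "host" else GUEST_VOICE
        # Escape &, < and > so script text cannot break the XML.
        parts.append(f'<voice name="{voice}">{escape(text)}</voice>')
    parts.append("</speak>")
    return "".join(parts)


ssml = script_to_ssml([
    ("host", "Welcome. Today we unpack a fifty-page policy document."),
    ("guest", "Let's start with costs & benefits."),
])
print(ssml)
```

The resulting string is what would be handed to the speech service for synthesis; that call is left out here because it needs credentials and a network connection.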

Security & Enterprise Integration

Aspect	Implementation
Authentication	Microsoft Entra ID with OIDC/OAuth2 + PKCE
Authorisation	Role-based access via JWT claims, per-user data isolation
Network	Private endpoints for ALL Azure services, VNet integration
Identity	User-assigned managed identities for service-to-service auth
Secrets	Azure Key Vault with soft-delete and purge protection
Local Auth	Disabled on all cognitive services (Entra-only)
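The authorisation row can be illustrated with a small check over already-validated token claims. This sketch assumes the token's signature, issuer, and audience have been verified upstream, and it relies on two standard Entra ID claims: `roles` (app roles assigned to the user) and `oid` (the user's object ID). The role name `Workbench.User` and the function names are hypothetical.

```python
from typing import Any, Dict


def has_role(claims: Dict[str, Any], required_role: str) -> bool:
    """True if the validated token carries the required app role."""
    return required_role in claims.get("roles", [])


def can_read(
    claims: Dict[str, Any],
    document_owner_oid: str,
    required_role: str = "Workbench.User",  # hypothetical role name
) -> bool:
    """Role check plus per-user isolation: users only read their own data."""
    return has_role(claims, required_role) and claims.get("oid") == document_owner_oid


claims = {"oid": "user-1", "roles": ["Workbench.User"]}  # placeholder claims
print(can_read(claims, "user-1"))  # True
print(can_read(claims, "user-2"))  # False: another user's data
```

Using `oid` as the isolation key, for example as a storage partition key, ties stored data to an identifier that does not change when a user's name or email does.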

Multi-Region Architecture

Region	Purpose
UK South (Primary)	All web apps, Cosmos DB, AI Search, Storage, monitoring
Sweden Central	AI Foundry agents (Bing search), Sora-2 video generation, gpt-image-1

VNet peering connects Sweden Central to UK South for private connectivity - unlocking AI capabilities not yet available in UK regions.

Tech Stack

Layer	Technologies
Frontend	React 19, TypeScript, Vite, Zustand, Tailwind, Radix UI
Backend	Python 3.11-3.13, FastAPI, Flask, LangGraph, LangChain, DuckDB
AI/ML	Azure OpenAI (GPT-4o, text-embedding-ada-002), AI Foundry, AI Search, Speech Services, Document Intelligence, Sora-2, gpt-image-1
Data	Cosmos DB (MongoDB API), Blob Storage, Table Storage, Queue Storage
Geospatial	MapTiler TileServer GL, MBTiles, Leaflet
Infrastructure	Terraform (15 modules), Azure Pipelines, Docker, App Service, Functions
Security	Entra ID, OIDC/OAuth2, JWT, Managed Identities, Private Endpoints, Key Vault

The Results

Platform Delivered

Metric	Achievement
Backend Services	7 APIs + 1 Function App
AI Agents	2 (Navi + TReSP)
AI Models Deployed	4 (GPT-4o, ada-002, gpt-image-1, Sora-2)
Azure Services Integrated	20+
Terraform Modules	15 reusable modules
Environments	3 (Dev, QA, Prod)
Regions	2 (UK South, Sweden Central)
Private Endpoints	15+
Weekly Active Users	~20 (alpha)
Daily Chat Volume	~35 chats/day

How Work Is Changing

Policy Questions: Hours → Seconds
Previously, finding the right policy meant searching through multiple systems. Now employees ask Navi and get immediate, accurate answers with source citations.

Content Consumption: Reading → Listening
Dense technical documents become engaging podcasts employees can listen to during their commute. Dual-voice narration makes complex content accessible.

Grid Data: Request & Wait → Self-Service
Business users who needed EV demand data had to raise requests and wait. Now they ask TReSP and get visualisations immediately on interactive maps.

New Starter Experience: Lost → Guided
First day at NESO? Navi knows your role and provides personalised prompts based on your position.

Value by Stakeholder

For NESO Leadership

Established AI platform capability from zero - foundation for AI-first operations
Zero-trust security model from day one (private endpoints, managed identity, no local auth)
Multi-region deployment unlocking cutting-edge AI capabilities (Sora-2, gpt-image-1)

For Engineering

15 reusable Terraform modules for rapid environment provisioning
Modern tech stack (React 19, Python 3.12, LangGraph) that attracts talent
Patterns for multi-agent AI development replicable across future use cases

For Every Employee

Instant answers to policy questions with source citations
Complex documents made accessible through podcasts
Self-service data visualisation without technical skills

Lessons Learned

What Worked Well

LangGraph for Agentic Workflows
Flexible state management and clear debugging. The directed graph model made it easy to reason about agent capabilities and add new ones.

Hybrid RAG (Vector + Semantic + BM25)
Significantly better retrieval than pure vector search. Configurable relevance thresholds per domain enabled fine-tuning for different document types.

Develop Local, Deploy Remote
Building against ClearRoute Azure first, then porting to NESO, enabled rapid iteration. NESO's private endpoint requirements would have slowed feature development significantly.

Multi-Region AI Foundry
Unlocked Sora-2 and Bing agents not available in UK South. VNet peering maintains private connectivity across regions.

Private Endpoints Everywhere
Enterprise security from day one. Zero-trust architecture with managed identities eliminated credential management overhead.

Challenges Overcome

Aggressive Timeline
The October 6th demo required trade-offs. The team subsequently invested in hardening - adding dedicated QA, implementing PromptFoo for LLM testing, and building proper integration tests.

Late Responsible AI Involvement
Three months in, the Responsible AI lead surfaced compliance requirements that should have been considered earlier. Lesson: engage risk and compliance stakeholders from day one.

Emergent Requirements
Requirements arrived as "Can you do podcasts? Make it happen." Stakeholders didn't know what they wanted until they saw it. We adopted iterative demonstration rather than extensive upfront specification.

Replicable Patterns

Multi-agent routing architecture with LangGraph
Enterprise RAG pipeline on Azure AI Search (hybrid retrieval)
Real-time notification pattern for async AI tasks (Web PubSub)
Secure media gateway with Microsoft Entra ID integration
Terraform modules for Azure AI services stack
Multi-region AI deployment pattern with VNet peering

Future State

The AI Workbench is positioned for organisation-wide rollout. Planned capabilities:

Holiday Booking: Ask Navi to check your balance and book time off, triggering approval workflows automatically
Admin Portal: Self-service configuration for data sources and user group permissions
Runtime Quality Monitoring: Real-time evaluation of agent responses beyond user feedback
Expanded Agent Ecosystem: New agents for other business domains following established patterns

Navi is just the beginning. The platform establishes NESO's foundation for AI-first operations - where AI assistants aren't a novelty but an integral part of how every employee works. The architecture and patterns are ready to scale across 4,000-5,000 employees, transforming NESO into the AI-enabled organisation it set out to become.

Internal Learning

This engagement demonstrates that:

Agentic AI is production-ready - LangGraph provides the orchestration layer needed for complex multi-agent systems in enterprise environments
Hybrid RAG outperforms pure vector - combining embeddings, BM25, and semantic reranking delivers significantly better retrieval
Multi-region unlocks capabilities - AI features ship to different regions at different times; architecture should accommodate this
Security doesn't slow you down - private endpoints and managed identities from day one actually simplified the architecture
Small teams ship fast - 5 core engineers delivered a working platform in 14 weeks; scale once patterns are established