Energy Infrastructure

Building NESO's AI-First Future: How the AI Workbench is Transforming the Way Teams Work

We built NESO's AI Workbench platform from scratch in 6 months, creating the Navi and TReSP agents with podcast, image, and video generation capabilities. Now in alpha, the platform is designed to change how NESO's 4,000+ employees work and lays the foundation for NESO becoming an AI-first organisation.

Published on: July 1, 2025 | Last Updated: December 16, 2025 | 10 min read

PISR: Problem, Impact, Solution, Result

  • Problem: NESO, Britain's National Energy System Operator managing the electricity grid for 60+ million people, wanted to become an AI-first organisation but had no AI platform capabilities. Teams navigated thousands of policy documents, regional energy pathways, and EV demand forecasts daily. Traditional search couldn't handle nuanced queries like "What are the EV charging infrastructure requirements for the North West region?"

  • Business Impact: Analysts spent hours manually searching PDFs and spreadsheets to answer stakeholder questions. Data was locked away, accessible only to technical specialists. New starters had no intelligent onboarding support. The energy transition depends on rapid, evidence-based decision-making - slow information access directly impacts grid planning and policy development.

  • Our Solution: ClearRoute designed and built an agentic AI platform - a multi-agent system where specialised AI agents collaborate to answer complex queries. Navi handles policy questions, news, and content generation. TReSP specialises in structured data analysis and visualisation. The platform generates podcasts from documents, creates images and videos, and renders interactive heat maps on geospatial tiles.

  • Tangible Result: We delivered 7 backend services and 2 AI agents in 6 months, integrating 20+ Azure services across 2 regions. The platform now serves users in alpha with instant policy answers, AI-generated podcasts, and self-service data visualisation - establishing NESO's foundation for AI-first operations.


The Challenge

The AI-First Ambition

NESO recognised that AI would fundamentally change how organisations operate. As Britain's energy system operator - responsible for keeping the lights on - they needed to be at the forefront of this transformation. The ambition: become an AI-first organisation where employees naturally turn to AI assistants to work more effectively.

But they were starting from zero. No AI platform. No ML capabilities. No unified way to access organisational knowledge.

How Work Happened Before

Challenge | Business Cost
Manual document search | Hours per query, inconsistent answers
Data locked in spreadsheets | Analysts as bottleneck for insights
No self-service capability | Stakeholders waiting for reports
Knowledge silos | Critical information hard to surface
Accessibility gap | Complex data inaccessible to non-technical users

Traditional search tools couldn't handle the nuanced queries NESO teams needed: "What are the EV charging infrastructure requirements for the North West region under the leading pathway scenario?" or "Summarise recent Ofgem announcements about grid connection reform."


Solution Overview

Meet Navi: Your Intelligent Work Companion

Navi (the Navigator Agent) is the centrepiece of the AI Workbench - an intelligent assistant that fundamentally changes how NESO employees work.

It Knows Who You Are

When you log into the Workbench, Navi already knows your role from Microsoft Graph. A new starter sees different prompts than a policy analyst. Your experience is personalised from day one.

It Answers Your Questions Instantly

Ask Navi "What are the policies if I want to take my laptop to France?" and it searches across all indexed policy documents using hybrid RAG (vector embeddings + BM25 keyword matching + semantic reranking), finds the relevant information, and gives you a clear answer with source citations.

It Keeps You Informed

Every morning, Navi surfaces the latest energy sector news, gathered from Ofgem, Gov.uk, and industry sources via Bing Custom Search. You start your day knowing what's happening in your industry.

It Makes Content Accessible

Got a 50-page technical document you need to understand but don't have time to read? Upload it to Navi and it generates an engaging podcast - not just narration, but a proper dual-voice conversation that makes dense content digestible. Architecture diagrams? Navi uses Document Intelligence (OCR) and GPT-4o to understand and explain them.

It Creates Visual Content

Need an image for a presentation? Describe it and Navi generates it via gpt-image-1. Need a video summary? Sora-2 creates 12-second clips from text prompts. Per-user quotas (10 videos/day, 500/month) ensure fair usage.
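
The quota check described above can be sketched as follows. This is a minimal in-memory illustration, not the platform's implementation: the class name `VideoQuota`, the method `try_consume`, and the use of in-process counters are all assumptions. A real service would persist the counters in a shared store. Only the two limits come from the text.

```python
from collections import defaultdict
from datetime import date

DAILY_LIMIT = 10     # videos per user per day (from the text)
MONTHLY_LIMIT = 500  # videos per user per month (from the text)


class VideoQuota:
    """Hypothetical in-memory quota tracker; counters are lost on restart."""

    def __init__(self, daily_limit: int = DAILY_LIMIT, monthly_limit: int = MONTHLY_LIMIT):
        self.daily_limit = daily_limit
        self.monthly_limit = monthly_limit
        self._daily = defaultdict(int)    # (user_id, date) -> requests that day
        self._monthly = defaultdict(int)  # (user_id, year, month) -> requests that month

    def try_consume(self, user_id: str, today: date) -> bool:
        """Record one video request and return True, or return False if a limit is reached."""
        day_key = (user_id, today)
        month_key = (user_id, today.year, today.month)
        if self._daily[day_key] >= self.daily_limit:
            return False
        if self._monthly[month_key] >= self.monthly_limit:
            return False
        self._daily[day_key] += 1
        self._monthly[month_key] += 1
        return True
```

A caller would check `try_consume(user_id, date.today())` before submitting a generation job and reject the request when it returns `False`.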

The TReSP Agent: Self-Service Data Analysis

For specialist grid data needs, the TReSP agent provides self-service analytics:

  • Natural Language to SQL: Ask "What are the EV demand statistics for Birmingham?" and TReSP generates SQL against DuckDB, executes on 37MB+ of EV demand projections, and explains the results
  • Heat Map Generation: Regional data visualised on interactive maps using MapTiler TileServer GL with 1.5GB of OpenStreetMap vector tiles for Great Britain
  • Geospatial Queries: GSP region lookup by coordinates for precise grid planning
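
To illustrate the coordinate-based region lookup in the last bullet, here is a minimal sketch using the standard ray-casting point-in-polygon test. The case study does not say how TReSP performs the lookup, so the technique is an assumption, as are the names `REGIONS`, `point_in_polygon`, and `lookup_region`. The rectangular outlines are invented placeholders, not real GSP boundaries.

```python
from typing import Optional

Point = tuple[float, float]  # (longitude, latitude)

# Hypothetical, heavily simplified outlines. Real region boundaries would be
# detailed polygons loaded from a geospatial dataset.
REGIONS: dict[str, list[Point]] = {
    "region_a": [(-3.0, 52.0), (-1.0, 52.0), (-1.0, 53.0), (-3.0, 53.0)],
    "region_b": [(-1.0, 52.0), (1.0, 52.0), (1.0, 53.0), (-1.0, 53.0)],
}


def point_in_polygon(point: Point, polygon: list[Point]) -> bool:
    """Ray casting: toggle 'inside' each time a ray running east crosses an edge."""
    x, y = point
    inside = False
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def lookup_region(lon: float, lat: float) -> Optional[str]:
    """Return the first region whose outline contains the point, else None."""
    for name, polygon in REGIONS.items():
        if point_in_polygon((lon, lat), polygon):
            return name
    return None
```

With real boundary data, a spatial index or a geospatial library would likely be preferable to a linear scan over every polygon.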

Multi-Agent Architecture

User Query
    ↓
┌─────────────────┐
│ Intent Classifier│  ← Determines query type & routes
└────────┬────────┘
         ↓
┌────────┴────────┐
↓                 ↓
┌──────────────┐  ┌──────────────┐
│  NAVI AGENT  │  │  TReSP AGENT │
│              │  │              │
│ • Policy RAG │  │ • EV Demand  │
│ • Bing News  │  │ • Grid Data  │
│ • Images     │  │ • SQL Gen    │
│ • Video      │  │ • Heatmaps   │
│ • Podcasts   │  │ • Maps       │
└──────────────┘  └──────────────┘

LangGraph orchestrates query classification and routing. Handoff between agents is based on query intent, so users don't need to know which agent handles their request.
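
The classify-then-route pattern can be sketched in plain Python. This is deliberately not LangGraph code. It swaps the real intent classifier for a keyword heuristic and the graph for a dictionary dispatch, purely to show the shape of the flow. The keyword list, function names, and agent stubs are all hypothetical.

```python
from typing import Callable

# Hypothetical keyword list standing in for the real intent classifier.
TRESP_KEYWORDS = ("ev demand", "heat map", "heatmap", "grid data", "gsp")


def classify_intent(query: str) -> str:
    """Return the name of the agent that should handle the query."""
    text = query.lower()
    if any(keyword in text for keyword in TRESP_KEYWORDS):
        return "tresp"
    return "navi"  # default: policy questions, news, and content generation


def navi_agent(query: str) -> str:
    """Stub for the Navi agent."""
    return f"[navi] handling: {query}"


def tresp_agent(query: str) -> str:
    """Stub for the TReSP agent."""
    return f"[tresp] handling: {query}"


AGENTS: dict[str, Callable[[str], str]] = {"navi": navi_agent, "tresp": tresp_agent}


def route(query: str) -> str:
    """Classify the query, then hand it to the matching agent."""
    return AGENTS[classify_intent(query)](query)
```

Because the caller only ever invokes `route`, adding a third agent means registering one more entry in `AGENTS` and extending the classifier, which mirrors the extensibility the case study attributes to its graph model.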

Platform Services

Service | Purpose
navi-agent | Policy Q&A via RAG, Bing news, image generation (gpt-image-1), video generation (Sora-2), real-time notifications
tresp-api | EV demand queries, regional energy pathways, SQL generation, heatmap visualisation
podcast-api | PDF upload, script generation, dual-voice audio synthesis
nesoai-blob-gateway | Secure media proxy for authenticated streaming
nesoai-tileserver | OpenStreetMap vector tiles for Great Britain (1.5GB MBTiles)
nesoai-analytics | Platform telemetry, user events, metrics
nesoai-infrastructure | Terraform IaC across dev/qa/prod

Engagement Approach

Phase 1 (Months 1-2): Foundation
Established Azure infrastructure, implemented vector database indexing, delivered initial Navi capability for document chat. Critically, we developed against ClearRoute's Azure environment first - enabling rapid iteration before tackling NESO's private endpoint complexity.

Phase 2 (Months 3-4): Expansion
Added TReSP agent for grid data, Bing search integration for news feeds, and podcast generation. Each capability expanded what Navi could help employees with.

Phase 3 (Months 5-6): Scale
Expanded team to 10+ engineers, onboarded dedicated QA, implemented PromptFoo testing framework, and prepared for broader rollout.

First Value: October 6th (Week 14)
Stakeholders could interact with policy documents via natural language and see grid data visualised as heat maps.


Technical Implementation

RAG Pipeline

  1. Documents chunked and embedded (text-embedding-ada-002, 1536 dimensions)
  2. Indexed in Azure AI Search with vector + semantic capabilities
  3. Query-time: hybrid retrieval (vector + BM25 + semantic reranking) with configurable thresholds
  4. Results passed to GPT-4o with domain-specific system prompts
  5. Source citations returned with every answer
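
For step 3, Azure AI Search documents Reciprocal Rank Fusion (RRF) as the method it uses to merge the vector and keyword result lists of a hybrid query, before any semantic reranking is applied. The sketch below shows that fusion step on its own. The function name, the document IDs in the usage example, and the constant `k = 60` (a commonly used default) are assumptions for illustration. The service performs this merge internally, so the platform would not need to implement it.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge several ranked lists of document IDs into one list, best first.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the two retrievers.
vector_hits = ["travel-policy", "it-security", "expenses"]
bm25_hits = ["travel-policy", "remote-work", "it-security"]
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Here `travel-policy` ranks first because both retrievers placed it at the top, and `it-security` comes second because it appears in both lists, ahead of documents found by only one retriever.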

Podcast Generation Pipeline

PDF Upload → Document Intelligence (OCR) → Script Generation (GPT-4o)
    → SSML Markup → Azure Speech (dual voice) → Blob Storage
    → Web PubSub notification → Client playback
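
The SSML Markup stage can be illustrated with a short sketch that turns a two-speaker script into a single SSML document, using one `voice` element per turn. The speaker labels, the function name `script_to_ssml`, and the two voice names are assumptions. The case study does not say which voices were used.

```python
from xml.sax.saxutils import escape

# Assumed voice names; the case study does not state which voices were used.
VOICES = {"host": "en-GB-SoniaNeural", "guest": "en-GB-RyanNeural"}


def script_to_ssml(turns: list[tuple[str, str]], lang: str = "en-GB") -> str:
    """Convert (speaker, text) turns into one SSML document with a voice per turn."""
    parts = [
        f'<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="{lang}">'
    ]
    for speaker, text in turns:
        voice = VOICES[speaker]
        # Escape &, < and > so script text cannot break the markup.
        parts.append(f'<voice name="{voice}">{escape(text)}</voice>')
    parts.append("</speak>")
    return "\n".join(parts)


ssml = script_to_ssml([
    ("host", "Welcome to today's episode."),
    ("guest", "Let's talk about supply & demand."),
])
```

The resulting string would then be sent to the speech service for synthesis. Long scripts may need to be split into several requests, which this sketch does not handle.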

Security & Enterprise Integration

Aspect | Implementation
Authentication | Microsoft Entra ID with OIDC/OAuth2 + PKCE
Authorisation | Role-based access via JWT claims, per-user data isolation
Network | Private endpoints for ALL Azure services, VNet integration
Identity | User-assigned managed identities for service-to-service auth
Secrets | Azure Key Vault with soft-delete and purge protection
Local Auth | Disabled on all cognitive services (Entra-only)
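
The Authorisation row combines two checks, a role held by the caller and ownership of the data being requested. A minimal sketch of that decision is below. It assumes the token has already been validated (signature, issuer, audience, expiry) by a JWT library and decoded into a dictionary. Entra ID emits app roles in the `roles` claim and the user's object ID in `oid`. The function name `can_access` and the role name `Workbench.User` are hypothetical.

```python
def can_access(claims: dict, resource_owner_id: str, required_role: str) -> bool:
    """Allow access only if the caller holds the role and owns the resource.

    `claims` must come from an already-validated token; this function does
    not verify signatures or expiry.
    """
    has_role = required_role in claims.get("roles", [])
    is_owner = claims.get("oid") == resource_owner_id
    return has_role and is_owner


# Hypothetical decoded claims for one user.
claims = {"oid": "user-123", "roles": ["Workbench.User"]}
allowed = can_access(claims, resource_owner_id="user-123", required_role="Workbench.User")
```

Keeping the ownership comparison next to the role check makes per-user isolation hard to forget when a new endpoint is added.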

Multi-Region Architecture

Region | Purpose
UK South (Primary) | All web apps, Cosmos DB, AI Search, Storage, monitoring
Sweden Central | AI Foundry agents (Bing search), Sora-2 video generation, gpt-image-1

VNet peering connects Sweden to UK South for private connectivity - unlocking AI capabilities not yet available in UK regions.

Tech Stack

Layer | Technologies
Frontend | React 19, TypeScript, Vite, Zustand, Tailwind, Radix UI
Backend | Python 3.11-3.13, FastAPI, Flask, LangGraph, LangChain, DuckDB
AI/ML | Azure OpenAI (GPT-4o, text-embedding-ada-002), AI Foundry, AI Search, Speech Services, Document Intelligence, Sora-2, gpt-image-1
Data | Cosmos DB (MongoDB API), Blob Storage, Table Storage, Queue Storage
Geospatial | MapTiler TileServer GL, MBTiles, Leaflet
Infrastructure | Terraform (15 modules), Azure Pipelines, Docker, App Service, Functions
Security | Entra ID, OIDC/OAuth2, JWT, Managed Identities, Private Endpoints, Key Vault

The Results

Platform Delivered

Metric | Achievement
Backend Services | 7 APIs + 1 Function App
AI Agents | 2 (Navi + TReSP)
AI Models Deployed | 4 (GPT-4o, ada-002, gpt-image-1, Sora-2)
Azure Services Integrated | 20+
Terraform Modules | 15 reusable modules
Environments | 3 (Dev, QA, Prod)
Regions | 2 (UK South, Sweden Central)
Private Endpoints | 15+
Weekly Active Users | ~20 (alpha)
Daily Chat Volume | ~35 chats/day

How Work Is Changing

Policy Questions: Hours → Seconds
Previously, finding the right policy meant searching through multiple systems. Now employees ask Navi and get immediate, accurate answers with source citations.

Content Consumption: Reading → Listening
Dense technical documents become engaging podcasts employees can listen to during their commute. Dual-voice narration makes complex content accessible.

Grid Data: Request & Wait → Self-Service
Business users who needed EV demand data had to raise requests and wait. Now they ask TReSP and get visualisations immediately on interactive maps.

New Starter Experience: Lost → Guided
First day at NESO? Navi knows your role and provides personalised prompts based on your position.

Value by Stakeholder

For NESO Leadership

  • Established AI platform capability from zero - foundation for AI-first operations
  • Zero-trust security model from day one (private endpoints, managed identity, no local auth)
  • Multi-region deployment unlocking cutting-edge AI capabilities (Sora-2, gpt-image-1)

For Engineering

  • 15 reusable Terraform modules for rapid environment provisioning
  • Modern tech stack (React 19, Python 3.12, LangGraph) that attracts talent
  • Patterns for multi-agent AI development replicable across future use cases

For Every Employee

  • Instant answers to policy questions with source citations
  • Complex documents made accessible through podcasts
  • Self-service data visualisation without technical skills

Lessons Learned

What Worked Well

LangGraph for Agentic Workflows
Flexible state management and clear debugging. The directed graph model made it easy to reason about agent capabilities and add new ones.

Hybrid RAG (Vector + Semantic + BM25)
Significantly better retrieval than pure vector search. Configurable relevance thresholds per domain enabled fine-tuning for different document types.

Develop Local, Deploy Remote
Building against ClearRoute Azure first, then porting to NESO, enabled rapid iteration. NESO's private endpoint requirements would have slowed feature development significantly.

Multi-Region AI Foundry
Unlocked Sora-2 and Bing agents not available in UK South. VNet peering maintains private connectivity across regions.

Private Endpoints Everywhere
Enterprise security from day one. Zero-trust architecture with managed identities eliminated credential management overhead.

Challenges Overcome

Aggressive Timeline
The October 6th demo required trade-offs. The team subsequently invested in hardening - adding dedicated QA, implementing PromptFoo for LLM testing, and building proper integration tests.

Late Responsible AI Involvement
Three months in, the Responsible AI lead surfaced compliance requirements that should have been considered earlier. Lesson: engage risk and compliance stakeholders from day one.

Emergent Requirements
Requirements arrived as "Can you do podcasts? Make it happen." Stakeholders didn't know what they wanted until they saw it. We adopted iterative demonstration rather than extensive upfront specification.

Replicable Patterns

  • Multi-agent routing architecture with LangGraph
  • Enterprise RAG pipeline on Azure AI Search (hybrid retrieval)
  • Real-time notification pattern for async AI tasks (Web PubSub)
  • Secure media gateway with Microsoft Entra ID integration
  • Terraform modules for Azure AI services stack
  • Multi-region AI deployment pattern with VNet peering

Future State

The AI Workbench is positioned for organisation-wide rollout. Planned capabilities:

  • Holiday Booking: Ask Navi to check your balance and book time off, triggering approval workflows automatically
  • Admin Portal: Self-service configuration for data sources and user group permissions
  • Runtime Quality Monitoring: Real-time evaluation of agent responses beyond user feedback
  • Expanded Agent Ecosystem: New agents for other business domains following established patterns

Navi is just the beginning. The platform establishes NESO's foundation for AI-first operations - where AI assistants aren't a novelty but an integral part of how every employee works. The architecture and patterns are ready to scale across 4,000-5,000 employees, transforming NESO into the AI-enabled organisation it set out to become.


Internal Learning

This engagement demonstrates that:

  1. Agentic AI is production-ready - LangGraph provides the orchestration layer needed for complex multi-agent systems in enterprise environments
  2. Hybrid RAG outperforms pure vector - combining embeddings, BM25, and semantic reranking delivers significantly better retrieval
  3. Multi-region unlocks capabilities - AI features ship to different regions at different times; architecture should accommodate this
  4. Security need not slow you down - private endpoints and managed identities from day one actually simplified the architecture, although iterating in a less restricted environment first kept feature development fast
  5. Small teams ship fast - 5 core engineers delivered a working platform in 14 weeks; scale once patterns are established