Graph-Support Chatbot Deployment — With Monitoring

This document describes the infrastructure deployment of the Graph-Support Chatbot, an off-the-shelf system that provides semantic search and conversational access to a generated knowledge graph. It is hosted on OVH Public Cloud using Managed Kubernetes Service (MKS).

Architecture

Four node pools on a single Managed Kubernetes Cluster, fronted by an Nginx ingress.

Pool	Flavour	Capacity	Workloads
General Purpose	b3-16	4 vCPUs - 16 GB RAM	Master chat agent, MCP server
Memory Optimized	r3-32	4 vCPUs - 32 GB RAM	Graph support search
Admin	b3-16	4 vCPUs - 16 GB RAM	Graph generation server, Playwright MCP
Monitoring	b2-7	2 vCPUs - 7 GB RAM	MLflow, Grafana, Prometheus
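
The pool table can be totalled to get a rough cluster footprint. The following is a minimal sketch, assuming one node per pool (the table does not state node counts); the `NodePool` type and `cluster_totals` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePool:
    name: str
    flavour: str
    vcpus: int
    ram_gb: int
    nodes: int = 1  # assumption: one node per pool (not stated in the table)


POOLS = [
    NodePool("General Purpose", "b3-16", 4, 16),
    NodePool("Memory Optimized", "r3-32", 4, 32),
    NodePool("Admin", "b3-16", 4, 16),
    NodePool("Monitoring", "b2-7", 2, 7),
]


def cluster_totals(pools):
    """Return (total vCPUs, total RAM in GB) across all pools."""
    vcpus = sum(p.vcpus * p.nodes for p in pools)
    ram_gb = sum(p.ram_gb * p.nodes for p in pools)
    return vcpus, ram_gb


print(cluster_totals(POOLS))  # (14, 71) under the one-node-per-pool assumption
```

Changing `nodes` on any pool rescales the totals, which is useful when comparing sizing options.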

Services

Master chat agent — public-facing conversational interface (/chat). Also exposes session and chat management endpoints (session creation, history retrieval, conversation lifecycle). Backed by a Postgres PVC (50 GB) used to persist conversations, session state, and chat history. Communicates with all other services via the MCP hub.

MCP server — internal hub that routes calls between the master agent, graph support search, and the external client API and tools.

Graph support search — internal semantic search over the knowledge graph. Reads directly from the S3 bucket (OVH Object Storage, outside the cluster).

Graph generation server — admin-only (/graph-gen). Analyses source code via Playwright MCP (headless Chromium) and writes the resulting knowledge graph directly to the S3 bucket.

Playwright MCP — separate pod on the Admin node. Runs headless Chromium for code crawling and page analysis (~300 MB RAM per browser instance).
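
The figure of roughly 300 MB per browser instance gives an upper bound on how many browsers can run at once. The sketch below is illustrative arithmetic only: the 16 GB node size comes from the Admin pool, while the 8 GB reserve for the Graph generation server and system overhead is an assumed value, not a measured one, and `max_browser_instances` is a hypothetical helper.

```python
def max_browser_instances(node_ram_mb: int, reserved_mb: int, per_browser_mb: int = 300) -> int:
    """Upper bound on concurrent headless Chromium instances that fit in RAM."""
    available = node_ram_mb - reserved_mb
    if available <= 0:
        return 0
    return available // per_browser_mb


# Admin pool node: 16 GB. The 8 GB reserve is an illustrative assumption.
print(max_browser_instances(node_ram_mb=16 * 1024, reserved_mb=8 * 1024))  # 27
```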

Monitoring stack — admin-only observability layer. Runs MLflow for AI/model monitoring (tracking agent performance, prompt evaluation, model metrics) and Grafana + Prometheus for general infrastructure and application monitoring (CPU, memory, request rates, error rates). Prometheus scrapes metrics from the Master chat agent, Graph generation server, and Graph support search. Backed by a PVC that persists metrics, dashboards, and MLflow experiment data.
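
The three scrape targets named above could be expressed as Prometheus scrape jobs. The sketch below builds a dictionary that mirrors the `scrape_configs` section of a Prometheus configuration file. The service addresses, ports, and the `/metrics` path are assumptions, since this document does not give them, and `build_scrape_configs` is a hypothetical helper.

```python
import json

# Hypothetical in-cluster service addresses; the document does not give names or ports.
SCRAPE_TARGETS = {
    "master-chat-agent": "master-chat-agent.default.svc:8000",
    "graph-generation-server": "graph-generation-server.default.svc:8000",
    "graph-support-search": "graph-support-search.default.svc:8000",
}


def build_scrape_configs(targets, metrics_path="/metrics"):
    """Build one Prometheus scrape job per service, sorted by job name."""
    return {
        "scrape_configs": [
            {
                "job_name": name,
                "metrics_path": metrics_path,
                "static_configs": [{"targets": [address]}],
            }
            for name, address in sorted(targets.items())
        ]
    }


config = build_scrape_configs(SCRAPE_TARGETS)
print(json.dumps(config, indent=2))
```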

graph TB

  USERS(["Users"])
  ADMIN(["Admin"])
  CLIENTAPI(["Client API<br>(External)"])
  S3[("S3 Bucket<br>(OVH Object Storage)")]

  subgraph OVH["OVH Public Cloud — MKS Cluster"]

    INGRESS["Ingress<br>Nginx / OVH Load Balancer"]

    subgraph POOL_GP["NodePool (4 vCPU - 16 GB RAM)"]
      CHAT["Master Chat Agent<br>(Exposed)"]
      MCP["MCP<br>(Internal)"]
      PG[("Postgres Db<br>(Internal - PVC)")]
    end

    subgraph POOL_MEM["NodePool (4 vCPU - 32 GB RAM)"]
      GSS["Graph Support Search<br>(Internal)"]
    end

    subgraph POOL_ADMIN["NodePool (4 vCPU - 16 GB RAM)"]
      GGS["Graph Generation<br>(Exposed - Admin Only)"]
      PWMCP["Playwright MCP"]
    end

    subgraph POOL_MON["NodePool (2 vCPU - 7 GB RAM)"]
      MLF["MLflow<br>(Exposed - Admin Only)"]
      GRAF["Grafana + Prometheus<br>(Exposed - Admin Only)"]
      MONPV[("Monitoring Db<br>(Internal - PVC)")]
    end

  end

  USERS -->|"HTTPS"| INGRESS
  ADMIN -->|"HTTPS · restricted"| INGRESS
  INGRESS -->|"/chat"| CHAT
  INGRESS -->|"/graph-gen 🔒"| GGS
  INGRESS -->|"/mlflow 🔒"| MLF
  INGRESS -->|"/grafana 🔒"| GRAF

  CHAT <-->|"orchestration"| MCP
  CHAT --- PG

  MCP <-->|"query"| GSS
  MCP <-->|"external"| CLIENTAPI

  GSS -->|"read"| S3

  GGS <-->|"browser jobs"| PWMCP
  GGS -->|"graph write"| S3

  GRAF -.->|"scrape metrics"| CHAT
  GRAF -.->|"scrape metrics"| GSS
  GRAF -.->|"scrape metrics"| GGS
  MLF -.->|"AI traces"| CHAT
  MLF -.->|"AI traces"| GGS
  MLF -.->|"AI traces"| GSS
  GRAF --- MONPV
  MLF --- MONPV

  classDef exposed   fill:#e6f1fb,stroke:#185fa5,color:#0c447c
  classDef internal  fill:#e1f5ee,stroke:#0f6e56,color:#085041
  classDef storage   fill:#faeeda,stroke:#854f0b,color:#633806
  classDef ingress   fill:#eeedfe,stroke:#534ab7,color:#3c3489
  classDef ext       fill:#f1efe8,stroke:#5f5e5a,color:#444441
  classDef browser   fill:#faece7,stroke:#993c1d,color:#712b13

  class CHAT,GGS,MLF,GRAF exposed
  class MCP,GSS internal
  class PG,S3,MONPV storage
  class INGRESS ingress
  class USERS,ADMIN,CLIENTAPI ext
  class PWMCP browser
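
The ingress routes in the diagram amount to a small path-to-audience map: `/chat` is public, and the three routes marked with a lock are restricted to admins. The sketch below models that map in Python. It is not the ingress configuration itself: the prefix-matching rule and the default-deny behaviour for unlisted paths are assumptions, and how the restriction is enforced (for example by IP allow-list or authentication) is not specified here.

```python
# Path prefixes from the diagram; "restricted" marks the routes shown with a lock.
ROUTES = {
    "/chat": "public",
    "/graph-gen": "restricted",
    "/mlflow": "restricted",
    "/grafana": "restricted",
}


def access_level(path: str) -> str:
    """Return the access level for a request path; unknown paths are denied."""
    for prefix, level in ROUTES.items():
        # Match the prefix exactly or as a leading path segment.
        if path == prefix or path.startswith(prefix + "/"):
            return level
    return "denied"  # assumption: default-deny for unlisted paths


print(access_level("/chat/sessions"))  # public
print(access_level("/grafana"))        # restricted
print(access_level("/chatter"))        # denied
```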

Data flows

User query → Ingress → Master agent → MCP → Graph support search → S3 bucket
Graph generation → Ingress (admin) → Graph generation server ↔ Playwright MCP (browser jobs), then Graph generation server → S3 bucket (direct write)
External integration → Client API ↔ MCP hub
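
These flows can be checked against the links in the architecture diagram. The sketch below encodes the diagram's request-path edges as an adjacency map (monitoring links are omitted) and tests whether each hop in a flow is an actual link. The `EDGES` map and `is_valid_flow` helper are hypothetical names used only for illustration.

```python
# Directed edges taken from the architecture diagram; bidirectional links are listed both ways.
EDGES = {
    "Users": {"Ingress"},
    "Admin": {"Ingress"},
    "Ingress": {"Master agent", "Graph generation server", "MLflow", "Grafana"},
    "Master agent": {"MCP", "Postgres"},
    "MCP": {"Master agent", "Graph support search", "Client API"},
    "Client API": {"MCP"},
    "Graph support search": {"MCP", "S3 bucket"},
    "Graph generation server": {"Playwright MCP", "S3 bucket"},
    "Playwright MCP": {"Graph generation server"},
}


def is_valid_flow(path):
    """True if every consecutive hop in `path` is an edge in EDGES."""
    return all(dst in EDGES.get(src, set()) for src, dst in zip(path, path[1:]))


user_query = ["Users", "Ingress", "Master agent", "MCP", "Graph support search", "S3 bucket"]
graph_write = ["Admin", "Ingress", "Graph generation server", "S3 bucket"]
print(is_valid_flow(user_query), is_valid_flow(graph_write))  # True True
print(is_valid_flow(["Playwright MCP", "S3 bucket"]))  # False: only the generation server writes to S3
```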

Storage

Volume	Type	Size	Consumer
Postgres	OVH Block (Cinder)	50 GB	Master chat agent
Monitoring	OVH Block (Cinder)	TBD	MLflow + Grafana/Prometheus
S3 Bucket	OVH Object Storage	—	Graph support search + Graph generation server

Access control

/chat — public via Ingress