Architecture — LVG Platform

Platform Overview

Hub-and-spoke architecture with a central management cluster orchestrating isolated customer environments

Cluster Management

GitOps

Monitoring

Identity

Infrastructure

Technology Stack

Every software component is open source. No vendor lock-in. Full control.

Container Orchestration

RKE2

FIPS-compliant Kubernetes distribution by Rancher. Powers every cluster with hardened defaults that align with the CIS Kubernetes Benchmark.

Kubernetes

Rancher

Multi-cluster management UI. Provides a single pane of glass for provisioning, monitoring, and managing all customer clusters.

Management

ArgoCD

Declarative GitOps continuous delivery. All deployments are driven by Git commits, ensuring auditability and rollback capability.

GitOps
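As a rough illustration of the Git-driven delivery described above, the sketch below builds an Argo CD `Application` manifest as a plain Python dict. The repository URL, path, application name, and the choice of automated sync with pruning and self-healing are assumptions made for the example, not settings confirmed by this page.

```python
import json
from typing import Any


def argocd_application(name: str, repo_url: str, path: str,
                       revision: str = "main") -> dict[str, Any]:
    """Build an Argo CD Application manifest as a plain dict.

    The namespace layout and sync options are illustrative assumptions.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            # Git is the source of truth: repo, revision, and path to manifests.
            "source": {"repoURL": repo_url, "targetRevision": revision, "path": path},
            # Deploy into the cluster Argo CD itself runs in.
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": name,
            },
            "syncPolicy": {
                # Apply new commits automatically and revert manual drift.
                "automated": {"prune": True, "selfHeal": True},
                "syncOptions": ["CreateNamespace=true"],
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical repository URL and path, used only for demonstration.
    app = argocd_application(
        "kyverno", "https://git.example.com/platform/stack.git", "apps/kyverno"
    )
    print(json.dumps(app, indent=2))
```

With a manifest like this, a rollback amounts to reverting the Git commit that `targetRevision` tracks and letting the automated sync reconcile the cluster.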

Networking & Security

Cilium + Hubble

eBPF-powered CNI for high-performance networking with deep traffic visibility. Enforces network policies for tenant isolation.

Networking

Kyverno

Kubernetes-native policy engine. Validates and enforces security policies, pod standards, and organizational rules at admission time.

Policy

Falco

Runtime security monitoring. Detects anomalous activity in containers and Kubernetes clusters using system call analysis.

Security

Keycloak

Enterprise identity and access management. Provides SSO and OIDC/SAML integration with customer identity providers.

Identity
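The page does not show the network policies themselves. One common baseline, sketched here as an assumption rather than as the platform's actual rule set, is a default-deny policy per namespace plus an explicit allow rule for traffic within that namespace. Both are standard Kubernetes `NetworkPolicy` objects, which Cilium enforces; the policy names are hypothetical.

```python
from typing import Any


def default_deny_policy(namespace: str) -> dict[str, Any]:
    """Deny all ingress and egress for every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector selects all pods in the namespace.
            "podSelector": {},
            # Listing both types with no rules blocks traffic in both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_same_namespace_policy(namespace: str) -> dict[str, Any]:
    """Allow ingress only from pods in the same namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-same-namespace", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A bare podSelector in `from` matches pods in this namespace only.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


if __name__ == "__main__":
    for policy in (default_deny_policy("apps"), allow_same_namespace_policy("apps")):
        print(policy["metadata"]["name"], policy["spec"]["policyTypes"])
```

Network policies are additive, so the allow rule opens only the paths it names while the default-deny policy continues to block everything else.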

Observability

Prometheus

Industry-standard metrics collection and alerting. Scrapes node, pod, and application metrics across all clusters.

Metrics

Grafana

Visualization platform with pre-built dashboards for cluster health, resource utilization, and application performance.

Dashboards

Loki

Lightweight log aggregation system. Efficiently indexes and queries logs from all pods and system components.

Logging

Alertmanager

Alert routing, deduplication, and notification. Routes critical alerts to PagerDuty, Slack, and on-call engineers.

Alerting
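To make the routing and deduplication idea concrete, here is a toy model in Python. It is not Alertmanager's configuration format or its grouping algorithm; the `Alert` fields, the receiver names, and the rule that non-critical alerts go to Slack only are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """A simplified alert; real alerts carry arbitrary label sets."""
    name: str
    cluster: str
    severity: str


def deduplicate(alerts: list[Alert]) -> list[Alert]:
    """Drop repeated alerts with identical fields, keeping first-seen order."""
    seen: set[Alert] = set()
    unique: list[Alert] = []
    for alert in alerts:
        if alert not in seen:
            seen.add(alert)
            unique.append(alert)
    return unique


def route(alert: Alert) -> list[str]:
    """Pick receivers by severity: critical alerts also page on-call."""
    if alert.severity == "critical":
        return ["pagerduty", "slack"]
    return ["slack"]


if __name__ == "__main__":
    incoming = [
        Alert("NodeDown", "customer-a", "critical"),
        Alert("NodeDown", "customer-a", "critical"),  # duplicate firing
        Alert("DiskFilling", "customer-b", "warning"),
    ]
    for alert in deduplicate(incoming):
        print(alert.name, alert.cluster, "->", route(alert))
```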

Infrastructure & Operations

Terraform

Infrastructure as Code for reproducible, version-controlled provisioning of servers, networks, and storage on Hetzner Cloud.

IaC

Velero

Kubernetes-native backup and disaster recovery. Daily cluster backups with 30-day retention, stored on a Hetzner Storage Box.

Backup

cert-manager

Automated TLS certificate management. Issues and renews certificates via Let's Encrypt for all customer domains.

Certificates

Hetzner Cloud

German cloud provider with excellent price-performance. GDPR-compliant data centers ensure data sovereignty for EU customers.

Infrastructure
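The daily backup with 30-day retention mentioned for Velero can be expressed as a Velero `Schedule` resource. The sketch below builds one as a Python dict and derives the TTL from the retention period (30 days is 720 hours). The schedule name, the 02:00 run time, and backing up all namespaces are assumptions, not details given on this page.

```python
from typing import Any


def velero_schedule(name: str, retention_days: int = 30,
                    cron: str = "0 2 * * *") -> dict[str, Any]:
    """Build a Velero Schedule manifest for a recurring cluster backup."""
    # Velero expects the TTL as a Go-style duration string, e.g. "720h0m0s".
    ttl_hours = retention_days * 24
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            # Cron expression: once a day at 02:00 (assumed time).
            "schedule": cron,
            "template": {
                "includedNamespaces": ["*"],
                # Backups older than the TTL become eligible for deletion.
                "ttl": f"{ttl_hours}h0m0s",
            },
        },
    }


if __name__ == "__main__":
    schedule = velero_schedule("daily-cluster-backup")
    print(schedule["spec"]["schedule"], schedule["spec"]["template"]["ttl"])
```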

Single Customer Architecture

Each customer receives a fully isolated Kubernetes environment with dedicated infrastructure (hard multi-tenancy)

Isolation Guarantees

1. Compute Isolation

Dedicated servers per customer. No shared worker nodes, no noisy-neighbor effects. Full CPU and memory allocation.

2. Network Isolation

Cilium enforces strict network policies. Each cluster has its own network space. No cross-customer traffic is possible.

3. Data Isolation

Separate persistent volumes and backup storage. Customer data is never commingled. Encrypted at rest and in transit.

4. Access Isolation

Per-customer RBAC and SSO. Customers authenticate through their own identity provider via Keycloak OIDC/SAML.
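The compute isolation guarantee above can be audited mechanically. As a minimal sketch, assuming an inventory of (server, customer) assignments is available from some source of truth such as Terraform state or a CMDB, the function below reports any server assigned to more than one customer. The inventory format and all names are hypothetical.

```python
from collections import defaultdict


def find_shared_servers(assignments: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Return servers that are assigned to more than one customer.

    `assignments` is a list of (server_name, customer_id) pairs.
    An empty result means every server is dedicated to a single customer.
    """
    customers_by_server: dict[str, set[str]] = defaultdict(set)
    for server, customer in assignments:
        customers_by_server[server].add(customer)
    return {
        server: customers
        for server, customers in customers_by_server.items()
        if len(customers) > 1
    }


if __name__ == "__main__":
    inventory = [
        ("worker-a-1", "customer-a"),
        ("worker-a-2", "customer-a"),
        ("worker-b-1", "customer-b"),
    ]
    violations = find_shared_servers(inventory)
    print("isolation violations:", violations)
```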

Customer Lifecycle

From contract signing to production workloads in 3-5 business days

1. Infrastructure Provisioning

Terraform provisions servers, networking, load balancers, and storage on Hetzner Cloud. Fully automated, 30-60 minutes.

2. Kubernetes Bootstrap

RKE2 initializes the cluster with an HA control plane (3 etcd members). Rancher registers the cluster for centralized management. 15-30 min.

3. Platform Stack Deployment

ArgoCD deploys the full platform stack: Cilium, Kyverno, Falco, Prometheus, Grafana, Loki, cert-manager, and Velero. 15-30 min.

4. Security & Identity Configuration

RBAC roles, SSO integration with customer IdP, network policies, and pod security standards are configured and validated.

5. Monitoring & Backup Setup

Grafana dashboards, alert rules, and Velero backup schedules (daily, 30-day retention) are configured. Customer gets read-only dashboard access.

6. Validation & Handover

Full connectivity check, backup restore test, monitoring verification. Customer receives access credentials and a training session.
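Only three of the six steps above state a duration. Adding up those ranges shows that the automated portion takes roughly one to two hours, so most of the 3-5 business days is presumably spent on the remaining steps, which involve configuration, validation, and coordination with the customer. The short calculation below uses only the figures quoted in the steps.

```python
# (min_minutes, max_minutes) for the steps that state a duration.
AUTOMATED_STEPS: dict[str, tuple[int, int]] = {
    "Infrastructure Provisioning": (30, 60),
    "Kubernetes Bootstrap": (15, 30),
    "Platform Stack Deployment": (15, 30),
}


def automated_window(steps: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum the lower and upper bounds of all timed steps, in minutes."""
    total_min = sum(low for low, _ in steps.values())
    total_max = sum(high for _, high in steps.values())
    return total_min, total_max


if __name__ == "__main__":
    low, high = automated_window(AUTOMATED_STEPS)
    print(f"Automated steps take {low}-{high} minutes in total")
```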

Ongoing Operations

Continuous monitoring, automated updates, and proactive incident management

24/7 Monitoring

Prometheus, Loki, and Falco continuously monitor infrastructure, applications, and security. Alerts route to PagerDuty for immediate response.

Zero-Downtime Updates

Rolling Kubernetes upgrades, automated OS patches, and GitOps-driven component updates. No maintenance windows required.

Disaster Recovery

Daily Velero backups, 6-hourly etcd snapshots. Full cluster restore in under 4 hours. Quarterly DR testing for validation.

SLA Guarantees

Up to 99.95% uptime for Enterprise tier. 15-minute P1 response time. Financial credits for SLA breaches.
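To put the figures above in concrete terms: 99.95% uptime leaves roughly 21.6 minutes of allowed downtime in a 30-day month, or about 4.4 hours in a 365-day year, and 6-hourly etcd snapshots mean four snapshots a day with up to 6 hours of cluster state at risk between them. The sketch below reproduces that arithmetic. The 30-day and 365-day measurement periods are assumptions, since the page does not say how the SLA period is defined.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int) -> float:
    """Downtime budget in minutes for a given uptime target and period."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100.0 - uptime_percent) / 100.0


def snapshots_per_day(interval_hours: int) -> int:
    """Number of snapshots taken per day at a fixed interval."""
    return 24 // interval_hours


if __name__ == "__main__":
    monthly = allowed_downtime_minutes(99.95, 30)
    yearly = allowed_downtime_minutes(99.95, 365)
    print(f"Monthly budget: {monthly:.1f} min")
    print(f"Yearly budget:  {yearly / 60:.2f} h")
    print(f"etcd snapshots per day: {snapshots_per_day(6)}")
```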
