Architecture & Tech Stack
A complete overview of the LVG Platform architecture, technology choices, and how the system operates for each customer.
Platform Overview
Hub-and-spoke architecture with a central management cluster orchestrating isolated customer environments
Technology Stack
Every component is open-source. No vendor lock-in. Full control.
Container Orchestration
RKE2
Kubernetes distribution by Rancher with FIPS 140-2 support. Powers every cluster with hardened defaults aligned with the CIS Kubernetes Benchmark.
Kubernetes
Rancher
Multi-cluster management UI. Provides a single pane of glass for provisioning, monitoring, and managing all customer clusters.
Management
ArgoCD
Declarative GitOps continuous delivery. All deployments are driven by Git commits, ensuring auditability and rollback capability.
GitOps
Networking & Security
Cilium + Hubble
eBPF-powered CNI for high-performance networking with deep traffic visibility. Enforces network policies for tenant isolation.
Networking
Kyverno
Kubernetes-native policy engine. Validates and enforces security policies, pod standards, and organizational rules at admission time.
Policy
Falco
Runtime security monitoring. Detects anomalous activity in containers and Kubernetes clusters using system call analysis.
Security
Keycloak
Enterprise identity and access management. Provides SSO, OIDC/SAML integration with customer identity providers.
Identity
Observability
Prometheus
Industry-standard metrics collection and alerting. Scrapes node, pod, and application metrics across all clusters.
Metrics
Grafana
Visualization platform with pre-built dashboards for cluster health, resource utilization, and application performance.
Dashboards
Loki
Lightweight log aggregation system. Efficiently indexes and queries logs from all pods and system components.
Logging
Alertmanager
Alert routing, deduplication, and notification. Routes critical alerts to PagerDuty, Slack, and on-call engineers.
Alerting
Infrastructure & Operations
Terraform
Infrastructure as Code for reproducible, version-controlled provisioning of servers, networks, and storage on Hetzner Cloud.
IaC
Velero
Kubernetes-native backup and disaster recovery. Daily cluster backups are written to a Hetzner Storage Box and retained for 30 days.
Backup
cert-manager
Automated TLS certificate management. Issues and renews certificates via Let's Encrypt for all customer domains.
Certificates
Hetzner Cloud
German cloud provider with excellent price-performance. GDPR-compliant data centers ensure data sovereignty for EU customers.
Infrastructure
Single Customer Architecture
Each customer receives a fully isolated Kubernetes environment with dedicated infrastructure (hard multi-tenancy)
Isolation Guarantees
Compute Isolation
Dedicated servers per customer. No shared worker nodes, no noisy-neighbor effects. Full CPU and memory allocation.
Network Isolation
Cilium enforces strict network policies. Each cluster has its own network space. No cross-customer traffic is possible.
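As a rough illustration of what a strict in-cluster policy can look like, the sketch below builds a standard Kubernetes NetworkPolicy (which Cilium enforces) that denies all inbound traffic to pods in a namespace until more specific allow rules are added. It is not the platform's actual policy set. The function name and the "customer-apps" namespace are hypothetical.

```python
import json


def default_deny_ingress(namespace: str) -> dict:
    """Build a NetworkPolicy that denies all ingress to pods in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            # Declaring Ingress without any ingress rules blocks inbound traffic.
            "policyTypes": ["Ingress"],
        },
    }


if __name__ == "__main__":
    # "customer-apps" is a hypothetical namespace used only for illustration.
    print(json.dumps(default_deny_ingress("customer-apps"), indent=2))
```

The printed JSON can be applied like any other manifest. Allow rules for legitimate traffic would then be layered on top of this baseline.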
Data Isolation
Separate persistent volumes and backup storage. Customer data is never commingled and is encrypted at rest and in transit.
Access Isolation
Per-customer RBAC and SSO. Customers authenticate through their own identity provider via Keycloak OIDC/SAML.
Customer Lifecycle
From contract signing to production workloads in 3-5 business days
Infrastructure Provisioning
Terraform provisions servers, networking, load balancers, and storage on Hetzner Cloud. Fully automated; takes 30-60 minutes.
Kubernetes Bootstrap
RKE2 initializes the cluster with an HA control plane (3 etcd members). Rancher registers the cluster for centralized management. Takes 15-30 minutes.
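The choice of three etcd members follows from majority quorum: etcd stays writable only while more than half of its members are healthy. The short sketch below works through that arithmetic. The function names are illustrative and not part of any etcd or RKE2 API.

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with the given member count."""
    if members < 1:
        raise ValueError("an etcd cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a majority still remains."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"{n} members: quorum {quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

With three members the quorum is two, so the control plane survives the loss of one member. A fourth member would raise the quorum to three without tolerating any additional failures, which is why odd member counts are the usual choice.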
Platform Stack Deployment
ArgoCD deploys the full platform stack: Cilium, Kyverno, Falco, Prometheus, Grafana, Loki, cert-manager, and Velero. Takes 15-30 minutes.
Security & Identity Configuration
RBAC roles, SSO integration with customer IdP, network policies, and pod security standards are configured and validated.
Monitoring & Backup Setup
Grafana dashboards, alert rules, and Velero backup schedules (daily, 30-day retention) are configured. The customer receives read-only dashboard access.
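A daily schedule with 30-day retention maps onto a Velero Schedule resource, where retention is expressed as a time-to-live on each backup. The sketch below builds such a manifest as a Python dictionary. The schedule name, the 02:00 cron time, and the "velero" namespace are assumptions for illustration and do not come from the platform's actual configuration.

```python
import json


def velero_schedule(name: str, cron: str, retention_days: int) -> dict:
    """Build a Velero Schedule manifest with a TTL derived from retention days."""
    # Velero expresses retention as a duration string, e.g. "720h0m0s" for 30 days.
    ttl = f"{retention_days * 24}h0m0s"
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        # "velero" is the default install namespace; assumed here.
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            "schedule": cron,  # standard cron expression
            "template": {"ttl": ttl},  # backup settings applied to each run
        },
    }


if __name__ == "__main__":
    # Hypothetical name and time of day: run once a day at 02:00.
    manifest = velero_schedule("daily-cluster-backup", "0 2 * * *", 30)
    print(json.dumps(manifest, indent=2))
```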
Validation & Handover
Full connectivity check, backup restore test, and monitoring verification. The customer receives access credentials and a training session.
Ongoing Operations
Continuous monitoring, automated updates, and proactive incident management
24/7 Monitoring
Prometheus, Loki, and Falco continuously monitor infrastructure, applications, and security. Alerts route to PagerDuty for immediate response.
Zero-Downtime Updates
Rolling Kubernetes upgrades, automated OS patches, and GitOps-driven component updates. No maintenance windows required.
Disaster Recovery
Daily Velero backups, 6-hourly etcd snapshots. Full cluster restore in under 4 hours. Quarterly DR testing for validation.
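The stated intervals imply a worst-case data-loss window of roughly 24 hours for Velero backups and 6 hours for etcd state. One way to keep those windows honest is to check how old the most recent successful backup is. The sketch below shows such a freshness check. The 30-minute grace period and the function name are assumptions, not platform settings.

```python
from datetime import datetime, timedelta, timezone

# Backup intervals as stated in the text.
VELERO_INTERVAL = timedelta(hours=24)
ETCD_INTERVAL = timedelta(hours=6)


def is_stale(
    last_success: datetime,
    now: datetime,
    interval: timedelta,
    grace: timedelta = timedelta(minutes=30),  # hypothetical tolerance
) -> bool:
    """Return True when the newest backup is older than interval plus grace."""
    return now - last_success > interval + grace


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    last_etcd = datetime(2024, 1, 2, 5, 0, tzinfo=timezone.utc)  # 7 hours old
    last_velero = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # 24 hours old
    print("etcd snapshot stale:", is_stale(last_etcd, now, ETCD_INTERVAL))
    print("velero backup stale:", is_stale(last_velero, now, VELERO_INTERVAL))
```

In practice the timestamps would come from backup metadata or a metrics query, and a stale result would raise an alert.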
SLA Guarantees
Up to 99.95% uptime for Enterprise tier. 15-minute P1 response time. Financial credits for SLA breaches.
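To make the 99.95% figure concrete, an uptime percentage can be converted into a downtime budget. The sketch below does that conversion. The 30-day and 365-day measurement periods are assumptions, since the text does not state the window over which the SLA is measured.

```python
def downtime_budget_minutes(uptime_percent: float, period_days: int) -> float:
    """Minutes of downtime allowed in a period at the given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # Assumed measurement periods: a 30-day month and a 365-day year.
    print(f"per 30 days:  {downtime_budget_minutes(99.95, 30):.1f} min")
    print(f"per 365 days: {downtime_budget_minutes(99.95, 365):.1f} min")
```

At 99.95% this works out to about 21.6 minutes per 30-day month, or about 262.8 minutes (roughly 4.4 hours) per 365-day year.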