# Kubewekend - The K8s Playground

## Kubewekend CLI

A unified Bash CLI to set up and operate Kind / K3s Kubernetes clusters for workshops, demos, and local experiments.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Getting Started](#getting-started)
- [CLI Reference](#cli-reference)
- [Workflow Examples](#workflow-examples)
- [Available Utility Tags](#available-utility-tags)
- [Project Structure](#project-structure)
- [Configuration Files](#configuration-files)
- [Contributing](#contributing)
## Prerequisites

| Tool | Required | Purpose |
|------|----------|---------|
| `vagrant` | Yes\* | VM provisioning |
| `VirtualBox` | Yes\* | VM provider |
| `ansible` | Yes | Cluster orchestration |
| `kubectl` | Yes | Kubernetes CLI |
| `helm` | Yes | Helm chart management |
| `docker` | Optional | Required for Kind clusters |
| `kind` | Optional | Kind cluster binary (installed by playbook) |
| `jq` | Optional | JSON processing |

\* Vagrant + VirtualBox are only required for local VM workflows. For a remote VPS, only `ansible` + SSH are needed.

Verify with:

```bash
./scripts/setup.sh env check
```

## Getting Started

```bash
# 1. Make the script executable
chmod +x ./scripts/setup.sh

# 2. Initialize .env from template
./scripts/setup.sh env init
# Edit .env with your SSH_USER and SSH_KEY

# 3. Check tools
./scripts/setup.sh env check

# 4. Follow a quickstart guide
./scripts/setup.sh quickstart kind-vagrant
./scripts/setup.sh quickstart k3s-vagrant
./scripts/setup.sh quickstart k3s-remote
```
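A minimal `.env` might look like this (illustrative values; the authoritative variable list is in `template.env`):

```bash
# .env (example values, adjust to your environment)
SSH_USER=vagrant
SSH_KEY=~/.ssh/id_rsa
```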

## CLI Reference

```bash
./scripts/setup.sh <command> [subcommand] [options]
./scripts/setup.sh help                        # global help
./scripts/setup.sh <command> help              # per-command help
```

### env — Environment Management

| Subcommand | Description |
|------------|-------------|
| `env check` | Verify all required/optional tools are installed |
| `env init` | Create `.env` from `template.env` |
| `env show` | Print current environment variables |

```bash
./scripts/setup.sh env check
./scripts/setup.sh env init
./scripts/setup.sh env show
```

### vagrant — VM Lifecycle

| Subcommand | Description |
|------------|-------------|
| `vagrant up [machines...]` | Provision VMs (default: `k8s-master-machine`) |
| `vagrant halt [machines...]` | Stop VMs |
| `vagrant destroy [machines...]` | Destroy VMs (with confirmation) |
| `vagrant status` | Show VM status |
| `vagrant ssh <machine>` | SSH into a VM |
| `vagrant reload [machines...]` | Reload VMs |

```bash
# Provision master
./scripts/setup.sh vagrant up k8s-master-machine

# Provision master + 2 workers
./scripts/setup.sh vagrant up k8s-master-machine k8s-worker-machine-1 k8s-worker-machine-2

# Provision workers by regex
./scripts/setup.sh vagrant up "/k8s-worker-machine-[1-2]/"

# SSH into master
./scripts/setup.sh vagrant ssh k8s-master-machine

# Halt all
./scripts/setup.sh vagrant halt
```

### inventory — Ansible Inventory

| Subcommand | Description |
|------------|-------------|
| `inventory generate` | Auto-generate inventory from running Vagrant VMs |
| `inventory show` | Display current inventory file |
| `inventory ping [group]` | Test SSH connectivity (defaults to `all`) |
| `inventory set-remote` | Interactive wizard to configure remote VPS inventory |

```bash
# Auto-generate from vagrant
./scripts/setup.sh inventory generate

# Test connectivity to all hosts
./scripts/setup.sh inventory ping

# Test only standalone masters
./scripts/setup.sh inventory ping standalone-masters

# Setup remote VPS inventory (interactive)
./scripts/setup.sh inventory set-remote
```
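For reference, a generated inventory for a single Vagrant master looks roughly like this (a sketch; the exact host names, IPs, and variables come from the generator):

```ini
[standalone-masters]
k8s-master-machine ansible_host=192.168.56.99 ansible_user=vagrant

[standalone-all:children]
standalone-masters
```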

### kind — Kind Cluster Operations

| Subcommand | Description |
|------------|-------------|
| `kind setup` | Install tools + create Kind cluster + configure networking |
| `kind destroy` | Remove Kind cluster and related components |
| `kind utils <tags...>` | Install K8s utilities by tag |

```bash
# Full setup
./scripts/setup.sh kind setup

# Setup targeting a specific host
./scripts/setup.sh kind setup --host k8s-master-machine

# Preview without executing
./scripts/setup.sh kind setup --dry-run

# Destroy
./scripts/setup.sh kind destroy

# Install utilities
./scripts/setup.sh kind utils certmanager dashboard
./scripts/setup.sh kind utils ingress_test apigateway_test
```

### k3s — K3s Cluster Operations

| Subcommand | Description |
|------------|-------------|
| `k3s setup` | Set up a standalone K3s cluster (1 master + N workers) |
| `k3s ha-setup` | Set up an HA K3s cluster (3+ masters + N workers) |
| `k3s destroy` | Uninstall K3s from all inventory nodes |
| `k3s utils <tags...>` | Install K8s utilities by tag |

```bash
# Standalone setup
./scripts/setup.sh k3s setup

# HA setup (requires inventory + master.yaml pre-configured)
./scripts/setup.sh k3s ha-setup

# Destroy
./scripts/setup.sh k3s destroy

# Utilities
./scripts/setup.sh k3s utils certmanager gitops dashboard
```

### network — VirtualBox NAT Network

| Subcommand | Description |
|------------|-------------|
| `network hookup [name] [cidr]` | Create NAT network and attach VMs (default: `KubewekendNet` `10.0.69.0/24`) |
| `network return-nat` | Revert all VMs back to default NAT |
| `network status` | List VirtualBox NAT networks |

```bash
./scripts/setup.sh network hookup
./scripts/setup.sh network hookup MyNet 10.0.100.0/24
./scripts/setup.sh network return-nat
./scripts/setup.sh network status
```
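These subcommands wrap `VBoxManage`; a rough manual equivalent, assuming a VM registered as `k8s-master-machine` (Vagrant's actual VirtualBox VM names usually carry a project prefix):

```bash
# Create the NAT network with the default name/CIDR shown above
VBoxManage natnetwork add --netname KubewekendNet --network "10.0.69.0/24" --enable --dhcp on

# Attach a VM's first NIC to that network (VM must be powered off)
VBoxManage modifyvm "k8s-master-machine" --nic1 natnetwork --nat-network1 KubewekendNet

# List NAT networks (roughly what `network status` reports)
VBoxManage natnetwork list
```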

### config — Cluster Configuration

| Subcommand | Description |
|------------|-------------|
| `config show` | Print `ansible/inventories/host_vars/master.yaml` |
| `config edit` | Open `master.yaml` in `$EDITOR` |
| `config worker-show` | Print `worker.yaml` |
| `config worker-edit` | Open `worker.yaml` in `$EDITOR` |

```bash
./scripts/setup.sh config show
./scripts/setup.sh config edit
```

### status — Project Dashboard

Shows Vagrant VMs, inventory groups, kubectl contexts, and Docker Kind containers.

```bash
./scripts/setup.sh status
```

### quickstart — Guided Workflows

| Subcommand | Description |
|------------|-------------|
| `quickstart kind-local` | Kind on localhost (no Vagrant) |
| `quickstart kind-vagrant` | Kind on Vagrant VMs |
| `quickstart k3s-vagrant` | K3s on Vagrant VMs |
| `quickstart k3s-remote` | K3s on a remote VPS |

```bash
./scripts/setup.sh quickstart kind-vagrant
./scripts/setup.sh quickstart k3s-remote
```

### Global Options (kind / k3s)

These options are available on `kind setup/destroy/utils` and `k3s setup/ha-setup/destroy/utils`:

| Option | Description |
|--------|-------------|
| `--host, -H <name>` | Ansible host target (default: `k8s-master-machine`) |
| `--dry-run` | Print the ansible command without executing |
| `--skip-tags <tags>` | Comma-separated ansible tags to skip |
| `--extra-vars <vars>` | Additional ansible extra-vars (`key=value`) |

```bash
# Dry run
./scripts/setup.sh kind setup --dry-run

# Target a different host
./scripts/setup.sh k3s setup --host my-vps-node

# Skip specific tasks
./scripts/setup.sh kind setup --skip-tags setup_cni

# Pass extra variables
./scripts/setup.sh kind setup --extra-vars "kindCluster_image=kindest/node:v1.30.13"
```
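With `--dry-run` the CLI prints the underlying `ansible-playbook` command instead of executing it. Conceptually, the options above map onto standard `ansible-playbook` flags, roughly like this (a sketch; the script's exact invocation may differ):

```bash
ansible-playbook ansible/kind-playbook.yaml \
  -i ansible/inventories/hosts \
  --limit k8s-master-machine \
  --skip-tags setup_cni \
  --extra-vars "kindCluster_image=kindest/node:v1.30.13"
```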

## Workflow Examples

### 1. Kind on Vagrant (VirtualBox)

```bash
# Provision VM
./scripts/setup.sh env init
./scripts/setup.sh vagrant up k8s-master-machine

# Generate inventory and verify
./scripts/setup.sh inventory generate
./scripts/setup.sh inventory ping

# Create Kind cluster
./scripts/setup.sh kind setup

# Add cert-manager + test ingress
./scripts/setup.sh kind utils certmanager ingress_test

# Teardown
./scripts/setup.sh kind destroy
./scripts/setup.sh vagrant destroy k8s-master-machine
```

### 2. K3s Standalone on Vagrant

```bash
# Provision master + worker
./scripts/setup.sh vagrant up k8s-master-machine k8s-worker-machine-1

# Generate inventory
./scripts/setup.sh inventory generate
./scripts/setup.sh inventory ping

# (Optional) Review / edit cluster config
./scripts/setup.sh config edit

# Setup K3s
./scripts/setup.sh k3s setup

# Get kubeconfig
ssh vagrant@192.168.56.99 'sudo cat /etc/rancher/k3s/k3s.yaml' > ~/.kube/config
# Replace 127.0.0.1 with 192.168.56.99 in the kubeconfig file
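# e.g. in place with sed (adjust the IP to your master):
sed -i 's/127.0.0.1/192.168.56.99/' ~/.kube/config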

# Install utilities
./scripts/setup.sh k3s utils certmanager gitops ingress_test

# Teardown
./scripts/setup.sh k3s destroy
./scripts/setup.sh vagrant destroy
```

### 3. K3s on Remote VPS

```bash
# Interactive inventory wizard
./scripts/setup.sh inventory set-remote
# Enter: VPS IP, SSH port, SSH user, key path, number of workers

# Verify connectivity
./scripts/setup.sh inventory ping

# Edit cluster config (set tlsSANs, loadBalancer IP pool for your network)
./scripts/setup.sh config edit

# Setup K3s
./scripts/setup.sh k3s setup

# Get kubeconfig
ssh root@<vps-ip> 'sudo cat /etc/rancher/k3s/k3s.yaml' > ~/.kube/config
# Replace 127.0.0.1 with your VPS IP
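# e.g. in place with sed (substitute your actual VPS IP):
sed -i 's/127.0.0.1/<vps-ip>/' ~/.kube/config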

# Install GitOps + cert-manager
./scripts/setup.sh k3s utils certmanager gitops

# Teardown
./scripts/setup.sh k3s destroy
```

### 4. K3s High-Availability (HA)

```bash
# 1. Edit inventory with HA groups
#    ansible/inventories/hosts needs:
#    [ha_master_init]    — exactly 1 bootstrap node
#    [ha_master_join]    — additional control-plane nodes
#    [ha_worker]         — agent nodes

# 2. Enable HA in config
./scripts/setup.sh config edit
# Set: k3sCluster.highAvailability.enable: true
# Set: k3sCluster.highAvailability.replicas: 3

# 3. Run HA setup
./scripts/setup.sh k3s ha-setup

# 4. Teardown
./scripts/setup.sh k3s destroy
```
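For reference, the HA groups in `ansible/inventories/hosts` might look like this (host names and IPs are illustrative):

```ini
[ha_master_init]
master-1 ansible_host=10.0.69.11

[ha_master_join]
master-2 ansible_host=10.0.69.12
master-3 ansible_host=10.0.69.13

[ha_worker]
worker-1 ansible_host=10.0.69.21
```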

### 5. Kind on Localhost (No Vagrant)

```bash
# Manually set inventory for localhost
# Note: unquoted EOF so that $USER expands to your real username
cat > ansible/inventories/hosts <<EOF
[standalone-masters]
localhost ansible_host=127.0.0.1 ansible_connection=local

[standalone-all:children]
standalone-masters

[all:vars]
ansible_user=$USER
EOF

# Setup Kind
./scripts/setup.sh kind setup --host localhost

# Verify
kubectl cluster-info --context kind-kubewekend

# Cleanup
./scripts/setup.sh kind destroy
```

### 6. LGTM Observability Stack + Testing Application

Deploy the full LGTM observability stack (Prometheus + Grafana + Loki + Tempo + Pyroscope), then run the bundled demo application to exercise traces, logs, metrics, and profiling together.

```bash
# 1. Spin up a cluster (Kind example)
./scripts/setup.sh vagrant up k8s-master-machine
./scripts/setup.sh inventory generate
./scripts/setup.sh kind setup

# 2. Install cert-manager first (required by kube-prometheus-stack CRDs)
./scripts/setup.sh kind utils certmanager

# 3. Deploy the full monitoring stack
#    Installs: kube-prometheus-stack, Alloy APM, Loki, Tempo, Pyroscope
./scripts/setup.sh kind utils monitoring

# 4. Build and deploy the LGTM demo application
docker build -t lgtm-testing-backend:latest examples/lgtm-testing/backend
docker build -t lgtm-testing-frontend:latest examples/lgtm-testing/frontend
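# With Kind, locally built images must also be loaded into the cluster nodes
# (cluster name assumed from the kind-kubewekend kubectl context):
kind load docker-image lgtm-testing-backend:latest --name kubewekend
kind load docker-image lgtm-testing-frontend:latest --name kubewekend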

kubectl apply -f examples/lgtm-testing/k8s/namespace.yaml
kubectl apply -f examples/lgtm-testing/k8s/postgres.yaml
kubectl apply -f examples/lgtm-testing/k8s/backend.yaml
kubectl apply -f examples/lgtm-testing/k8s/frontend.yaml
kubectl -n lgtm-testing wait --for=condition=ready pod \
  -l app.kubernetes.io/part-of=lgtm-testing --timeout=120s

# 5. Seed test data
kubectl -n lgtm-testing exec -it deploy/lgtm-testing-backend -- \
  curl -s -X POST http://localhost:8000/api/seed/

# 6. Generate traffic for each observability scenario
# Normal traces
kubectl -n lgtm-testing exec -it deploy/lgtm-testing-backend -- \
  curl -s http://localhost:8000/api/todos/?owner_id=1

# Auth failures → error spans in Tempo
kubectl -n lgtm-testing exec -it deploy/lgtm-testing-backend -- \
  curl -s -X POST http://localhost:8000/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username":"alice","password":"WRONG"}'

# CPU flamegraph → Pyroscope
kubectl -n lgtm-testing exec -it deploy/lgtm-testing-backend -- \
  curl -s "http://localhost:8000/api/bottleneck/cpu-intensive?iterations=500000"

# 7. Open Grafana and explore (correlate Traces ↔ Logs ↔ Profiles)
#    Grafana is exposed via Ingress at grafana.local (configure in master.yaml)
echo "Access Grafana at: http://grafana.local"
echo "Data sources: Prometheus, Loki, Tempo, Pyroscope"

See `examples/lgtm-testing/README.md` for the full test scenario guide and custom metric reference.


## Available Utility Tags

Used with `kind utils <tags...>` or `k3s utils <tags...>`:

| Tag | Description | Playbook |
|-----|-------------|----------|
| `ingress_test` | Deploy test nginx with ingress | `k8s-utilities-playbook.yaml` |
| `apigateway_test` | Deploy API Gateway test with weighted routing | `k8s-utilities-playbook.yaml` |
| `certmanager` | Install cert-manager (v1.19.2) | `k8s-utilities-playbook.yaml` |
| `dashboard` | Install K8s dashboard (kubernetes-dashboard / headlamp / rancher) | `k8s-utilities-playbook.yaml` |
| `storage` | Install Longhorn distributed block storage (v1.11.0) with optional iSCSI and NFS support | `k8s-utilities-playbook.yaml` |
| `secret_management` | Install Vault (v0.32.0) or OpenBao with auto-unseal, Vault Operator, and key persistence | `k8s-utilities-playbook.yaml` |
| `k8s_extensions` | Install Reflector, Reloader, External Secrets Operator | `k8s-utilities-playbook.yaml` |
| `gitops` | Install ArgoCD (v9.1.3) with Image Updater + Extensions, or Flux (v2.7.5) with Weave GitOps UI and Kargo | `k8s-utilities-playbook.yaml` |
| `security` | Install policy engine (Kyverno v3.7.1 or OPA Gatekeeper v3.22.0) and Dex identity provider (v0.24.0) | `k8s-utilities-playbook.yaml` |
| `idp` | Install Backstage Internal Developer Portal (v2.6.3) | `k8s-utilities-playbook.yaml` |
| `monitoring` | Install full LGTM stack: kube-prometheus-stack (v82.16.0), Alloy APM (v1.7.0), Loki (v9.5.1), Tempo (v1.26.7), Pyroscope (v1.19.2) | `k8s-utilities-playbook.yaml` |
| `service_mesh` | Install Istio service mesh (v1.29.1) | `k8s-utilities-playbook.yaml` |

Multiple tags can be combined:

```bash
./scripts/setup.sh kind utils certmanager dashboard gitops
./scripts/setup.sh k3s utils ingress_test apigateway_test k8s_extensions

# Full observability stack
./scripts/setup.sh kind utils monitoring

# Security + IDP
./scripts/setup.sh k3s utils security idp
```

## Project Structure

```text
scripts/
├── setup.sh              # Kubewekend CLI (this script)
├── README.md             # This documentation
└── legacy/               # Legacy v1 scripts (Kind only)
    ├── operate-kind-cluster.sh   # Auto-generate inventory from Vagrant
    ├── hook-up-ip.sh             # VirtualBox NAT network setup
    ├── return-to-nat.sh          # Revert VMs to default NAT
    └── README.md                 # Legacy documentation

ansible/
├── k3s-playbook.yaml             # K3s standalone setup
├── k3s-ha-playbook.yaml          # K3s HA setup (embedded etcd / external postgres)
├── k3s-remove-playbook.yaml      # K3s teardown
├── kind-playbook.yaml            # Kind setup (CNI, LB, ingress, gateway)
├── k8s-utilities-playbook.yaml   # Post-cluster utilities (cert-manager, vault, monitoring, gitops, etc.)
├── inventories/
│   ├── hosts                     # Ansible inventory (auto-generated or manual)
│   └── host_vars/
│       ├── master.yaml           # Master node config (Kind + K3s + utilities)
│       └── worker.yaml           # Worker node config (K3s only)
└── templates/
    ├── k3s-config.yaml.j2
    ├── kind-config.yaml.j2
    ├── kube-prometheus-stack-values.yaml.j2
    ├── alloy-values.yaml.j2
    ├── loki-values.yaml.j2
    ├── tempo-values.yaml.j2
    ├── pyroscope-values.yaml.j2
    ├── longhorn-iscsi-installation.yaml.j2
    ├── longhorn-nfs-installation.yaml.j2
    ├── ingress-test-deployment.yaml.j2
    └── apigateway-test-deployment.yaml.j2

examples/
└── lgtm-testing/                 # Full-stack LGTM observability demo app
    ├── backend/                  # FastAPI + OpenTelemetry + Pyroscope
    ├── frontend/                 # Nginx + HTML test dashboard
    ├── k8s/                      # Kubernetes manifests (namespace, postgres, backend, frontend)
    ├── docker-compose.yaml       # Local development (all services)
    └── docker-compose.k8s.yaml   # K8s-oriented compose variant
```

## Configuration Files

| File | Purpose |
|------|---------|
| `.env` | SSH credentials (from `template.env`) |
| `ansible/inventories/hosts` | Ansible inventory (hosts, groups, SSH config) |
| `ansible/inventories/host_vars/master.yaml` | Primary config — Kind networking, K3s version/CNI/LB/ingress, utility toggles |
| `ansible/inventories/host_vars/worker.yaml` | Worker-specific config (K3s version, labels, taints) |
| `ansible.cfg` | Ansible SSH options (host key checking disabled) |
| `Vagrantfile` | VM definitions (master + 3 workers, VirtualBox) |

Key settings in `master.yaml`:

```yaml
# Kind
kindCluster.image: "kindest/node:v1.28.9"
kindCluster.ingress.class: "traefik"     # nginx | traefik | cilium | kong
kindCluster.loadbalancer.type: "cloud-provider-kind"  # metallb | cloud-provider-kind

# K3s
k3sCluster.version: "v1.34.5+k3s1"
k3sCluster.cni.type: "flannel"           # flannel | calico | cilium
k3sCluster.loadBalancer.type: "servicelb" # servicelb | metallb
k3sCluster.ingress.class: "traefik"
k3sCluster.highAvailability.enable: false # true for HA

# Utilities
utilities.certmanager.enable: true
utilities.storage.enable: true           # Longhorn distributed storage
utilities.storage.type: "longhorn"
utilities.gitops.type: "argocd"          # argocd | flux
utilities.dashboard.type: "headlamp"     # kubernetes-dashboard | headlamp | rancher
utilities.secretManagement.type: "vault" # vault | openbao
utilities.security.policyEngine.type: "kyverno"  # kyverno | opa-gatekeeper
utilities.security.identity.type: "dex"
utilities.idp.portal.type: "backstage"

# Monitoring (LGTM stack)
utilities.monitoring.enable: true
utilities.monitoring.type: "kube-prometheus-stack"
utilities.monitoring.apm.enable: true          # Alloy APM collector
utilities.monitoring.logging.enable: true      # Loki log aggregation
utilities.monitoring.tracing.enable: true      # Tempo distributed tracing
utilities.monitoring.profiling.enable: true    # Pyroscope continuous profiling
utilities.monitoring.apm.config.clusterName: "kubewekend"

# Service Mesh
utilities.serviceMesh.enable: true
utilities.serviceMesh.type: "istio"
```
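In `master.yaml` itself these dotted paths are nested YAML keys. For example, the K3s settings above correspond to (a sketch of the nesting, not the complete file):

```yaml
k3sCluster:
  version: "v1.34.5+k3s1"
  cni:
    type: "flannel"        # flannel | calico | cilium
  loadBalancer:
    type: "servicelb"      # servicelb | metallb
  ingress:
    class: "traefik"
  highAvailability:
    enable: false          # true for HA
```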

## Contributing

When adding new features to the CLI:

1. Add a new command function following the pattern: `cmd_<name>()` for dispatch + `cmd_<name>_help()` for documentation
2. Register it in the `main()` case statement
3. Use `parse_ansible_opts` and `run_ansible_playbook` for any ansible-based operations
4. Add confirmation via `confirm()` for destructive operations
5. Update this README with the new command, subcommands, and examples
6. Test with `--dry-run` before running against real infrastructure
```bash
# Pattern for adding: "monitoring" command
cmd_monitoring_help() { ... }
cmd_monitoring() {
    local sub="${1:-help}"; shift || true
    case "$sub" in
        setup)   monitoring_setup "$@" ;;
        *)       cmd_monitoring_help ;;
    esac
}
# Add to main(): monitoring) cmd_monitoring "$@" ;;
```
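And a hedged sketch of the matching setup function; `parse_ansible_opts` and `run_ansible_playbook` are the script's own helpers, so their exact signatures here are assumptions, not the real API:

```bash
# Hypothetical body: helper signatures are assumed, not copied from setup.sh
monitoring_setup() {
    parse_ansible_opts "$@"   # would consume --host / --dry-run / --skip-tags / --extra-vars
    run_ansible_playbook "ansible/k8s-utilities-playbook.yaml" --tags monitoring
}
```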