revised README.MD with k8s-infra deployment

2026-06-01 19:17:26 +02:00
parent 113de6fb46
commit f22cad1a37
1 changed files with 126 additions and 55 deletions
@@ -1,6 +1,6 @@
 # Homelab Infrastructure as Code

-A Pulumi-based IaC template for managing a Proxmox homelab. The goal is to replace manual GUI configuration and ad-hoc YAML stacks (LXC, VM, Docker, etc.) with version-controlled, reproducible infrastructure — starting with a highly available k3s cluster across multiple Proxmox nodes.
+A Pulumi-based IaC template for managing a Proxmox homelab. The goal is to replace manual GUI configuration and ad-hoc YAML stacks with version-controlled, reproducible infrastructure — a highly available k3s cluster across multiple Proxmox nodes, bootstrapped and configured end-to-end from code.

 This repo is intentionally abstract: credentials are never hardcoded, making it easy to fork and adapt as a template for your own homelab.

@@ -16,20 +16,30 @@ This repo is intentionally abstract: credentials are never hardcoded, making it

 ```
 .
-├── proxmox-infra/          # Pulumi TypeScript stack — VMs & LXC on Proxmox
-│   ├── index.ts            # All Pulumi resources
-│   ├── Pulumi.yaml         # Stack project definition
-│   ├── Pulumi.dev.yaml     # Encrypted stack config (gitignored)
-│   └── sdks/
-│       └── pfsense/        # Locally bundled @pulumi/pfsense SDK
-├── .gitea/
-│   └── workflows/
-│       └── deploy-proxmox-infra.yaml   # Gitea Actions CI/CD pipeline
+├── proxmox-infra/          # Stack 1 — VMs & DHCP on Proxmox + pfSense
+│   ├── index.ts
+│   ├── Pulumi.yaml
+│   └── sdks/pfsense/       # Locally bundled @pulumi/pfsense SDK
+├── k8s-bootstrap/          # Stack 2 — Install k3s on the VMs over SSH
+│   ├── index.ts
+│   └── Pulumi.yaml
+├── k8s-infra/              # Stack 3 — Cluster-level infrastructure via Helm
+│   ├── index.ts
+│   └── Pulumi.yaml
+└── .gitea/workflows/       # Gitea Actions — one workflow per stack
 ```

-## Current stack: `proxmox-infra`
+## Stack overview

-Provisions a 5-node k3s cluster spread across two Proxmox hosts (`pve` and `pve-bckp`, third bare metal host to be added later for actual parity):
+The three stacks are deployed in order and linked via Pulumi StackReferences — outputs from one stack flow automatically into the next.
+
+```
+proxmox-infra  →  k8s-bootstrap  →  k8s-infra
+```
+
+### Stack 1: `proxmox-infra`
+
+Provisions the five VMs for the k3s cluster, spread across two Proxmox hosts (`pve` and `pve-bckp`):

 | VM name      | Role   | Proxmox node |
 | ------------ | ------ | ------------ |
@@ -39,43 +49,64 @@ Provisions a 5-node k3s cluster spread across two Proxmox hosts (`pve` and `pve-
 | k3s-master-3 | master | pve-bckp     |
 | k3s-worker-2 | worker | pve-bckp     |

-Each node is a full clone of an Ubuntu Noble (24.04) cloud-image template, with cloud-init injecting hostname, user credentials, and SSH keys at boot. Each VM's MAC address is registered as a DHCPv4 static mapping in pfSense so that nodes always receive their designated IPs.
+Each VM is a full clone of an Ubuntu Noble (24.04) cloud-image template. Cloud-init injects the hostname, user credentials, and SSH keys at boot. An ED25519 SSH key pair is generated once and stored in Pulumi state: the public key goes into every VM, and the private key is exported as a stack output for `k8s-bootstrap` to consume via StackReference.

-An ED25519 SSH key pair is generated once and stored in Pulumi state. The public key is injected into every VM at boot; the private key is exported as a stack output so `k8s-bootstrap` can consume it via StackReference without any manual key distribution.
+Each VM's MAC address is registered as a DHCPv4 static mapping in pfSense so nodes always receive their designated IPs.

-**Tech stack:**
+**Providers:** `@muhlba91/pulumi-proxmoxve` v8.x, `@pulumi/pfsense` (locally bundled), `@pulumi/tls`

- [Pulumi](https://www.pulumi.com/) with TypeScript
- [`@muhlba91/pulumi-proxmoxve`](https://github.com/muhlba91/pulumi-provider-proxmoxve) v8.x community provider
- [`@pulumi/pfsense`](https://github.com/marshallford/terraform-provider-pfsense) — locally bundled SDK bridged from the Terraform pfSense provider; installed automatically via `npm install`
- [`@pulumi/tls`](https://www.pulumi.com/registry/packages/tls/) — SSH key pair generation
- Self-hosted Pulumi state backend (PostgreSQL)
- Gitea Actions for CI/CD
+**Exports:** `vmIds`, `ciRunnerPrivateKey`, `pve1Endpoint`, `pve1ApiToken`, `pve2Endpoint`, `pve2ApiToken`, `master1Ip` … `worker2Ip`
+
+### Stack 2: `k8s-bootstrap`
+
+Installs k3s on the Proxmox VMs over SSH using `@pulumi/command`. Reads VM IDs, Proxmox credentials, and the SSH private key from `proxmox-infra` via StackReference — no manual key distribution needed.
+
+Deployment sequence:
+
+1. Start all 5 VMs via the Proxmox REST API
+2. Wait for port 22 to open on each node
+3. Install k3s on `k3s-master-1` with `--cluster-init` (embedded etcd)
+4. Join `k3s-master-2` and `k3s-master-3` as additional etcd nodes
+5. Join `k3s-worker-1` and `k3s-worker-2` as agent nodes
+6. Read `/etc/rancher/k3s/k3s.yaml` from master-1, patch the server URL, export as a secret stack output
+
+**Exports:** `kubeconfig` (secret)
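
Steps 3 to 6 of the deployment sequence can be sketched as plain string builders. This is a minimal sketch under stated assumptions, not the code in `k8s-bootstrap/index.ts`: the names `Role`, `k3sInstallCommand`, and `patchKubeconfigServer` are hypothetical, the install one-liners follow the upstream k3s quick-start and embedded-etcd HA documentation, and the kubeconfig patch assumes k3s wrote its default server URL `https://127.0.0.1:6443` into `k3s.yaml`.

```typescript
// Hypothetical helpers; the real stack may build these commands differently.
type Role = "init" | "server" | "agent";

const INSTALLER = "curl -sfL https://get.k3s.io";

// Build the shell one-liner that installs k3s for a given role.
// `token` is the shared cluster secret (the `k3sToken` config value). A hex
// token from `openssl rand -hex 32` needs no shell quoting.
// `firstMasterIp` is required for every role except "init".
function k3sInstallCommand(role: Role, token: string, firstMasterIp?: string): string {
  if (role === "init") {
    // First master: start a new cluster with embedded etcd.
    return `${INSTALLER} | K3S_TOKEN=${token} sh -s - server --cluster-init`;
  }
  if (!firstMasterIp) {
    throw new Error(`role "${role}" needs the IP of the first master`);
  }
  const url = `https://${firstMasterIp}:6443`;
  return role === "server"
    ? // Additional masters join the existing etcd cluster.
      `${INSTALLER} | K3S_TOKEN=${token} sh -s - server --server ${url}`
    : // Workers: with K3S_URL set, the installer runs k3s in agent mode.
      `${INSTALLER} | K3S_URL=${url} K3S_TOKEN=${token} sh -`;
}

// Point a kubeconfig read from master-1 at an address reachable from outside the node.
function patchKubeconfigServer(kubeconfig: string, masterIp: string): string {
  return kubeconfig.replace("https://127.0.0.1:6443", `https://${masterIp}:6443`);
}
```

In a Pulumi program, strings like these could be passed to an SSH command resource from `@pulumi/command`. Keeping them as pure functions makes the join order easy to test without touching a VM.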
+
+### Stack 3: `k8s-infra`
+
+Deploys cluster-level infrastructure via Helm and the `@pulumi/kubernetes` provider. Reads `kubeconfig` from `k8s-bootstrap` via StackReference.
+
+| Component            | What it does                                                               |
+| -------------------- | -------------------------------------------------------------------------- |
+| NFS CSI Driver       | `csi-driver-nfs` Helm chart (v4.12.0) — enables dynamic NFS-backed PVCs    |
+| TrueNAS StorageClass | `truenas-nfs` StorageClass backed by an NFS share on TrueNAS               |
+| cert-manager         | `cert-manager` Helm chart — certificate lifecycle management               |
+| ClusterIssuer        | `letsencrypt-prod` — DNS-01 via Cloudflare, issues Let's Encrypt TLS certs |
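
The StorageClass and ClusterIssuer rows above correspond to small Kubernetes manifests. The sketch below builds them as plain objects so it stays independent of any package. It is an illustration, not the contents of `k8s-infra/index.ts`. The function names, the Secret name `cloudflare-api-token` with key `api-token`, and the account-key Secret name `letsencrypt-prod-account-key` are assumptions. The field names follow the csi-driver-nfs and cert-manager documentation.

```typescript
// StorageClass for dynamic NFS-backed PVCs on a TrueNAS share.
function truenasStorageClass(truenasHost: string, nfsPath: string) {
  return {
    apiVersion: "storage.k8s.io/v1",
    kind: "StorageClass",
    metadata: { name: "truenas-nfs" },
    provisioner: "nfs.csi.k8s.io", // provisioner name registered by csi-driver-nfs
    parameters: { server: truenasHost, share: nfsPath },
  };
}

// ClusterIssuer that solves ACME DNS-01 challenges through Cloudflare.
function letsencryptClusterIssuer(email: string) {
  return {
    apiVersion: "cert-manager.io/v1",
    kind: "ClusterIssuer",
    metadata: { name: "letsencrypt-prod" },
    spec: {
      acme: {
        server: "https://acme-v02.api.letsencrypt.org/directory",
        email,
        // Hypothetical Secret name for the ACME account key.
        privateKeySecretRef: { name: "letsencrypt-prod-account-key" },
        solvers: [
          {
            dns01: {
              cloudflare: {
                // Hypothetical Secret holding the Cloudflare API token.
                apiTokenSecretRef: { name: "cloudflare-api-token", key: "api-token" },
              },
            },
          },
        ],
      },
    },
  };
}
```

With `@pulumi/kubernetes`, objects of this shape could be handed to the provider's resource constructors. The `truenasHost`, `truenasNfsPath`, and `letsencryptEmail` config values would supply the arguments.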

 ## Prerequisites

 - [Pulumi CLI](https://www.pulumi.com/docs/install/) installed
 - Node.js 18+ and npm
- Access to a Proxmox node with an API token
- pfSense instance with API credentials (used for DHCPv4 static mappings)
+- Two Proxmox nodes with API tokens
+- pfSense with REST API credentials (used for DHCPv4 static mappings)
+- TrueNAS with an NFS share (no API key needed; the CSI driver mounts the share directly over NFS)
+- A Cloudflare account with an API token scoped to DNS edit on your zone
 - A self-hosted Pulumi state backend (PostgreSQL connection string)
 - Gitea instance for CI/CD (optional for local use)

 ## Getting started

-### 1. Clone and install
+Deploy the stacks **in order**. Each stack must be fully deployed before the next one runs.
+
+### Stack 1 — `proxmox-infra`

 ```bash
-git clone <your-repo-url>
 cd proxmox-infra
-npm install
+npm install      # also compiles the bundled pfSense SDK
+pulumi stack init dev
 ```

-> **pfSense SDK** — The `@pulumi/pfsense` SDK is bundled locally under `sdks/pfsense/` and referenced as a `file:` dependency in `package.json`. Running `npm install` compiles it automatically via its postinstall hook. No separate installation or build step is required.
-
-### 2. Configure credentials
-
-All secrets are stored as encrypted Pulumi config values — never in plain environment variables or committed files.
+Set secrets:

 ```bash
 # Proxmox API credentials
@@ -101,31 +132,68 @@ pulumi config set --secret master3Ip  <ip-for-k3s-master-3>
 pulumi config set --secret worker2Ip  <ip-for-k3s-worker-2>
 ```

-Pulumi encrypts these values into `Pulumi.dev.yaml` using your `PULUMI_CONFIG_PASSPHRASE`.
-
-### 3. Set the state backend
+Deploy:

 ```bash
 export PULUMI_BACKEND_URL=postgresql://<user>:<pass>@<host>/<db>
 export PULUMI_CONFIG_PASSPHRASE=<your-passphrase>
+
+pulumi preview    # inspect before touching anything
+pulumi up --yes
 ```

-### 4. Preview and deploy
+### Stack 2 — `k8s-bootstrap`

 ```bash
-# See what will change before touching anything
-pulumi preview
+cd ../k8s-bootstrap
+npm install
+pulumi stack init dev

-# Sync Pulumi state with actual Proxmox state (run after any manual GUI changes)
-pulumi refresh --yes
+pulumi config set --secret k3sToken "$(openssl rand -hex 32)"
+```

-# Deploy
-pulumi refresh --yes && pulumi up --yes
+Node IPs and Proxmox credentials are read automatically from `proxmox-infra` outputs via StackReference — do not set them here.
+
+```bash
+pulumi up --yes
+```
+
+### Stack 3 — `k8s-infra`
+
+**TrueNAS one-time setup** (before deploying this stack):
+
+1. Create dataset `tank/k8s`
+2. Add an NFS share for that dataset
+
+    - ACL Type: on TrueNAS SCALE, set this to POSIX (generic Unix permissions) rather than SMB/NFSv4 ACLs, since Kubernetes handles basic Unix permissions natively.
+    - Maproot User / Maproot Group: set both to `root`.
+    - Allowed Hosts/Networks: restrict the share to the IP addresses or CIDR block of your k3s nodes (e.g. `192.168.1.50`).
+
+3. In Network → Allowed Networks, permit your LAN subnet (e.g. `192.168.1.0/24`)
+
+```bash
+cd ../k8s-infra
+npm install
+pulumi stack init dev
+
+# kubeconfig from k8s-bootstrap
+# (the variable is not named KUBECONFIG, which kubectl reads as a file path)
+K3S_KUBECONFIG=$(cd ../k8s-bootstrap && pulumi stack output kubeconfig --show-secrets)
+pulumi config set --secret kubeconfig "$K3S_KUBECONFIG"
+
+# TrueNAS NFS
+pulumi config set --secret truenasHost    <truenas-ip>
+pulumi config set --secret truenasNfsPath /mnt/tank/k8s
+
+# cert-manager + Cloudflare DNS-01
+pulumi config set --secret cloudflareApiToken <cf-token>
+pulumi config set --secret letsencryptEmail   <your-email>
+
+pulumi up --yes
 ```

 ## CI/CD (Gitea Actions)

-The workflow at `.gitea/workflows/deploy-proxmox-infra.yaml` runs automatically:
+One workflow per stack under `.gitea/workflows/`. Each triggers on changes to its own stack directory.

 | Event                 | Action                         |
 | --------------------- | ------------------------------ |
@@ -135,34 +203,37 @@ The workflow at `.gitea/workflows/deploy-proxmox-infra.yaml` runs automatically:

 ### Required Gitea secrets

-Configure these under **Settings → Actions → Secrets** in your Gitea repo:
+Configure under **Settings → Actions → Secrets**:

-| Secret                     | Description                                        |
-| -------------------------- | -------------------------------------------------- |
-| `PULUMI_BACKEND_URL`       | PostgreSQL connection string for the state backend |
-| `PULUMI_CONFIG_PASSPHRASE` | Passphrase to decrypt secrets in `Pulumi.dev.yaml` |
-| `PULUMI_DEV_YAML`          | Base64-encoded content of `Pulumi.dev.yaml`        |
+| Secret                          | Used by                     | Description                                    |
+| ------------------------------- | --------------------------- | ---------------------------------------------- |
+| `PULUMI_BACKEND_URL`            | all workflows               | PostgreSQL connection string for state backend |
+| `PULUMI_CONFIG_PASSPHRASE`      | all workflows               | Passphrase to decrypt secrets                  |
+| `PROXMOX_INFRA_PULUMI_DEV_YAML` | `deploy-proxmox-infra.yaml` | Base64-encoded `proxmox-infra/Pulumi.dev.yaml` |
+| `K8S_BOOTSTRAP_PULUMI_DEV_YAML` | `deploy-k8s-bootstrap.yaml` | Base64-encoded `k8s-bootstrap/Pulumi.dev.yaml` |
+| `K8S_INFRA_PULUMI_DEV_YAML`     | `deploy-k8s-infra.yaml`     | Base64-encoded `k8s-infra/Pulumi.dev.yaml`     |

-`Pulumi.dev.yaml` is gitignored because it contains your encryption salt. Whenever it changes (e.g. after adding or rotating a secret), re-encode it and paste the output into the Gitea secret:
+`Pulumi.dev.yaml` files are gitignored because they contain your encryption salt. Whenever one changes (e.g. after adding or rotating a secret), re-encode it and paste the output into the corresponding Gitea secret:

 ```bash
 base64 -w 0 proxmox-infra/Pulumi.dev.yaml
+base64 -w 0 k8s-bootstrap/Pulumi.dev.yaml
+base64 -w 0 k8s-infra/Pulumi.dev.yaml
 ```

 ## Adapting this as a template

 1. Fork or copy the repo
-2. Update node names (`pve`, `pve-bckp`) and datastore IDs in `index.ts` to match your setup
+2. Update Proxmox node names (`pve`, `pve-bckp`) and datastore IDs in `proxmox-infra/index.ts`
 3. Add or remove VMs from the `nodeConfigs` array
 4. Set your own secrets with `pulumi config set --secret`
-5. Point the CI/CD workflow at your own Git instance
+5. Point the CI/CD workflows at your own Gitea instance

 ## Roadmap

 - LXC container management
+- `k8s-apps` stack — application deployments on the cluster
 - Docker / Compose stack provisioning
- Firewall rules (pfSense)
- Automated k3s bootstrapping (kubeconfig export)
- Additional worker nodes and storage volumes
+
 - Migrate secrets management to [OpenBao](https://openbao.org/) — replace `PULUMI_CONFIG_PASSPHRASE` and manual `Pulumi.dev.yaml` encoding with a self-hosted vault
- Add a third bare metal proxmox instance to create an actual 3 node parity.
+- Add a third bare metal Proxmox node for true 3-node HA parity