myron 04fc9941ff Initial: backup/restore scripts, README
- backup.sh: weekly cron script for PVE1+PVE2
- restore.sh: interactive disaster recovery wizard
- README.md: step-by-step recovery guide for both nodes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-09 03:42:39 +00:00

# Proxmox Config Backup & Restore
Automated weekly backup of PVE1 and PVE2 configuration to GitHub.
After a host failure, this gets you back up in 15-30 minutes.
---
## Repository Structure
```
proxmox-config/
  backup.sh            ← runs on each node weekly, commits changes
  restore.sh           ← interactive restore on a fresh Proxmox install
  README.md            ← this file
  shared/              ← cluster-wide configs (backed up from PVE1)
    storage.cfg        - storage definitions (local-lvm, GoFlex, ProxBackups)
    datacenter.cfg     - cluster settings
    user.cfg           - users and API tokens
    corosync.conf      - cluster membership
    jobs.cfg           - backup job schedules
    vzdump.cron        - vzdump cron config
    firewall/          - cluster firewall rules
    ha/                - HA group/resource configs
    nodes/
      pve/
        qemu-server/   - PVE1 VM .conf files (100-120)
        lxc/           - PVE1 LXC container configs (110)
      pve2/
        qemu-server/   - PVE2 VM .conf files (106, 302)
  pve/                 ← PVE1-specific (hostname: pve, 10.48.200.90)
    network/           - /etc/network/interfaces
    cron/              - root crontab
    scripts/           - custom shell/python scripts
    systemd/           - custom service units
    jarvis-agent/      - JARVIS monitoring agent config
    ssh/               - authorized_keys (no private keys)
  pve2/                ← PVE2-specific (hostname: pve2, 10.48.200.91)
    (same structure)
```
---
## Backup Schedule
Both nodes run `backup.sh` via cron every **Sunday at 3:00 AM**.
```
0 3 * * 0 /usr/local/bin/proxmox-backup >> /var/log/proxmox-backup.log 2>&1
```
The script:
- Pulls latest from GitHub (handles both nodes committing)
- Collects all config files listed above
- Skips large binaries (ollama, filebrowser — see reinstall notes below)
- Skips private keys (`/etc/pve/priv/`, SSH private keys)
- Commits with a timestamp and pushes
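
The collection and commit steps above could look roughly like the sketch below. It is not the contents of `backup.sh`: `collect_configs` and `commit_and_push` are hypothetical helper names, and the exclusion rules (a top-level `priv/` directory and `id_*` files without a `.pub` suffix) are assumptions based on the list above.

```bash
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not the actual backup.sh.

# Copy regular files from a source tree into the repo checkout,
# skipping priv/ and anything that looks like an SSH private key.
collect_configs() {
    local src="$1" dest="$2"
    mkdir -p "$dest"
    ( cd "$src" && find . -type f \
        ! -path './priv/*' \
        ! \( -name 'id_*' ! -name '*.pub' \) -print ) |
    while IFS= read -r rel; do
        mkdir -p "$dest/$(dirname "$rel")"
        cp -p "$src/$rel" "$dest/$rel"
    done
}

# Stage everything, and commit and push only when something changed.
commit_and_push() {
    local repo="$1"
    git -C "$repo" add -A
    if ! git -C "$repo" diff --cached --quiet; then
        git -C "$repo" commit -m "backup $(hostname) $(date -u +%Y-%m-%dT%H:%M:%SZ)"
        git -C "$repo" push
    fi
}
```

A run would pull first (`git -C /opt/proxmox-config pull --rebase`), then call `collect_configs /etc/pve /opt/proxmox-config/shared`, then `commit_and_push /opt/proxmox-config`. Pulling before collecting keeps the working tree clean for the rebase when both nodes commit.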
---
## Disaster Recovery — Host Failure
### Scenario A: PVE1 (primary) fails
PVE1 hosts most VMs (100-113, 118, 120). Their disks live on `local-lvm` (PVE1's LVM thin pool) or on the `GoFlex` NAS. VMs on `local-lvm` **are at risk** if the PVE1 disk fails; restore those from PBS.
**Steps:**
1. **Install fresh Proxmox** on replacement hardware.
- Set IP to `10.48.200.90` during install, or set after.
- Set hostname to `pve`.
2. **Clone this repo:**
```bash
apt install -y git
# Authenticate when prompted, or use an SSH deploy key; never embed an access token in the URL
git clone https://github.com/myronblair/proxmox-config.git /opt/proxmox-config
```
3. **Run the restore script:**
```bash
bash /opt/proxmox-config/restore.sh pve1
```
The script is interactive — confirm each section as it goes.
4. **Reboot** to apply network config:
```bash
reboot
```
5. **Restore VMs from PBS** (Proxmox Backup Server at `10.48.200.85`):
   - Log in to the PVE web UI: `https://10.48.200.90:8006`
   - Datacenter → Storage → Add → Proxmox Backup Server
     - Server: `10.48.200.85`
     - Datastore: `PBSBackup`
     - Username: `root@pam`
     - Fingerprint: see `shared/pbs_fingerprint.txt`
   - For each VM: select the backup → Restore
6. **Reinstall large binaries** (not in git — too large):
```bash
# Ollama (local LLM on VM 210 inside this host)
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull llama3.2
# File Browser
curl -fsSL https://raw.githubusercontent.com/filebrowser/get/master/get.sh | bash
# Config is at /root/.filebrowser.json or wherever it was set
```
7. **Start services:**
```bash
systemctl start jarvis-agent filebrowser
```
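
The PBS storage in step 5 can also be registered from the shell with `pvesm add pbs` instead of the web UI. The sketch below only prints the command so it can be reviewed first. `pbs_add_command` is a hypothetical helper and the storage ID `pbs-restore` is an assumed name; the server, datastore, and user are the values from step 5.

```bash
#!/usr/bin/env bash
# Sketch only: prints the pvesm command rather than running it.
# The storage ID passed as $1 is an arbitrary name chosen for illustration.
pbs_add_command() {
    local storage_id="$1" fingerprint="$2"
    printf 'pvesm add pbs %s --server %s --datastore %s --username %s --fingerprint %s\n' \
        "$storage_id" "10.48.200.85" "PBSBackup" "root@pam" "$fingerprint"
}
```

For example, `pbs_add_command pbs-restore "$(cat /opt/proxmox-config/shared/pbs_fingerprint.txt)"` prints the full command. The PBS password still has to be supplied when the command is actually run, for example with `--password`.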
---
### Scenario B: PVE2 (secondary) fails
PVE2 hosts VM 106 (unknown) and VM 302 (NetworkBackup). PVE2 uses `local-lvm` for storage.
**Steps:**
1. **Install fresh Proxmox**, set IP `10.48.200.91`, hostname `pve2`.
2. **Clone repo and run restore:**
```bash
apt install -y git
# Authenticate when prompted, or use an SSH deploy key; never embed an access token in the URL
git clone https://github.com/myronblair/proxmox-config.git /opt/proxmox-config
bash /opt/proxmox-config/restore.sh pve2
```
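3. **Rejoin the cluster** (run on the rebuilt PVE2, pointing at PVE1):
```bash
# On PVE1 first, if the old pve2 entry is still listed:
#   pvecm delnode pve2
# On PVE2:
pvecm add 10.48.200.90
# This syncs cluster state from PVE1 to PVE2
```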
4. **Restore VMs 106 and 302 from PBS.**
5. **Reboot PVE2**, then verify cluster: `pvecm status`
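
The final verification in step 5 can be scripted. This minimal sketch assumes the `pvecm status` output contains a line of the form `Quorate: Yes` when the cluster has quorum; `is_quorate` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch only: reads `pvecm status` output on stdin and succeeds
# only if a line starting with "Quorate:" reports "Yes".
is_quorate() {
    grep -Eq '^Quorate:[[:space:]]+Yes'
}
```

Usage on either node: `pvecm status | is_quorate && echo "cluster has quorum"`.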
---
## What Is NOT Backed Up Here
| Item | Where to find it |
|------|-----------------|
| VM disk images | Proxmox Backup Server (10.48.200.85) — runs nightly |
| /etc/pve/priv/ | Private CA key, auth keys — DO NOT store in git; regenerated by Proxmox on fresh install |
| SSH private keys | /root/.ssh/id_rsa — regenerate or restore from secure storage |
| ollama binary | Reinstall via install script (see above) |
| filebrowser binary | Reinstall via get.sh (see above) |
| blockalign, urbackupclientctl | Reinstall from their respective sources if needed |
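
Because private keys and credentials must stay out of this repository, a scan before each commit can catch accidental leaks. The sketch below is an illustration, not part of `backup.sh`: `find_secrets` is a hypothetical helper, and the two patterns (PEM/OpenSSH private key headers and classic GitHub tokens) are assumptions that do not cover every kind of secret.

```bash
#!/usr/bin/env bash
# Sketch only: list files under a directory that appear to contain secrets.
# Exit status is 0 when at least one file matches, 1 when none do.
find_secrets() {
    local dir="$1"
    grep -rlE --exclude-dir=.git -- \
        '-----BEGIN [A-Z ]*PRIVATE KEY-----|ghp_[A-Za-z0-9]{36}' "$dir"
}
```

A backup run could abort when `find_secrets /opt/proxmox-config` prints anything, so that nothing sensitive is pushed.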
---
## Nodes Quick Reference
| Node | Hostname | IP | Port | Role |
|------|----------|----|------|------|
| PVE1 | pve | 10.48.200.90 | 8006 | Primary — most VMs |
| PVE2 | pve2 | 10.48.200.91 | 8006 | Secondary — VM 106, 302 |
| PBS | — | 10.48.200.85 | 8007 | Backup server |
Proxmox web login: `root` (password redacted; do not store it in this repository)
FortiGate DDNS (PVE1 from internet): `orbisne.fortiddns.com`
---
## Manual Backup Trigger
To run a backup immediately on either node:
```bash
/usr/local/bin/proxmox-backup
tail -f /var/log/proxmox-backup.log
```