Compare commits

5 Commits

Author SHA1 Message Date
myron e9af102dbe Update JARVIS TODO: round 2 fixes (HA poller, DB tables, web.orbishosting.com)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-29 15:53:20 -05:00
myron 27df1eb7c9 Update JARVIS TODO with comprehensive agent deployment plan
- Full prioritized TODO: critical fixes, agent deployment checklist
  for every host (DO, Proxmox VMs, Windows, Mac, Alpine), self-healing config
- Windows agent installer details, compiled exe build instructions
- HA VM109 post-rebuild checklist

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-29 13:57:47 -05:00
myron 79db9f1a55 Update MediaStack: NordVPN tunnel via CT110, exit IP 2.56.190.69 2026-06-29 13:22:24 -05:00
myron a467a1de8f Update CT110 WireGuard pubkey in MediaStack memory (was RXxD, now Fqb1) 2026-06-29 12:45:45 -05:00
myron 225259f1f4 Mark Gitea NAS mirror as complete; remove stale project_claude_vm.md 2026-06-29 12:01:12 -05:00
4 changed files with 246 additions and 26 deletions
-18
@@ -1,18 +0,0 @@
---
name: project-claude-vm
description: Claude Code runs on PVE1 VM 107 (Claude-DHCP) at 10.48.200.29 — the local environment for all Claude Code sessions
metadata:
node_type: memory
type: project
originSessionId: 16664adb-5228-4a2a-bffb-7e783ad13af1
---
Claude Code sessions run inside **PVE1 VM 107** named "Claude-DHCP".
- **IP**: 10.48.200.29 (DHCP — may change on reboot)
- **Hostname**: claude
- **Hypervisor**: PVE1 (10.48.200.90 / orbisne.fortiddns.com)
- **No JARVIS agent installed** on this VM
**Why:** This explains why SSH to local IPs works directly, and why the working directory is /home/myron on a Linux VM rather than a remote machine.
**How to apply:** When commands run "locally," they run inside this VM on the 10.48.200.0/24 LAN. Direct SSH to other LAN machines works without relaying through DO.
+35 -1
@@ -9,12 +9,42 @@ metadata:
# Infrastructure TODO
Last updated: 2026-06-24
Last updated: 2026-06-28
---
## 🔴 OPEN
- [x] **Synology iSCSI → Proxmox storage** — COMPLETE 2026-06-27. SynologyLVM (lvmthin, 1.86TB) active. SynologyiSCSI raw device also added. NAS at 10.48.200.249, IQN: iqn.2000-01.com.synology:NAS.Target-1.6296e09c4cb. Set as default Proxmox storage. NAS hostname fixed in /etc/hosts (was resolving to Tailscale IP — root cause of past VM corruptions). SynologyProx CIFS stays for backups/ISOs.
- [ ] **FortiGate DNS + Synology Reverse Proxy for all VMs** — Use Synology's built-in Reverse Proxy (DSM → Control Panel → Application Portal → Reverse Proxy) instead of NPM. FortiGate DNS overrides point all .lan domains → 10.48.200.249 (Synology). NPM kept but no longer primary.
- **Step 1 — FortiGate DNS**: https://192.168.20.1 (admin / Joker1974!!!) → Network → DNS → Local DNS Records. Each .lan entry → 10.48.200.249
- **Step 2 — Synology Reverse Proxy rules** (DSM → Control Panel → Application Portal → Reverse Proxy):
| Source FQDN | Destination IP | Port | Notes |
|------------|----------------|------|-------|
| proxmox.lan | 10.48.200.90 | 8006 | HTTPS backend, enable WebSocket |
| jarvis.lan | 10.48.200.211 | 80 | HTTP |
| hoa.lan | 10.48.200.97 | 8123 | HTTP, **enable WebSocket** (HA requires it) |
| homebridge.lan | 10.48.200.18 | 8581 | HTTP |
| jellyfin.lan | 10.48.200.33 | 8096 | HTTP, enable WebSocket |
| novacpx.lan | 10.48.200.110 | 8882 | HTTPS backend |
| sonarr.lan | 10.48.200.35 | 8989 | HTTP |
| radarr.lan | 10.48.200.35 | 7878 | HTTP |
| qbit.lan | 10.48.200.35 | 8080 | HTTP |
| ollama.lan | 10.48.200.210 | 11434 | HTTP |
| npm.lan | 10.48.200.200 | 81 | HTTP |
| nas.lan | 10.48.200.249 | 5001 | HTTPS (DSM itself) |
- **Step 3 — Client DNS**: Set Windows DNS to FortiGate (192.168.20.1) or PVE1 (10.48.200.90) so .lan resolves
- **WebSocket**: Must be enabled on proxmox.lan, hoa.lan, jellyfin.lan rules or those UIs will break
- [ ] **Home Assistant VM109 post-boot setup** — HA is booting (supervisor starting). Once port 8123 is up:
1. Restore Google Drive backup (file ID: `1mLE1S9dSvxl0RYQnCt020WT-UZnQuxqP`)
2. Install Tailscale addon (go to Supervisor > Add-on Store)
3. Re-integrate JARVIS ↔ HA (212 entities)
4. Resize disk from 32GB → 150GB (`qm resize 109 sata0 +118G` while VM stopped, then resize partition inside HA)
- [x] **CT110 WireGuard filesystem read-only** — fsck run, filesystem clean and rw. wg-clients.conf updated with new MediaStack pubkey. 2026-06-24.
- [x] **CT110 wg-clients auto-start** — added `/etc/local.d/wg-clients.start` (OpenRC local service). wg-clients comes up on boot. 2026-06-24.
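The Step 2 reverse-proxy rules above map each `.lan` name to a backend scheme, IP and port. Before adding a rule in DSM, it can help to confirm that the backend answers at all. The sketch below is one possible way to do that; the helper names `backend_url` and `probe_backend` are hypothetical, and the `-k` flag assumes the HTTPS backends use self-signed certificates.

```bash
# Sketch only: helper names are hypothetical, not part of any repo mentioned here.

# Build the backend URL a reverse-proxy rule points at.
# usage: backend_url <http|https> <ip> <port>
backend_url() {
  printf '%s://%s:%s/\n' "$1" "$2" "$3"
}

# Probe one backend and print its HTTP status code (curl prints 000 if unreachable).
# -k skips certificate checks (assumption: HTTPS backends are self-signed).
# usage: probe_backend <http|https> <ip> <port>
probe_backend() {
  curl -sk -o /dev/null -m 5 -w '%{http_code}\n' "$(backend_url "$1" "$2" "$3")"
}
```

For example, `probe_backend http 10.48.200.211 80` would check the jarvis.lan backend from a machine on the LAN.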
@@ -29,6 +59,10 @@ Last updated: 2026-06-24
- [x] **MediaStack backup to new storage** — VM 103 disk now on GoFlex storage. Backup job runs nightly at 21:00 to SynologyProx and backs up VM regardless of disk location. Verified 2026-06-24.
- [x] **NAS Git Server — Hybrid Mirror Setup** — COMPLETE 2026-06-29. Gitea 1.26.4 (ARM64) on Synology NAS at 10.48.200.249:3000, HTTPS at gitea.orbishosting.com. All 25 GitHub repos mirrored (every 8h). 4 private NAS-only repos: infra-private, fortigate-config, proxmox-secrets, jarvis-secrets. Auto-starts on boot via /usr/local/etc/rc.d/gitea.sh. Added to web.orbishosting.com dashboard.
- [x] **Synology NAS → FortiSwitch** — COMPLETE 2026-06-28. NAS LAN2 → FortiSwitch Port 6, NAS LAN1 → FortiSwitch Port 7. Bonding configured as **Adaptive Load Balancing (ALB)** in Synology DSM (802.3ad LACP not available on FortiGate 60F FortiOS for managed FortiSwitch via CLI or GUI). ALB provides outbound load balancing + redundancy without switch LACP support. NAS remains at 10.48.200.249.
---
## ✅ COMPLETED (2026-06-24 session)
+207 -4
@@ -1,19 +1,222 @@
---
name: project-jarvis-todo
description: "Master TODO list for JARVIS system — current open items and completed work"
description: "Master JARVIS TODO — agent deployment, self-healing, Windows service, HA integration, all outstanding work"
metadata:
node_type: memory
type: project
originSessionId: 4420b39a-7b7f-439f-9321-ff0daf1d663d
originSessionId: e8442c3a-86d9-4b82-8f6d-071acd19159a
---
# JARVIS Master TODO
Last updated: 2026-06-18
Last updated: 2026-06-29
---
## 🔴 OPEN
## ✅ FIXED THIS SESSION (2026-06-29) — ROUND 2
- [x] **HA Poller deployed on VM211**: `jarvis-ha-poller.py` running as `jarvis-ha-poller.service`. Polls HA at `http://10.48.200.97:8123` every 30s. 241 entities now pushing (lights, switches). Token expires 2033.
- [x] **Missing DB tables created**: `tasks`, `appointments`, `usage_patterns` tables added. Fixed `registered_agents` enum to include `windows`/`macos` and added the `version` column.
- [x] **schema.sql updated** — DB schema dumped from live VM211 to `db/schema.sql`, now includes all tables.
- [x] **ha.php domain filter** — Added `camera`, `siren`, `remote`, `todo`, `lawn_mower` to `$skipDomains`. Only `light` and `switch` (plugs) show in HOME tab.
- [x] **web.orbishosting.com fixed** — Root cause: Epson printer had ARP conflict at 10.48.200.200 (NPM's IP). FortiGate VIP for HTTP/HTTPS correctly forwards to 10.48.200.200. Fixed by: (1) bouncing NPM's eth0 to send gratuitous ARP, (2) setting permanent ARP entry on PVE1. https://web.orbishosting.com → NPM → NovaCPX (returns 401 Basic Auth — NovaCPX site auth).
- [x] **PVE1 static ARP**: `/etc/network/if-up.d/npm-static-arp` persists `10.48.200.200 → BC:24:11:67:1D:47` across reboots.
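The HOME tab outcome described above (only `light` and `switch` entities shown) can be illustrated with a small filter. This is a sketch, not the code in `ha.php`: the real file is PHP and uses a skip list (`$skipDomains`), while this example uses an allow-list for brevity. The function names are hypothetical.

```bash
# Hypothetical helper: succeed only for entity IDs in the light or switch domain.
# The domain is the part of the entity ID before the first dot.
is_home_tab_entity() {
  case "${1%%.*}" in
    light|switch) return 0 ;;
    *) return 1 ;;
  esac
}

# Filter a newline-separated list of entity IDs read from stdin.
filter_home_tab() {
  while IFS= read -r entity_id; do
    if is_home_tab_entity "$entity_id"; then
      printf '%s\n' "$entity_id"
    fi
  done
}
```

Piping a list of entity IDs through `filter_home_tab` keeps `light.*` and `switch.*` and drops domains such as `camera` or `siren`.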
## ✅ FIXED THIS SESSION (2026-06-29) — ROUND 1
- [x] **JARVIS API not executing PHP** — nginx `location ^~ /api` (no trailing slash) was intercepting `/api.php` and serving it as static text. Fixed to `location ^~ /api/`. PHP-FPM at `/run/php/php8.3-fpm.sock` confirmed working. `/api/ping` now returns JSON.
- [x] **api.php backward-compat path normalization** — Added path rewrite so old `/api/endpoints/agent.php` format routes to the `agent` endpoint. Agents on old configs can now register.
- [x] **DB schema: version + windows/macos** — Added `version VARCHAR(32)` to `registered_agents`; expanded `agent_type` enum to `linux|homeassistant|proxmox|windows|macos`.
- [x] **Ollama models in config.php**: `OLLAMA_MODEL_PRIMARY` and `OLLAMA_MODEL_HEAVY` both set to `llama3.1:8b`. VM106 Ollama upgraded to 32GB RAM, 8 cores.
- [x] **Windows agent installer** — Created `install-windows.ps1`: one PowerShell command installs Python, pywin32, downloads agent, creates config, registers+starts Windows Service. No open PowerShell needed after install.
- [x] **Linux installer URL**: `install.sh` was hardcoded to `http://10.48.200.211` (LAN only). Fixed to default `https://jarvis.orbishosting.com`. LAN installs override with `JARVIS_URL=http://10.48.200.211`.
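The nginx fix above comes down to prefix matching: a prefix location matches any URI that starts with the given string, so `/api` also matches `/api.php`, while `/api/` does not. The sketch below only simulates that string comparison in shell to make the difference visible; it is not nginx configuration, and `prefix_location_matches` is a hypothetical name.

```bash
# Simulate an nginx prefix location: succeed when the URI starts with the prefix.
# usage: prefix_location_matches <prefix> <uri>
prefix_location_matches() {
  case "$2" in
    "$1"*) return 0 ;;   # quoted prefix is matched literally, then any suffix
    *) return 1 ;;
  esac
}
```

With this helper, `prefix_location_matches /api /api.php` succeeds (the old, too-broad rule), while `prefix_location_matches /api/ /api.php` fails and `prefix_location_matches /api/ /api/ping` succeeds (the corrected rule).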
---
## 🔴 CRITICAL — Outstanding
### A. Epson Printer IP Conflict — NEEDS PERMANENT FIX
Epson printer keeps taking 10.48.200.200 (NPM's static IP). Temporary fix: PVE1 has static ARP + gratuitous ARP from NPM. **Real fix**: assign Epson printer a different static IP in its web admin (find printer IP when it comes back up, log in to its config page, set static IP ≠ 10.48.200.200). DHCP reservation in FortiGate DHCP server for printer's MAC also works.
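For reference, the temporary static ARP entry can be expressed as a single iproute2 command. The sketch below only prints that command so it can be reviewed before running it. The contents of the actual `npm-static-arp` hook are not shown in these notes, and the bridge name `vmbr0` in the usage example is an assumption (the Proxmox default).

```bash
# Print the iproute2 command that pins an IP to a MAC as a permanent neighbor entry.
# Printing instead of executing lets the command be reviewed first.
# usage: static_arp_cmd <ip> <mac> <dev>
static_arp_cmd() {
  printf 'ip neigh replace %s lladdr %s dev %s nud permanent\n' "$1" "$2" "$3"
}
```

Example: `static_arp_cmd 10.48.200.200 BC:24:11:67:1D:47 vmbr0` prints the command for NPM's IP and MAC.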
### B. Windows Agent on Myron's Desktop — NEEDS ADMIN POWERSHELL
Run this in an **Admin PowerShell** (not Claude Code terminal):
```powershell
$env:JARVIS_REG_KEY='f846a9aaf7ce9a61742c63c87c4186052a71d2a580c65518'
& 'C:\Users\myron\repos\jarvis\public_html\agent\install-windows.ps1'
```
After install: `Get-Service JARVISAgent` should show Running.
### C. web.orbishosting.com NovaCPX 401
Site routes correctly to NovaCPX but returns `401 Basic realm="Blair HQ"`. Need to check CyberPanel on NovaCPX (10.48.200.110) — either the website isn't created for web.orbishosting.com, or HTTP auth is enabled on the default site. Access CyberPanel at https://10.48.200.110:8090 to check.
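To confirm which site is issuing the 401, the response headers can be inspected directly. The helpers below are a sketch with hypothetical names; they read raw HTTP headers from stdin and pull out the status code and the Basic-auth realm.

```bash
# Read HTTP response headers on stdin and print the status code of the first line.
http_status() {
  awk 'NR == 1 { print $2 }'
}

# Read HTTP response headers on stdin and print the Basic-auth realm, if present.
# The header name is matched case-insensitively because HTTP/2 lowercases it.
auth_realm() {
  sed -n 's/^[Ww][Ww][Ww]-[Aa]uthenticate:.*realm="\([^"]*\)".*/\1/p'
}
```

Usage, assuming `curl` is available: `curl -skI https://web.orbishosting.com | auth_realm`. If the realm printed matches the one quoted above, the request is reaching the same site that sent the original 401.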
### D. Re-install JARVIS HA Custom Component (VM109 rebuilt)
```bash
# From PVE1, get HA terminal or use Proxmox console for VM109:
# Copy from JARVIS server to HA config:
ssh root@10.48.200.211 'tar czf /tmp/ha-component.tgz -C /var/www/jarvis ha-component'
# Then on HA VM or via PVE1 -> VM109 console:
# mkdir -p /config/custom_components
# tar xzf ha-component.tgz -C /config/
# Restart HA
```
After restart: `ha_entities` should fill. Also restore HA backup file ID `1mLE1S9dSvxl0RYQnCt020WT-UZnQuxqP` from Google Drive via HA UI.
### E. Push to GitHub + Verify Auto-Deploy
The fixes in this session need to be committed and pushed so VM211 picks them up (webhook deploy):
```bash
cd 'C:\Users\myron\repos\jarvis'
git add -A && git commit -m "Fix nginx PHP, API paths, Windows installer, install.sh URL"
git push
```
After push: verify VM211 auto-deployed (`journalctl -u jarvis-deploy -n 20` on VM211).
### F. Install Agent on Ollama VM106 (10.48.200.210): new VM, no agent
```bash
# From PVE1:
ssh root@10.48.200.210 "curl -sk https://jarvis.orbishosting.com/agent/install.sh | bash -s ollama106 linux"
```
---
## 🟠 HIGH — Deploy Agents to All Hosts
**Linux/Proxmox (LAN) one-liner:**
```bash
curl -sk https://jarvis.orbishosting.com/agent/install.sh | JARVIS_URL=http://10.48.200.211 bash -s <hostname> <linux|proxmox>
```
**Linux (External/DO):**
```bash
curl -sk https://jarvis.orbishosting.com/agent/install.sh | bash -s <hostname> linux
```
**Windows (Admin PowerShell):**
```powershell
$env:JARVIS_REG_KEY='f846a9aaf7ce9a61742c63c87c4186052a71d2a580c65518'
irm https://jarvis.orbishosting.com/agent/install-windows.ps1 | iex
```
**Mac:**
```bash
curl -sk https://jarvis.orbishosting.com/agent/install-mac.sh | bash -s -- --key f846a9aaf7ce9a61742c63c87c4186052a71d2a580c65518
```
**Deployment status table:**
| Host | IP | Type | Status | Action Needed |
|------|-----|------|--------|--------|
| PVE1 | 10.48.200.90 | proxmox | ❓ check | Verify in Workers tab |
| PVE2 | 10.48.200.91 | proxmox | ❓ check | Verify in Workers tab |
| JARVIS VM211 | 10.48.200.211 | linux | ❓ check | Self-monitors |
| Ollama VM106 | 10.48.200.210 | linux | ❌ missing | Install now (see "Install Agent on Ollama VM106" under Critical) |
| Jellyfin VM112 | 10.48.200.33 | linux | ❓ check | Verify in Workers tab |
| MediaStack VM103 | 10.48.200.35 | linux | ❓ check | Verify in Workers tab |
| HomeBridge VM118 | 10.48.200.18 | linux | ❓ check | Verify in Workers tab |
| NovaCPX VM120 | 10.48.200.110 | linux | ❓ check | Verify in Workers tab |
| SynchroNet VM100 | 10.48.200.50 | linux | ❌ missing | Install if SSH accessible |
| NetworkBackup | 10.48.200.99 | linux | ❓ check | Verify in Workers tab |
| HA VM109 | 10.48.200.97 | homeassistant | ❌ missing | See Critical D |
| CT110 WireGuard | 10.48.200.67 | linux | ❌ missing | apk add python3, install agent |
| DO Server | 165.22.1.228 | linux | ❓ check | Verify in Workers tab |
| Myron's Desktop (this PC) | DHCP | windows | ❌ missing | Run install-windows.ps1 |
| Windows VM104 | DHCP | windows | ❌ missing | Run install-windows.ps1 |
| mini_it12 | 10.48.200.87 | windows | ❌ offline | Last seen 2026-06-12, re-install |
| Mac machines | any | macos | ❌ missing | Run install-mac.sh |
---
## 🟠 HIGH — Self-Healing Details
**Linux** — systemd `Restart=always` + `RestartSec=10`. Already built into install.sh service file.
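As an illustration of the restart policy named above, the following sketch prints a minimal unit file. It is not the unit that `install.sh` ships: the description, the `network-online.target` ordering, and the interpreter path are assumptions, and `agent_unit` is a hypothetical helper.

```bash
# Print a systemd unit carrying the restart policy described above.
# usage: agent_unit <path-to-agent-script>
agent_unit() {
  cat <<EOF
[Unit]
Description=JARVIS agent (illustrative unit, not the shipped one)
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/bin/python3 $1
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF
}
```

`Restart=always` restarts the process on any exit, and `RestartSec=10` waits ten seconds between attempts so a crash loop does not spin.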
**Windows** — After install, run once to add recovery actions:
```powershell
sc.exe failure JARVISAgent reset= 86400 actions= restart/10000/restart/30000/restart/60000
```
**Self-update** — All agents auto-update every 24h. Push new agent code to GitHub → VM211 deploys → all agents pick up update within 24h. For urgent: JARVIS Admin → Workers → UPDATE button per agent.
**Alpine/OpenRC** (CT110):
```sh
#!/bin/sh
# /etc/local.d/jarvis-agent.start (must be executable: chmod +x)
nohup sh -c 'while true; do python3 /opt/jarvis-agent/jarvis-agent.py; sleep 10; done' &
```
---
## 🟠 HIGH — Windows Agent Compiled Executable
Current installer works (installs Python silently) but a standalone `.exe` is cleaner for restricted machines.
Build steps (run on any Windows machine with Python):
```powershell
pip install pyinstaller pywin32
pyinstaller --onefile --hidden-import=win32timezone --hidden-import=win32security `
C:\Users\myron\repos\jarvis\agent\jarvis-agent-windows.py -n jarvis-agent
# Upload dist/jarvis-agent.exe to VM211:/var/www/jarvis/public_html/agent/
```
Then update `install-windows.ps1` to download the exe and use `.\jarvis-agent.exe --startup auto install`.
---
## 🟡 MEDIUM — HA Integration (VM109)
Full post-rebuild checklist:
- [ ] Restore Google Drive backup (`1mLE1S9dSvxl0RYQnCt020WT-UZnQuxqP`)
- [ ] Re-install JARVIS HA component (see Critical D)
- [ ] Install Tailscale addon (Settings → Add-ons → Search Tailscale)
- [ ] Resize disk: power off → `qm resize 109 sata0 +118G` → boot → `ha os datadisk resize`
- [ ] Verify `HA_URL=http://orbisne.fortiddns.com:8123` in config.php
- [ ] After component reinstall: verify `ha_entities` fills (~2587 rows)
**HA HOME tab filter** — after backup restore, audit `ha.php` `$skipKeywords` to ensure Tuya/TP-Link/Z-Wave switches aren't being filtered out.
---
## 🟡 MEDIUM — JARVIS Server Health Checks
```bash
# SSH to VM211:
# Verify cron jobs
crontab -l
# Should have: */5 * * * * php /var/www/jarvis/api/endpoints/stats_cache.php
# */3 * * * * php /var/www/jarvis/api/endpoints/facts_collector.php
# Arc Reactor status
systemctl status jarvis-arc
curl -s http://10.48.200.211:7474/health
# Backups
ls -lh /var/backups/jarvis/
# JARVIS ping
curl -s https://jarvis.orbishosting.com/api/ping
```
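The cron check above can be automated with a small helper that ignores commented-out lines. This is a sketch; `has_cron_entry` is a hypothetical name.

```bash
# Read `crontab -l` output on stdin; succeed if an active (uncommented)
# entry mentions the given script name.
# usage: crontab -l | has_cron_entry <script-name>
has_cron_entry() {
  grep -v '^[[:space:]]*#' | grep -qF -- "$1"
}
```

Example on VM211: `crontab -l | has_cron_entry stats_cache.php || echo "stats_cache cron missing"`.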
---
## 🟢 LOW — FortiGate DNS + Synology Reverse Proxy
Adds `.lan` domain access. Full instructions in `project_infra_todo.md`.
Key entries: jarvis.lan → VM211:80, proxmox.lan → 10.48.200.90:8006, hoa.lan → VM109:8123.
---
## 🔑 Key Constants
| Item | Value |
|------|-------|
| Registration key | `f846a9aaf7ce9a61742c63c87c4186052a71d2a580c65518` |
| JARVIS URL (external) | `https://jarvis.orbishosting.com` |
| JARVIS LAN | `http://10.48.200.211` |
| HA Token | `eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...` (in config.php, expires 2026) |
| Proxmox API token | `root@pam!jarvis=c45b5feb-f9a9-445d-a626-14fbb959f78b` |
---
## 🔴 OPEN (pre-existing)
- [ ] **HA HOME tab — show all lights, plugs, switches** — currently 17 lights show but only 3 useful switches (Sirens, CEC Scanner, ESPHome) pass the filter. Need to audit what real smart plug/switch entities exist in HA (Tuya, TP-Link Tapo, Z-Wave, etc.) and ensure they appear in the JARVIS HOME tab. The `$skipKeywords` filter in `api/endpoints/ha.php` strips camera/HACS junk — verify real device switches aren't accidentally filtered. Also check if HA custom component is syncing all entity domains.
+4 -3
@@ -30,14 +30,15 @@ All services run as root (NFS ACL requires root for writes).
### wg0 — Internet kill-switch (primary VPN)
- **Interface:** `wg0` | **VPN IP:** `10.200.0.4/24`
- **Endpoint:** CT110 at `10.48.200.67:51821` → DO server (165.22.1.228) → internet
- **Exit IP:** `165.22.1.228` (DO server, verified 2026-06-24)
- **Endpoint:** CT110 at `10.48.200.67:51821` → NordVPN (us9156, 2.56.190.66:51820) → internet
- **Exit IP:** `2.56.190.69` (NordVPN US, verified 2026-06-29)
- **Kill-switch:** iptables rules — REJECT all non-wg0 non-fwmark traffic; LAN 10.48.200.0/24 always allowed
- **Config:** `/etc/wireguard/wg0.conf` — fwmark hardcoded as `51820` (not dynamic, avoids PostDown race)
- **Auto-start:** `systemctl enable wg-quick@wg0` (enabled 2026-06-24)
- **DNS:** `10.48.200.90` (PVE1 dnsmasq)
- **MediaStack pubkey:** `CaG79S1fJeJDlYCMhHz8BrDfizBq+OiGnO5VzFIk3gE=`
- **CT110 pubkey:** `RXxDgIAaie4n0BxBA48rlmt9BJyp2GEktENeQDlc4hA=`
- **CT110 pubkey:** `Fqb1KLfHe1r3+Hwhem7YGZB2KikGYy/8pPsOIP4rn18=` (updated 2026-06-29 — old key was RXxD...)
- **NordVPN exit IP:** 2.56.190.69 (us9156.nordvpn.com) — verified 2026-06-29
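The kill-switch described above follows the pattern documented in the `wg-quick` man page: reject anything leaving a non-WireGuard interface unless it carries the tunnel's fwmark. The sketch below prints rules of that shape for review. It is not a copy of the rules in `/etc/wireguard/wg0.conf`, which are not shown in these notes, and `killswitch_rules` is a hypothetical helper.

```bash
# Print kill-switch rules for review; nothing is applied.
# Each -I inserts at the top of OUTPUT, so the later LAN ACCEPT
# ends up above the REJECT and local traffic is never blocked.
# usage: killswitch_rules <iface> <fwmark> <lan_cidr>
killswitch_rules() {
  printf 'iptables -I OUTPUT ! -o %s -m mark ! --mark %s -m addrtype ! --dst-type LOCAL -j REJECT\n' "$1" "$2"
  printf 'iptables -I OUTPUT -d %s -j ACCEPT\n' "$3"
}
```

Example with the values from this section: `killswitch_rules wg0 51820 10.48.200.0/24`.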
### wg1 — Jellyfin media access (NOT internet VPN)
- MediaStack is WireGuard server on `wg1` (port 51820, 10.200.0.1/24)