Scope: This checklist targets bare-metal / VM deployments managed by `systemd` (what `install.sh` produces). Container / Kubernetes deployments require a separate hardening review; most items below do not apply there.

Work top-to-bottom before exposing the dashboard outside your host. Each item is independently verifiable.
- Ran `./tools/gen-certs.sh` on the target machine (never reused from a package / shared drive).
- SAN includes every hostname / IP users will access the dashboard through. Example: `./tools/gen-certs.sh --host monitor.corp.local --ip 10.0.1.5`.
- `certs/server.key` has mode `0600` and is owned by the service run user.
- `certs/server.crt` has mode `0644` and is owned by the service run user.
- If using an organization CA cert, files are placed (or symlinked) as `certs/server.crt` / `certs/server.key`. `install.sh` respects symlinks and will NOT `chmod` their targets.
- Certificate expiry date is tracked (e.g. in a calendar reminder). Verify with `openssl x509 -in certs/server.crt -noout -dates`.
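
The mode and expiry checks above can be scripted. The following is a minimal sketch, not part of the project: the helper names `check_mode` and `cert_valid_for_days` are hypothetical, and it assumes GNU `stat` and the `openssl` CLI are available on the host. `stat -L` follows symlinks, so for an organization CA cert the mode reported is that of the link target.

```shell
#!/bin/sh
# Hypothetical helpers for the certificate items of the checklist.

# check_mode PATH EXPECTED
# Succeeds only if PATH (symlinks followed) has exactly the octal mode EXPECTED,
# written without the leading zero (e.g. 600 for 0600).
check_mode() (
    actual=$(stat -L -c '%a' "$1") || exit 1
    if [ "$actual" != "$2" ]; then
        echo "FAIL: $1 has mode $actual, expected $2" >&2
        exit 1
    fi
)

# cert_valid_for_days CERT DAYS
# Succeeds if CERT will still be valid DAYS days from now.
# Relies on `openssl x509 -checkend SECONDS`.
cert_valid_for_days() (
    openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
)
```

A possible invocation from the install directory: `check_mode certs/server.key 600 && check_mode certs/server.crt 644 && cert_valid_for_days certs/server.crt 30`. The 30-day window is an arbitrary example value.
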
- `$INSTALL_DIR/data/` is mode `0700` and owned by `etcdmonitor:etcdmonitor`.
- `$INSTALL_DIR/data/etcdmonitor.db` is mode `0600`.
- `$INSTALL_DIR/data/initial-admin-password` (if still present) is mode `0600`. If present, change the admin password to make this file auto-destruct.
- `$INSTALL_DIR/logs/` is owned by the run user; log files are readable only by the run user.
- No world-writable files anywhere under `$INSTALL_DIR`: `find $INSTALL_DIR -type f -perm -0002 -print` returns nothing.
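
The world-writable scan can be wrapped so that it fails loudly in a deployment script instead of relying on someone reading empty output. This is a sketch only; the function names are hypothetical and the `find` expression is the one from the checklist.

```shell
#!/bin/sh
# Hypothetical wrappers around the checklist's `find` command.

# list_world_writable DIR
# Prints every regular file under DIR whose "other" write bit is set.
list_world_writable() (
    find "$1" -type f -perm -0002 -print
)

# assert_no_world_writable DIR
# Succeeds only if the scan prints nothing; otherwise lists offenders on stderr.
assert_no_world_writable() (
    found=$(list_world_writable "$1")
    if [ -n "$found" ]; then
        echo "FAIL: world-writable files found:" >&2
        echo "$found" >&2
        exit 1
    fi
)
```

Usage would look like `assert_no_world_writable "$INSTALL_DIR" || exit 1`.
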
- `systemctl status etcdmonitor` shows `active (running)`.
- Service is NOT running as root. The default `install.sh` keeps root for upgrade continuity, so for production you must explicitly run `sudo useradd -r -s /sbin/nologin -d <INSTALL_DIR> etcdmonitor` followed by `sudo ./install.sh --run-user etcdmonitor`. Verify that `systemctl show -p User etcdmonitor` reports `User=etcdmonitor`.
- `journalctl -u etcdmonitor | grep -i 'running.*as root'` is empty, confirming that no WARN about root mode was emitted during the last install.
- `systemd-analyze security etcdmonitor` reports an exposure value of ≤ 3.0 (`OK`). Review any red entries and consider adding: `ProtectKernelTunables=true`, `ProtectKernelModules=true`, `RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6`.
- Unit file (`/etc/systemd/system/etcdmonitor.service`) contains all of: `NoNewPrivileges=true`, `ProtectSystem=strict`, `ProtectHome=true`, `PrivateTmp=true`, `PrivateDevices=true`, `RestrictSUIDSGID=true`, `RestrictNamespaces=true`, `LockPersonality=true`, `MemoryDenyWriteExecute=true`, `CapabilityBoundingSet=`, `AmbientCapabilities=`, `ReadWritePaths=$INSTALL_DIR/data $INSTALL_DIR/logs`.
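
Checking a dozen directives by eye is error-prone, so the unit-file item lends itself to a small script. The sketch below uses a hypothetical function name and assumes each directive sits on its own line with no surrounding whitespace. Because `ReadWritePaths=` depends on `$INSTALL_DIR`, only its presence is checked, not its value.

```shell
#!/bin/sh
# missing_directives UNIT_FILE
# Prints each hardening directive from the checklist that is absent from
# UNIT_FILE, one per line. Prints nothing when all are present.
missing_directives() (
    unit=$1
    # These must appear as exact lines; the two empty-valued entries
    # (CapabilityBoundingSet=, AmbientCapabilities=) must stay empty.
    for d in \
        NoNewPrivileges=true ProtectSystem=strict ProtectHome=true \
        PrivateTmp=true PrivateDevices=true RestrictSUIDSGID=true \
        RestrictNamespaces=true LockPersonality=true \
        MemoryDenyWriteExecute=true CapabilityBoundingSet= AmbientCapabilities=
    do
        grep -qxF -- "$d" "$unit" || echo "$d"
    done
    # The value depends on $INSTALL_DIR, so only the key is checked here.
    grep -q '^ReadWritePaths=' "$unit" || echo "ReadWritePaths="
)
```

Running `missing_directives /etc/systemd/system/etcdmonitor.service` and getting empty output would correspond to the item being satisfied.
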
- On first start, initial admin password was retrieved via `cat $INSTALL_DIR/data/initial-admin-password` and then changed immediately via the dashboard login flow.
- `$INSTALL_DIR/data/initial-admin-password` no longer exists (auto-deleted after first password change).
- `config.yaml` `auth.bcrypt_cost` is within [8, 14]. Default 10 is fine.
- `config.yaml` `auth.lockout_threshold` and `lockout_duration_seconds` are appropriate for your ops team's tolerance.
- Emergency recovery procedure is documented: operators know to run `etcdmonitor reset-password --username admin` or `etcdmonitor unlock --username admin` from the service host.
- `config.yaml` `server.session_timeout` matches your threat model. Default `3600` (1 hour) is the safe baseline. Setting it to `0` makes the dashboard session never expire. That is convenient for trusted workstations on a LAN, but it is a deliberate security tradeoff: a stolen browser cookie (or unattended workstation) keeps full dashboard access until the operator explicitly hits Logout, the service is restarted, or `etcdmonitor unlock` is used out-of-band. Negative values are rejected and fall back to `3600`, with a WARN line on stderr at startup.
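
The numeric rules in this section (the `bcrypt_cost` range and the three cases for `session_timeout`) can be written down as a pre-flight check. The sketch below only mirrors the rules as stated in this checklist; it is not the application's own validation code, the function names are hypothetical, and extracting the values from `config.yaml` is left to the caller.

```shell
#!/bin/sh
# bcrypt_cost_ok COST
# Succeeds if COST is a non-negative integer within [8, 14].
bcrypt_cost_ok() (
    case $1 in ''|*[!0-9]*) exit 1 ;; esac
    [ "$1" -ge 8 ] && [ "$1" -le 14 ]
)

# effective_session_timeout VALUE
# Prints the timeout (in seconds) that the checklist says will apply:
#   negative -> falls back to 3600, with a warning on stderr
#   0        -> 0, meaning the session never expires (noted on stderr)
#   other    -> the value itself
effective_session_timeout() (
    case $1 in
        -*) echo "WARN: negative session_timeout rejected, using 3600" >&2
            echo 3600 ;;
        0)  echo "NOTE: session_timeout=0 means sessions never expire" >&2
            echo 0 ;;
        *)  echo "$1" ;;
    esac
)
```

For example, `bcrypt_cost_ok 10` succeeds while `bcrypt_cost_ok 15` fails, and `effective_session_timeout -1` prints `3600`.
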
- Application logs rotate correctly (check `config.yaml` `log.max_size_mb` / `max_files` / `compress`).
- `ops_audit_log` table is being written (query `SELECT COUNT(*) FROM ops_audit_log` after one login attempt).
- Audit retention (`config.yaml` `ops.audit_retention_days`) matches your compliance requirements.
- journald integration works: `journalctl -u etcdmonitor -n 50` shows recent entries.
- No plaintext passwords in logs: `grep -i password $INSTALL_DIR/logs/*.log` returns nothing except redaction notices.
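
The plaintext-password scan becomes easier to automate if redaction notices are filtered out, so that any remaining output is a finding. The sketch below is built on an assumption that is not confirmed by this document: that redaction notices contain the word "redacted". Adjust the filter to whatever the application actually emits. The function name is hypothetical.

```shell
#!/bin/sh
# password_leaks LOG_DIR
# Prints log lines that mention "password" and are NOT redaction notices.
# ASSUMPTION: redaction notices contain the word "redacted" (case-insensitive).
# Prints nothing, and returns non-zero, when no suspicious line is found.
password_leaks() (
    grep -i -- 'password' "$1"/*.log 2>/dev/null | grep -vi -- 'redacted'
)
```

A deployment script could then use `[ -z "$(password_leaks "$INSTALL_DIR/logs")" ] || exit 1`.
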
- `curl -sI https://<host>:<port>/ | grep -i Content-Security-Policy` shows the strict CSP (no `'unsafe-eval'`, no `cdn.jsdelivr.net`). The `-i` matters because HTTP/2 responses carry lowercase header names.
- Dashboard loaded successfully with no network connectivity to cdn.jsdelivr.net / unpkg (verifies echarts vendoring works offline).
- `grep -rn -E 'src="https?://|href="https?://' web/*.html` returns no results (only SVG `xmlns` namespaces allowed).
- Firewall (`iptables` / `firewalld` / cloud SG) restricts dashboard port to the trusted operator network, NOT `0.0.0.0/0`.
- If placing behind a reverse proxy, verify `X-Forwarded-For` is forwarded (for audit log client IPs).
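
The CSP item can be turned into a pass/fail check on the header line itself. This sketch (hypothetical function name) tests only the two conditions named in the checklist, the absence of `'unsafe-eval'` and of `cdn.jsdelivr.net`, and does not attempt to parse the policy.

```shell
#!/bin/sh
# csp_is_strict HEADER_LINE
# Succeeds if HEADER_LINE is a Content-Security-Policy header that contains
# neither 'unsafe-eval' nor cdn.jsdelivr.net. The header name is matched
# case-insensitively, since HTTP/2 responses use lowercase names.
csp_is_strict() (
    header=$1
    printf '%s\n' "$header" | grep -qi '^content-security-policy:' || exit 1
    case $header in
        *"'unsafe-eval'"*|*cdn.jsdelivr.net*) exit 1 ;;
    esac
)
```

It could be fed from the command in the checklist: `csp_is_strict "$(curl -sI https://<host>:<port>/ | grep -i '^content-security-policy:')"`.
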
- Before upgrade: back up `$INSTALL_DIR/data/etcdmonitor.db` and `/etc/systemd/system/etcdmonitor.service`.
- Upgrade procedure documented for operators: `systemctl stop etcdmonitor` → replace binary / configs → `sudo ./install.sh` → `systemctl status etcdmonitor`.
- Rollback procedure tested at least once in a staging environment.
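
The pre-upgrade backup can be captured in a small helper so that it is performed the same way every time. This is a sketch under assumptions: the function name, the timestamped directory layout, and the backup root are all hypothetical. Taking the copy after `systemctl stop etcdmonitor` is probably the safer order, so that the database file is not being written while it is copied.

```shell
#!/bin/sh
# backup_before_upgrade INSTALL_DIR UNIT_FILE BACKUP_ROOT
# Copies the database and the unit file into a new timestamped directory
# under BACKUP_ROOT and prints that directory's path. Fails if either
# source file is missing.
backup_before_upgrade() (
    dest="$3/etcdmonitor-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || exit 1
    # -p preserves mode, ownership (when permitted) and timestamps.
    cp -p "$1/data/etcdmonitor.db" "$dest/" || exit 1
    cp -p "$2" "$dest/" || exit 1
    echo "$dest"
)
```

Example call: `backup_before_upgrade "$INSTALL_DIR" /etc/systemd/system/etcdmonitor.service /var/backups`, where `/var/backups` is only an illustrative destination.
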
- SHA256 of deployed binary recorded (`sha256sum $INSTALL_DIR/etcdmonitor` in a deployment notebook / CMDB).
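
Recording the hash is more useful if it can be re-checked later. A minimal sketch, assuming the record is kept as a plain file in `sha256sum` format rather than in a CMDB; the function names and manifest path are hypothetical. Pass an absolute path to the binary so the manifest can be verified from any working directory.

```shell
#!/bin/sh
# record_sha256 BINARY MANIFEST
# Writes BINARY's SHA256 to MANIFEST in the format `sha256sum -c` expects.
record_sha256() (
    sha256sum "$1" > "$2"
)

# verify_sha256 MANIFEST
# Succeeds only if every file listed in MANIFEST still matches its hash.
verify_sha256() (
    sha256sum -c "$1" >/dev/null 2>&1
)
```

After deployment, `record_sha256 "$INSTALL_DIR/etcdmonitor" /root/etcdmonitor.sha256` would store the value, and a later `verify_sha256 /root/etcdmonitor.sha256` would detect a changed binary.
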
If any item is unchecked, the deployment is not ready for production. Fix the item; do not add an exception.
This release does not fully eliminate the following exposures. They are tracked as follow-up work; operators should be aware of them.
- CSP `script-src 'unsafe-inline'` retained. The login page, change-password page, and dashboard main page currently rely on inline `<script>` blocks and inline `on*` event handler attributes (~59 in `index.html`). Removing `'unsafe-inline'` would break the dashboard immediately. The eventual fix (extract inline scripts to `.js` files, migrate handlers to `addEventListener`, adopt per-request nonces) is tracked for a future OpenSpec change. In the meantime, the primary XSS defense is `escapeHTML()` on all backend-data interpolation (see `web/util.js` and `web/ops.js`).
- CSP `style-src 'unsafe-inline'` retained. The frontend uses many dynamic `style=""` attributes for theming and panel sizing. Removing it requires a styling-system refactor, out of scope here.
- Git history still contains the revoked example TLS key. See the Historical Advisories section in `SECURITY.md`. The key is self-signed with no CA trust and cannot impersonate a CA-signed certificate, but operators must not reuse it. `install.sh` enforces this by refusing to start if `tls_enable: true` is set without a freshly generated local cert.
- Default `install.sh` still runs the service as `root`. This keeps the upgrade path continuous with 0.8.x. For production deployments, switch to a dedicated user: see Section 3 above.
Each of these is a conscious tradeoff, not an oversight — review them against your threat model before deployment.