Administrator Guide
This guide covers environment setup, configuration, RBAC management, monitoring, backups, and routine maintenance for the EQMS MVP deployed via Docker Compose on a single VM.
Audience: system administrators, DevOps engineers, and QA leads responsible for operations.
1. System Overview
- Single VM running Docker Compose
- Services:
- Postgres (db)
- FastAPI backend (backend)
- Next.js frontend (frontend)
- Persistent data:
- postgres_data (DB)
- backend_uploads (documents)
- backend_chroma (vector index)
- Email: SendGrid SMTP (pilot)
- Future: S3/GCP buckets for documents (placeholders included)
2. Environment Configuration
Files:
- docker/docker-compose.prod.yml
- docker/.env.prod (copy from docker/.env.prod.example and fill in the values)
Key variables:
- DATABASE_URL: postgresql://eqms_user:eqms_password@db/eqms_db (example; replace the placeholder credentials)
- SECRET_KEY: long random string
- STORAGE_TYPE: local (pilot)
- SMTP_HOST, SMTP_PORT, SMTP_USER, SMTP_PASSWORD, SMTP_FROM (SendGrid)
- BACKEND_CORS_ORIGINS: your frontend domain(s)
- BACKEND_WORKERS: integer (2-4 typical for small VM)
Create env file and validate:
- cp docker/.env.prod.example docker/.env.prod
- Edit docker/.env.prod with secrets (do not commit)
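The validation step can be sketched as a small script that checks the env file before deployment. This is a minimal sketch, not part of the EQMS codebase: the function names are hypothetical, the required keys are taken from the "Key variables" list above, and the 32-character minimum for SECRET_KEY is an assumption.

```python
# Keys from the "Key variables" list; adjust if your deployment differs.
REQUIRED_KEYS = [
    "DATABASE_URL", "SECRET_KEY", "STORAGE_TYPE",
    "SMTP_HOST", "SMTP_PORT", "SMTP_USER", "SMTP_PASSWORD", "SMTP_FROM",
    "BACKEND_CORS_ORIGINS", "BACKEND_WORKERS",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first "=" only
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_problems(values: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the file looks sane."""
    problems = [f"{key} is missing or empty"
                for key in REQUIRED_KEYS if not values.get(key)]
    workers = values.get("BACKEND_WORKERS", "")
    if workers and not workers.isdigit():
        problems.append("BACKEND_WORKERS must be an integer")
    # Assumption: anything under 32 characters is too short for a signing key.
    if 0 < len(values.get("SECRET_KEY", "")) < 32:
        problems.append("SECRET_KEY looks too short")
    return problems
```

To use it, read docker/.env.prod as text, pass the result of parse_env to find_problems, and abort the deployment if the returned list is not empty.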
3. Deployment
Build images:
- docker compose -f docker/docker-compose.prod.yml --env-file docker/.env.prod build
Start services:
- docker compose -f docker/docker-compose.prod.yml --env-file docker/.env.prod up -d
Check health:
- docker ps (ensure every service reports a healthy status)
- curl http://<host>:8000/health → {"status":"healthy"}
- Visit frontend at http://<host>:3000
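The backend health check above can also be scripted, for example to wait until the service is ready after a restart. The following is a minimal sketch using only the Python standard library; the function names, retry count, and delay are assumptions, and the expected response body is the one shown in the curl example.

```python
import json
import time
import urllib.error
import urllib.request


def is_healthy(body: bytes) -> bool:
    """Return True if the body is a JSON object with "status": "healthy"."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def wait_for_backend(url: str, attempts: int = 10, delay: float = 3.0) -> bool:
    """Poll the health URL until it reports healthy or the attempts run out."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200 and is_healthy(resp.read()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # backend not reachable yet; retry below
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_for_backend("http://<host>:8000/health") with the real host name returns True once the backend answers with the healthy status.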
Logs:
- docker compose -f docker/docker-compose.prod.yml logs -f backend
- docker compose -f docker/docker-compose.prod.yml logs -f db
- docker compose -f docker/docker-compose.prod.yml logs -f frontend
Stop/Restart:
- docker compose -f docker/docker-compose.prod.yml down
- docker compose -f docker/docker-compose.prod.yml up -d
4. RBAC Management
Roles are seeded with permissions aligned to modules:
- User: minimal read access plus the ability to mark training as complete
- QA: approvals, updates, reports
- Auditor: read-only + export
- Admin: user and role management
Typical tasks:
- Create user accounts (via API or admin page if present)
- Assign roles (via API endpoints under /roles/)
- Verify least privilege (attempt restricted actions with test accounts)
RBAC maintenance:
- After any role change, re-run a small validation matrix: log in, view documents, and attempt a restricted action (which should fail).
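One way to keep the validation matrix repeatable is to write the expected outcomes down as data and compare them with what test accounts actually observe. The sketch below is illustrative only: the action names are hypothetical labels, the expected values are an interpretation of the role descriptions above, and collecting the observed results (by calling the API with each test account) is left out.

```python
# Expected outcome per (role, action); True means the action should succeed.
# Action names are hypothetical labels, not EQMS permission identifiers.
EXPECTED = {
    ("User", "view_document"): True,
    ("User", "approve_document"): False,
    ("User", "manage_users"): False,
    ("QA", "approve_document"): True,
    ("QA", "manage_users"): False,
    ("Auditor", "view_document"): True,
    ("Auditor", "export_report"): True,
    ("Auditor", "approve_document"): False,
    ("Admin", "manage_users"): True,
}


def find_violations(expected: dict, observed: dict) -> list[str]:
    """Compare observed results with the expected matrix and list mismatches."""
    violations = []
    for (role, action), allowed in sorted(expected.items()):
        got = observed.get((role, action))
        if got is None:
            violations.append(f"{role}/{action}: not tested")
        elif got != allowed:
            want = "allowed" if allowed else "denied"
            have = "allowed" if got else "denied"
            violations.append(f"{role}/{action}: expected {want}, got {have}")
    return violations
```

An empty result means every tested cell matched; any entry points at a role that gained or lost access unexpectedly.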
5. Secrets Management
- Store secrets in docker/.env.prod
- File permissions: chmod 600 docker/.env.prod
- Rotation:
- SECRET_KEY rotation invalidates existing JWTs, so all users must log in again
- Update .env.prod and restart backend
- Future: migrate to Docker secrets or a secret manager (e.g., AWS Secrets Manager, GCP Secret Manager, or HashiCorp Vault)
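For rotation, a new SECRET_KEY can be generated with Python's standard secrets module. This is a minimal sketch; the function name and the 64-byte length are assumptions, not requirements stated by the EQMS backend.

```python
import secrets


def generate_secret_key(num_bytes: int = 64) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)
```

Paste the generated value into docker/.env.prod as SECRET_KEY and restart the backend as described above.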
6. CORS and HTTPS
- Set BACKEND_CORS_ORIGINS to the frontend domain(s).
- MVP may run on HTTP for internal testing.
- Production hardening: add a reverse proxy (Caddy/Nginx/Traefik) to terminate HTTPS and enforce security headers.
7. Monitoring and Logging
Healthchecks:
- Compose healthchecks for db/backend/frontend
- External uptime monitoring recommended (check /health and frontend /)
Logs:
- docker logs are sufficient for MVP
- Backend LOG_LEVEL controls verbosity
- Future: centralize logs (ELK, Loki, Cloud Logging)
Metrics/Tracing (Roadmap):
- Prometheus/OpenTelemetry hooks can be added later
8. Backups and Recovery
Database backups (pg_dump):
- Example host cron (adjust container name and paths; in a crontab, % must be escaped as \%, and -t is omitted so that no TTY alters the dump output):
- 0 2 * * * docker exec eqms-db pg_dump -U eqms_user eqms_db > /var/backups/eqms_db_$(date +\%F).sql
- Retention: keep 7-30 days per policy
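Retention can be enforced by a small script that selects dump files older than the retention window. The sketch below assumes the file naming from the cron example (eqms_db_YYYY-MM-DD.sql); the function name and the 7-day default are assumptions, and the actual deletion is left to the caller.

```python
import re
from datetime import date, timedelta

# Matches the file names produced by the cron example: eqms_db_2024-03-01.sql
BACKUP_NAME = re.compile(r"^eqms_db_(\d{4}-\d{2}-\d{2})\.sql$")


def expired_backups(filenames, today: date, keep_days: int = 7) -> list[str]:
    """Return the backup file names dated before today minus keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in filenames:
        match = BACKUP_NAME.match(name)
        if not match:
            continue  # never touch files that are not recognized dumps
        try:
            taken = date.fromisoformat(match.group(1))
        except ValueError:
            continue  # date-shaped but invalid, e.g. month 13
        if taken < cutoff:
            expired.append(name)
    return sorted(expired)
```

A caller would list /var/backups, pass the names to expired_backups with date.today(), and delete only the returned files. Unrecognized files are skipped on purpose so a naming mistake cannot remove unrelated data.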
Volumes to back up:
- backend_uploads (documents)
- backend_chroma (optional but recommended)
- postgres_data (or rely on pg_dump files for logical backup)
Restore procedure:
- Stop services: docker compose -f docker/docker-compose.prod.yml down
- Restore volumes from backup (documents/chroma)
- Restore DB:
- docker compose -f docker/docker-compose.prod.yml up -d db
- docker exec -i eqms-db psql -U eqms_user -d eqms_db < /path/to/backup.sql
- Start all services: docker compose -f docker/docker-compose.prod.yml up -d
- Validate with post-deploy checklist
Disaster recovery test:
- At least quarterly, perform a full restore on a staging environment and validate the result.
9. Performance Tuning
Backend:
- BACKEND_WORKERS: set ~2-4 for small VM
- SQLAlchemy pool params (if configured): tune pool_size and max_overflow for DB load
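When tuning these values together, it helps to check the worst-case connection count against the database limit. The sketch below assumes each backend worker process holds its own SQLAlchemy pool, uses SQLAlchemy's QueuePool defaults (pool_size=5, max_overflow=10) and the Postgres default max_connections=100, and reserves an assumed headroom of 10 connections for admin tools such as pg_dump and psql. The function names are hypothetical.

```python
def max_db_connections(workers: int, pool_size: int = 5, max_overflow: int = 10) -> int:
    """Worst-case connections if every worker exhausts its pool and overflow."""
    return workers * (pool_size + max_overflow)


def fits_postgres(workers: int, pool_size: int = 5, max_overflow: int = 10,
                  pg_max_connections: int = 100, reserved: int = 10) -> bool:
    """True if the worst case stays within max_connections minus the headroom."""
    worst_case = max_db_connections(workers, pool_size, max_overflow)
    return worst_case <= pg_max_connections - reserved
```

With 4 workers and the defaults, the worst case is 4 × (5 + 10) = 60 connections, which fits. With 8 workers it is 120, which exceeds the default limit, so either the pool sizes or max_connections would need adjusting.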
Frontend:
- NODE_ENV=production
- Avoid dev logging in production
Postgres:
- For low load, defaults are fine. For higher load, tune shared_buffers, work_mem, etc. as needed.
10. Maintenance Tasks
- Apply security updates by rebuilding images periodically
- Rotate secrets per policy
- Review audit trail periodically for anomalies
- Verify backups complete and are restorable
- Monitor disk usage of volumes (particularly uploads and DB)
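Disk usage monitoring can be scripted with the standard library and run from cron. This is a minimal sketch: the function names and the 80% threshold are assumptions, and the path to check depends on where Docker stores volumes on the VM (by default /var/lib/docker/volumes on Linux).

```python
import shutil


def usage_percent(path: str) -> float:
    """Return the used percentage of the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def over_threshold(paths, threshold: float = 80.0) -> list[str]:
    """Return the paths whose filesystem usage is at or above the threshold."""
    return [path for path in paths if usage_percent(path) >= threshold]
```

A cron job could call over_threshold(["/var/lib/docker/volumes", "/var/backups"]) and send an alert when the list is not empty.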
11. Email Configuration (SendGrid)
- SMTP_HOST=smtp.sendgrid.net
- SMTP_PORT=587
- SMTP_USER=apikey
- SMTP_PASSWORD=<API_KEY>
- SMTP_FROM=innoqualis@yourdomain
Validation:
- Trigger a test notification (if the feature is available) or send a minimal email via a custom admin endpoint or script.
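If no test notification feature is available, a standalone script can send a minimal email with the same SMTP settings. The sketch below uses Python's smtplib with STARTTLS on port 587; the function names, subject, and body text are assumptions, and it is not the EQMS backend's own mail code.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender: str, recipient: str) -> EmailMessage:
    """Build a plain-text test message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "EQMS SMTP test"
    msg.set_content("This is a test message to verify the SMTP configuration.")
    return msg


def send_test_message(host: str, port: int, user: str, password: str,
                      sender: str, recipient: str) -> None:
    """Send the test message over SMTP, upgrading the connection with STARTTLS."""
    msg = build_test_message(sender, recipient)
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)
```

Call send_test_message with the SMTP_HOST, SMTP_PORT, SMTP_USER, SMTP_PASSWORD, and SMTP_FROM values from docker/.env.prod and a mailbox you control as the recipient. An exception from smtplib indicates a connection or authentication problem.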
12. S3/GCP (Future)
- STORAGE_TYPE=s3 or gcp
- Configure bucket name and credentials:
- AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_BUCKET
- GCP_PROJECT_ID, GCP_BUCKET_NAME, GOOGLE_APPLICATION_CREDENTIALS
- Enable server-side encryption and least-privilege IAM roles
- Update documentation and DR procedures accordingly
13. Validation and Checklists
Use the following operational checklists:
- docs/operations-checklists/pre-deploy.md
- docs/operations-checklists/post-deploy-validation.md
- docs/operations-checklists/backup-restore.md