Deployment Guide
This guide covers deploying the InnoQualis Electronic Quality Management System (EQMS) with Docker Compose, from local development through staging to production. Each scenario below trades simplicity for scale: a single VM for development and staging, and a multi-container setup for production.
Last Updated: October 24, 2025
Version: Phase 8 In Progress (Documentation Consolidation Complete)
Status: Production Ready
Deployment Scenarios
Development Deployment
- Purpose: Local development and testing
- Architecture: Single VM with Docker Compose
- Storage: Local file system
- Database: Local PostgreSQL container
- Monitoring: Basic logging
Staging Deployment
- Purpose: Pre-production testing and validation
- Architecture: Single VM with production-like configuration
- Storage: Staging file system
- Database: Staging PostgreSQL instance
- Monitoring: Enhanced logging and basic monitoring
Production Deployment
- Purpose: Live production environment
- Architecture: Scalable multi-container deployment
- Storage: Cloud storage (S3/GCP)
- Database: Production PostgreSQL cluster
- Monitoring: Full observability stack
Prerequisites
- VM with:
  - Docker Engine 24+ and Docker Compose v2
  - 2 CPU, 4GB RAM minimum (pilot)
  - Ports 80/443 open if using a reverse proxy (optional for MVP)
- Domain and TLS (optional MVP; can terminate HTTPS behind a separate reverse proxy like Caddy/Nginx or a managed load balancer)
- OpenAI API key (optional if AI features used)
- SendGrid SMTP credentials (pilot outbound email)
Directory Structure
- backend/
- frontend/
- docker-compose.yml (dev default)
- docker/docker-compose.prod.yml (this guide references the production file to be added)
- backend/.env.example (use to create env files)
- docs/
Environments and Configuration
We use environment-specific Compose and env files, selected per environment with the Docker Compose -f and --env-file options.
- Development:
  - Use root docker-compose.yml
  - Bind mounts, hot reload, localhost ports
- Staging/Production:
  - Use docker/docker-compose.prod.yml
  - No bind mounts for app code
  - Health checks, restart policies, pinned images, smaller images
  - Persistent volumes
Create example environment files:
- docker/.env.staging.example
- docker/.env.prod.example
Copy each example to a real env file that holds the actual secrets:
- docker/.env.staging (do not commit)
- docker/.env.prod (do not commit)
Required environment variables (superset for forward compatibility):
- DATABASE_URL
- SECRET_KEY
- OPENAI_API_KEY (if AI features enabled)
- STORAGE_TYPE: local | s3 | gcp
- SMTP_HOST, SMTP_PORT, SMTP_USER, SMTP_PASSWORD, SMTP_FROM
- AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_S3_BUCKET (future)
- GCP_PROJECT_ID, GCP_BUCKET_NAME, GOOGLE_APPLICATION_CREDENTIALS (future)
- BACKEND_CORS_ORIGINS (comma-separated, e.g. https://innoqualis.example.com)
- BACKEND_WORKERS (Uvicorn workers hint)
- LOG_LEVEL (info, warning, error)
Note: For pilot, set STORAGE_TYPE=local and configure volumes.
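Before starting the stack, it can help to confirm that the env file actually defines the variables listed above. The following is a minimal sketch, not part of the repository: check_env_file is a hypothetical helper, the REQUIRED_KEYS selection is an assumption about what the pilot needs (optional and future keys are left out), and it assumes plain KEY=value lines.

```shell
# Sketch: fail fast if a required key is missing or empty in an env file.
# REQUIRED_KEYS is an assumed pilot subset of the variables in this guide.
REQUIRED_KEYS="DATABASE_URL SECRET_KEY STORAGE_TYPE SMTP_HOST SMTP_PORT SMTP_USER SMTP_PASSWORD SMTP_FROM BACKEND_CORS_ORIGINS"

check_env_file() {
  env_file=$1
  if [ ! -r "$env_file" ]; then
    echo "cannot read env file: $env_file" >&2
    return 1
  fi
  missing=0
  for key in $REQUIRED_KEYS; do
    # A key counts as set only if "KEY=" is followed by at least one character.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   check_env_file docker/.env.prod && echo "env file looks complete"
```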
Production Compose (pilot)
File: docker/docker-compose.prod.yml
- Services: db, backend, frontend
- Volumes:
  - postgres_data (named volume)
  - backend_uploads (documents)
  - backend_chroma (vector db)
- Healthchecks:
  - db: pg_isready
  - backend: GET /health
  - frontend: GET /
- Restart Policy: always
- Resource limits (optional): cpus, memory reservations for stability
A reverse proxy for HTTPS termination can be added later. For the MVP internal pilot on a secured network, plain HTTP is acceptable as long as the risk acceptance is documented.
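Since the production Compose file is still to be added, the outline above might translate into something like the sketch below. It writes the YAML to a scratch path so nothing in the repository is overwritten. Everything not named in this guide is an assumption: the postgres:16 image tag, the build contexts, the container mount paths, the container ports, the healthcheck intervals, and the presence of curl inside the backend and frontend images. The eqms-db container name matches the backup example later in this guide.

```shell
# Sketch: write a candidate production Compose file to a scratch path for review.
out=${SKETCH_OUT:-/tmp/docker-compose.prod.sketch.yml}

cat > "$out" <<'EOF'
services:
  db:
    image: postgres:16                # assumption: pin the version you run
    container_name: eqms-db
    restart: always
    environment:
      POSTGRES_USER: eqms_user
      POSTGRES_PASSWORD: eqms_password
      POSTGRES_DB: eqms_db
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U eqms_user -d eqms_db"]
      interval: 10s
      timeout: 5s
      retries: 5
  backend:
    build: ../backend                 # assumption: context relative to docker/
    restart: always
    env_file: .env.prod               # resolved relative to this file's directory
    depends_on:
      db:
        condition: service_healthy
    ports:
      - "8000:8000"
    volumes:
      - backend_uploads:/app/uploads  # assumption: container path
      - backend_chroma:/app/chroma    # assumption: container path
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
  frontend:
    build: ../frontend                # assumption: context relative to docker/
    restart: always
    depends_on:
      - backend
    ports:
      - "3000:3000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/"]
      interval: 30s
      timeout: 5s
      retries: 3
volumes:
  postgres_data:
  backend_uploads:
  backend_chroma:
EOF

echo "wrote $out"
```

After adapting the file and copying it to docker/docker-compose.prod.yml, running docker compose -f docker/docker-compose.prod.yml --env-file docker/.env.prod config should confirm that it parses.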
Step-by-Step Deployment
- Create the .env files
  - Copy docker/.env.prod.example to docker/.env.prod.
  - Update values:
    - SECRET_KEY: random 32+ chars
    - DATABASE_URL: postgresql://eqms_user:eqms_password@db/eqms_db
    - SMTP_HOST=smtp.sendgrid.net, SMTP_PORT=587, SMTP_USER=apikey, SMTP_PASSWORD=<SENDGRID_API_KEY>, SMTP_FROM=innoqualis@yourdomain
    - STORAGE_TYPE=local
    - BACKEND_CORS_ORIGINS=https://your-frontend-domain (or http://localhost:3000 for pilot; an origin includes the port)
- Build and start
  - From the repo root:
    - docker compose -f docker/docker-compose.prod.yml --env-file docker/.env.prod build
    - docker compose -f docker/docker-compose.prod.yml --env-file docker/.env.prod up -d
- Validate services
  - Backend: curl http://your-host:8000/health returns {"status":"healthy"}
  - Frontend: open http://your-host:3000
  - API docs: http://your-host:8000/docs
- Create initial admin (if applicable)
  - Use the auth endpoints or seed script as documented in README.
  - Ensure email deliverability with SendGrid credentials.
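For the "random 32+ chars" SECRET_KEY in the first step, one possible way to produce a value with standard tools is sketched here. generate_secret_key is a hypothetical helper name; where OpenSSL is installed, openssl rand -hex 32 gives an equivalent result.

```shell
# Sketch: generate a SECRET_KEY of 64 hex characters (32 random bytes).
generate_secret_key() {
  # od prints the bytes as hex pairs; tr strips the spaces and newlines od adds.
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

SECRET_KEY=$(generate_secret_key)
echo "SECRET_KEY length: ${#SECRET_KEY}"
```

Paste the generated value into docker/.env.prod rather than echoing it in shared terminals or logs.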
Security Hardening
- Secrets:
  - Keep secrets in Docker .env files that are not committed to git.
  - Consider Docker secrets or environment providers later.
  - Sanitize logs and avoid printing secrets.
- HTTPS:
  - MVP: plaintext acceptable on secure network.
  - Recommended next step: place Caddy/Nginx/Traefik in front for TLS termination, HSTS, and security headers (X-Content-Type-Options, X-Frame-Options, Referrer-Policy).
- CORS:
  - In backend, configure BACKEND_CORS_ORIGINS (comma separated) and reflect in FastAPI middleware.
  - For multi-domain frontends, set exact origins; avoid wildcard in production.
- JWT:
  - Rotate SECRET_KEY on a schedule and during incident response.
  - Configure appropriate token expirations (if implemented).
  - Revoke tokens after key rotation by forcing re-login.
- OS/User:
  - Backend and frontend images run as non-root in production builds (see optimized Dockerfiles).
  - Restrict container capabilities (default is fine for MVP).
- Dependencies:
  - Use slim base images and rebuild regularly to pick up security patches.
  - Scan images with a container scanner (e.g., Trivy) as part of CI (future).
- Network:
  - Use a user-defined bridge network (Compose default) to isolate services.
  - Expose only required ports publicly (3000 and 8000 for pilot).
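The CORS guidance above (exact origins, no wildcard) lends itself to a quick pre-deploy check. This is only a sketch: check_cors_origins is a hypothetical helper, not something the backend provides, and it assumes the list has no spaces around the commas.

```shell
# Sketch: reject wildcard or scheme-less entries in a comma-separated origin list.
check_cors_origins() {
  if [ -z "$1" ]; then
    echo "BACKEND_CORS_ORIGINS is empty" >&2
    return 1
  fi
  rc=0
  old_ifs=$IFS
  IFS=,
  set -f                        # keep "*" literal while the list is split
  for origin in $1; do
    case "$origin" in
      *\**) echo "wildcard origin not allowed: $origin" >&2; rc=1 ;;
      http://*|https://*) ;;    # explicit origin with a scheme
      *) echo "origin needs http:// or https://: $origin" >&2; rc=1 ;;
    esac
  done
  set +f
  IFS=$old_ifs
  return "$rc"
}

# Example:
#   check_cors_origins "https://innoqualis.example.com" && echo "origins look explicit"
```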
Monitoring and Logging
- Health Checks:
  - Backend /health endpoint
  - Compose healthcheck with curl
- Logs:
  - Docker logs for backend and frontend
  - Configure FastAPI log level via LOG_LEVEL
- Metrics (optional next step):
  - Expose Prometheus metrics (future).
  - OpenTelemetry tracing hooks (future).
- Uptime:
  - Use external uptime monitor to ping /health and frontend /.
- Validation of health and readiness:
  - After up -d, wait for service health:
    - docker compose -f docker/docker-compose.prod.yml ps
    - curl http://<host>:8000/health → {"status":"healthy"}
  - Frontend readiness check:
    - curl -I http://<host>:3000 should return HTTP/1.1 200
  - Add these to your post-deploy checklist execution record.
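Because containers take a few seconds to become healthy after up -d, the readiness checks above are easier to script with a small retry loop. The helper below is a generic sketch (retry_until_ok is a hypothetical name); the attempt count and delay in the example are arbitrary, and your-host is a placeholder.

```shell
# Sketch: re-run a command until it succeeds or the attempts run out.
# Usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    # Success on any attempt ends the loop immediately.
    "$@" >/dev/null 2>&1 && return 0
    # Give up once the last allowed attempt has failed.
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (curl -f makes HTTP errors count as failures):
#   retry_until_ok 30 2 curl -fsS http://your-host:8000/health
#   retry_until_ok 30 2 curl -fsSI http://your-host:3000
```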
Backups and Recovery
- Database (Postgres):
  - Nightly pg_dump to a local volume (cron outside Compose or a simple backup sidecar). Example cron on host (in a crontab, % must be escaped as \%; docker exec runs without -t, since allocating a pseudo-TTY can add carriage returns to the dump):
    - 0 2 * * * docker exec eqms-db pg_dump -U eqms_user eqms_db > /backups/eqms_$(date +\%F).sql
  - Retain at least 7 days; periodically test restores on a separate environment.
- Documents:
  - The backend_uploads volume contains user documents. Include it in VM-level backups or rsync to a secondary disk.
- Chroma:
  - The backend_chroma volume is optional to back up: it can be rebuilt from documents if needed, but a backup makes recovery faster.
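The 7-day retention mentioned for database dumps needs something to delete older files. One possible sketch follows, assuming the eqms_*.sql naming from the cron example and a find that supports -maxdepth and -delete (GNU and BSD find do); prune_old_backups is a hypothetical helper.

```shell
# Sketch: delete SQL dumps older than a retention window (default 7 days).
prune_old_backups() {
  backup_dir=$1
  keep_days=${2:-7}
  # -mtime +N matches files whose age, rounded down to whole days, exceeds N.
  # -print lists each file as it is removed, which is useful in cron mail or logs.
  find "$backup_dir" -maxdepth 1 -type f -name 'eqms_*.sql' \
    -mtime +"$keep_days" -print -delete
}

# Example:
#   prune_old_backups /backups 7
```

Running it right after the nightly dump keeps the backup directory from growing without bound.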
Recovery Steps:
- Stop services
- Restore database from latest dump
- Restore uploads and chroma volumes
- Start services and run integrity checks (application-level validation)
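The database part of these recovery steps could look roughly like the sketch below. It assumes a plain-SQL dump produced by pg_dump as in the cron example, the eqms-db container name, and a target database that is empty (a dump replayed over existing tables will report conflicts). restore_db is a hypothetical helper; restoring the uploads and Chroma volumes is not covered here.

```shell
# Sketch: replay a plain-SQL dump into the database container.
restore_db() {
  dump_file=$1
  # Refuse to continue if the dump is missing or zero bytes.
  if [ ! -s "$dump_file" ]; then
    echo "dump file missing or empty: $dump_file" >&2
    return 1
  fi
  # -i keeps stdin open so psql can read the dump from the redirect.
  docker exec -i eqms-db psql -U eqms_user -d eqms_db < "$dump_file"
}

# Example sequence (the date in the file name is illustrative):
#   docker compose -f docker/docker-compose.prod.yml stop backend frontend
#   restore_db /backups/eqms_2025-10-24.sql
#   docker compose -f docker/docker-compose.prod.yml --env-file docker/.env.prod up -d
```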
Performance Tuning
- Backend:
  - BACKEND_WORKERS: Set based on CPU cores (e.g., 2-4).
  - DB pool settings (if configured in SQLAlchemy): pool_size, max_overflow.
  - LOG_LEVEL=info in production; avoid debug to reduce I/O.
- Frontend:
  - NODE_ENV=production
  - Next.js caching and image optimization (if used)
- Postgres:
  - For small pilot loads, defaults are acceptable.
- Environment variables reference:
  - BACKEND_WORKERS: integer worker count
  - LOG_LEVEL: info|warning|error
  - NODE_ENV: production
  - BACKEND_CORS_ORIGINS: comma-separated origins
  - DATABASE_URL: postgresql connection string
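As an illustration of "set based on CPU cores (e.g., 2-4)", the sketch below clamps the host's core count into that range. The clamp itself is an assumed heuristic, not a documented rule of the backend, and suggest_workers is a hypothetical helper.

```shell
# Sketch: derive a BACKEND_WORKERS value from the core count, clamped to 2..4.
suggest_workers() {
  cores=$1
  if [ "$cores" -lt 2 ]; then
    echo 2
  elif [ "$cores" -gt 4 ]; then
    echo 4
  else
    echo "$cores"
  fi
}

# getconf reports online CPUs on most Linux and macOS hosts; fall back to 2 if unavailable.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)
BACKEND_WORKERS=$(suggest_workers "$cores")
echo "BACKEND_WORKERS=$BACKEND_WORKERS"
```

Treat the result as a starting point and adjust after observing real load on the 2 CPU pilot VM.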
Validation Procedures
Pre-deploy checklist (operations-checklists/pre-deploy.md):
- Env files present and validated (no placeholders).
- Volumes created.
- Secrets set.
- Ports available.
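The "no placeholders" item in the pre-deploy checklist can be partly automated. The sketch below searches an env file for the placeholder styles that appear in this guide (<SENDGRID_API_KEY>, your-frontend-domain, yourdomain, example.com); the pattern list is an assumption and should be extended to match your own templates. find_placeholders is a hypothetical helper.

```shell
# Sketch: flag env values that still look like template placeholders.
# Prints matching lines with line numbers; exit status 0 means a placeholder was found.
find_placeholders() {
  grep -nE '<[A-Z_]+>|your-|yourdomain|example\.com' "$1"
}

# Example:
#   if find_placeholders docker/.env.prod; then
#     echo "replace the placeholders above before deploying" >&2
#   fi
```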
Post-deploy validation (operations-checklists/post-deploy-validation.md):
- Backend /health 200 OK.
- Login success.
- Document upload/download success.
- Training, deviations, CAPA, audit endpoints basic sanity checks.
Backup-restore test (operations-checklists/backup-restore.md):
- Execute a test dump and simulate restore to staging.
- Verify user login and minimal workflow success.
Troubleshooting
See docs/troubleshooting.md for common issues and resolutions.
Next Steps
- Add reverse proxy and TLS termination.
- Migrate documents to S3 or GCP bucket with server-side encryption and IAM scoped credentials.
- Add centralized logging and metrics.