Unified Troubleshooting Guide - InnoQualis EQMS

This comprehensive troubleshooting guide consolidates all issue resolution steps for the InnoQualis Electronic Quality Management System (EQMS). It covers deployment issues, development problems, testing failures, authentication issues, and production troubleshooting.

Last Updated: October 24, 2025
Version: Phase 8 In Progress (Documentation Consolidation Complete)
Status: Production Ready

Quick Reference

Containers failing healthchecks

Symptoms:

docker ps shows unhealthy state for backend or frontend
docker compose logs show repeated healthcheck failures

Checks:

Logs: docker compose -f docker/docker-compose.prod.yml logs -f backend
Health endpoint: curl http://localhost:8000/health
Ports open: ensure ports 8000 (backend) and 3000 (frontend) are reachable locally

Fixes:

Verify env file path passed via --env-file docker/.env.prod
Rebuild images: docker compose -f docker/docker-compose.prod.yml --env-file docker/.env.prod build --no-cache
Ensure db is healthy first (backend depends_on db with condition: service_healthy)
Increase healthcheck retries/interval if the VM is slow to start
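
The health endpoint check above can also be scripted to see how long the backend takes to come up, which helps when choosing healthcheck retries and interval. This is a minimal sketch using only the standard library; the function name, defaults, and the injectable fetch/sleep parameters are illustrative and not part of the EQMS codebase.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url, retries=10, interval=3.0, fetch=None, sleep=time.sleep):
    """Poll `url` until it answers HTTP 200 or the retries run out.

    Returns the attempt number that succeeded, or None if none did.
    `fetch` and `sleep` are injectable so the loop can be exercised
    without a running backend.
    """
    def default_fetch(target):
        with urllib.request.urlopen(target, timeout=5) as response:
            return response.status

    fetch = fetch or default_fetch
    for attempt in range(1, retries + 1):
        try:
            if fetch(url) == 200:
                return attempt
        except (urllib.error.URLError, OSError) as exc:
            # Connection refused, timeouts, and non-2xx responses land here.
            print(f"attempt {attempt}/{retries}: {exc}")
        if attempt < retries:
            sleep(interval)
    return None


# Example (requires the backend to be running):
# wait_for_health("http://localhost:8000/health")
```

If the backend only becomes healthy after several attempts, the compose healthcheck needs at least that many retries at the same interval.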

Backend cannot connect to database

Symptoms:

Backend logs show connection refused or authentication errors
HTTP 500 on API endpoints

Checks:

DATABASE_URL matches POSTGRES_* settings in env
DB health: docker compose logs db; pg_isready in healthcheck should pass
psql test inside container:
- docker exec -it eqms-db psql -U eqms_user -d eqms_db -c "select 1;"

Fixes:

Align POSTGRES_DB/USER/PASSWORD and DATABASE_URL
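Remove the stale postgres_data volume if credentials were changed intentionally (this deletes all database data; the Postgres image applies POSTGRES_* only when it initializes an empty data directory):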
- docker compose down
- docker volume rm docker_postgres_data (exact name may differ: use docker volume ls)
- docker compose up -d
Avoid accidental whitespace or URL encoding issues in DATABASE_URL
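
The alignment check above can be sketched as a small script that compares DATABASE_URL against the POSTGRES_* values. It assumes DATABASE_URL is a standard URL of the form scheme://user:password@host:port/dbname; the function name is hypothetical, and the password in the usage example is made up.

```python
from urllib.parse import unquote, urlsplit


def database_url_mismatches(env):
    """Return a list of mismatches between DATABASE_URL and POSTGRES_*.

    `env` is a dict of environment variables (for example os.environ).
    """
    raw_url = env.get("DATABASE_URL", "")
    problems = []
    if raw_url != raw_url.strip():
        problems.append("DATABASE_URL has leading or trailing whitespace")

    parts = urlsplit(raw_url.strip())
    # Percent-encoded characters (e.g. %40 for @) are decoded before comparing.
    found = {
        "POSTGRES_USER": unquote(parts.username or ""),
        "POSTGRES_PASSWORD": unquote(parts.password or ""),
        "POSTGRES_DB": unquote(parts.path.lstrip("/")),
    }
    for name, value in found.items():
        if value != env.get(name, ""):
            problems.append(f"{name} does not match the value in DATABASE_URL")
    return problems


# Example with made-up credentials:
# database_url_mismatches({
#     "DATABASE_URL": "postgresql://eqms_user:p%40ss@db:5432/eqms_db",
#     "POSTGRES_USER": "eqms_user",
#     "POSTGRES_PASSWORD": "p@ss",
#     "POSTGRES_DB": "eqms_db",
# })
```

A password containing characters such as @, /, or # must be percent-encoded inside DATABASE_URL but written literally in POSTGRES_PASSWORD, which is a common source of authentication errors.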

CORS errors in browser

Symptoms:

Browser console: Access-Control-Allow-Origin missing
Requests blocked

Checks:

BACKEND_CORS_ORIGINS includes your frontend origin (http://host:3000 or https://your-domain)
FastAPI CORS middleware configured (see backend/app/main.py or equivalent)

Fixes:

Update BACKEND_CORS_ORIGINS in env file (comma-separated if multiple)
Restart backend: docker compose up -d backend
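
CORS origins are compared as exact strings (scheme, host, and port), so a trailing slash or a different hostname is enough to block a request. The sketch below checks a BACKEND_CORS_ORIGINS value against the origin shown in the browser console. It assumes the backend splits the value on commas and trims whitespace; the function name is hypothetical.

```python
def diagnose_cors(raw, origin):
    """Explain whether `origin` matches an entry in a comma-separated list.

    `raw` is the BACKEND_CORS_ORIGINS value; `origin` is what the browser
    sends in the Origin header, e.g. "http://host:3000".
    """
    entries = [item.strip() for item in raw.split(",") if item.strip()]
    if origin in entries:
        return "ok"
    # Browsers never send a trailing slash in the Origin header.
    if origin in [entry.rstrip("/") for entry in entries]:
        return "trailing slash in configured origin"
    return "origin not configured"


# Example:
# diagnose_cors("http://host:3000, https://your-domain", "https://your-domain")
```

Note that http://localhost:3000 and http://127.0.0.1:3000 are different origins and each needs its own entry.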

Email not sending (SendGrid)

Symptoms:

No email received; backend logs show SMTP auth or TLS errors

Checks:

SMTP_HOST=smtp.sendgrid.net, SMTP_PORT=587, SMTP_USER=apikey, SMTP_PASSWORD set to SendGrid API key
SMTP_FROM domain is authorized in SendGrid (SPF/DKIM for non-sandbox)

Fixes:

Correct credentials and from address
Test outbound connectivity from VM: nc -vz smtp.sendgrid.net 587
Use a sandbox key in staging to avoid production sends
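
To separate credential problems from TLS or network problems, the SMTP settings can be tested directly without sending a message. This is a sketch using the standard smtplib module; the function name and the smtp_factory parameter are illustrative, and the values passed in are assumed to come from the SMTP_* variables listed above.

```python
import smtplib
import ssl


def check_smtp_login(host, port, user, password, smtp_factory=smtplib.SMTP):
    """Open an SMTP session, upgrade it with STARTTLS, and try to log in.

    Returns "ok" or a short description of the step that failed.
    No message is sent.
    """
    try:
        with smtp_factory(host, port, timeout=10) as client:
            client.starttls(context=ssl.create_default_context())
            client.login(user, password)
        return "ok"
    except smtplib.SMTPAuthenticationError:
        return "authentication failed: check SMTP_USER and SMTP_PASSWORD"
    except ssl.SSLError as exc:
        return f"TLS error: {exc}"
    except (smtplib.SMTPException, OSError) as exc:
        return f"connection or protocol error: {exc}"


# Example (needs outbound access to port 587):
# import os
# check_smtp_login("smtp.sendgrid.net", 587, "apikey", os.environ["SMTP_PASSWORD"])
```

An "authentication failed" result points at the API key, while a connection error points at firewall or DNS issues on the VM.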

File uploads/downloads failing

Symptoms:

500 on upload; 404 or permission denied on download

Checks:

Volume mounted: backend_uploads volume mapped to /app/uploads in backend
Directory permissions within container (writable by app user)
Disk space on VM: df -h

Fixes:

Ensure volume exists and container user has write permission
Recreate backend container to pick up correct mount:
- docker compose up -d --force-recreate backend
Verify file size and type limits (if enforced)
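
The mount, permission, and disk-space checks above can be combined into one script run inside the backend container. This is a minimal sketch; the function name and the 100 MB free-space threshold are arbitrary assumptions, not EQMS settings.

```python
import os
import shutil
import tempfile


def check_upload_dir(path, min_free_bytes=100 * 1024 * 1024):
    """Return a list of problems found with the upload directory at `path`."""
    if not os.path.isdir(path):
        return [f"{path} does not exist or is not a directory (volume not mounted?)"]
    problems = []
    try:
        # Creating a real temp file catches read-only mounts and ownership
        # problems; the file is removed again when the block exits.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError as exc:
        problems.append(f"{path} is not writable: {exc}")
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free on the filesystem holding {path}")
    return problems


# Example, run as the application user inside the container:
# check_upload_dir("/app/uploads")
```

Running it as the same user the application runs as matters, since root inside the container can usually write where the app user cannot.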

High CPU or memory usage

Symptoms:

Containers OOMKilled or slow responses

Checks:

docker stats
Log verbosity (LOG_LEVEL) set too high?
BACKEND_WORKERS too high for small VM?

Fixes:

Reduce BACKEND_WORKERS to 2 on small VM
Set LOG_LEVEL=info
Add Compose resource limits (deploy.resources.limits, or mem_limit/cpu_shares if supported)

Login/authentication failures

Symptoms:

401 Unauthorized despite correct credentials

Checks:

SECRET_KEY set and consistent across restarts
Token not expired (client clock skew)
Role assignment for user

Fixes:

Rotate SECRET_KEY only during planned downtime; rotation invalidates existing tokens and forces all users to log in again
Check user record and role
Review backend logs for specific cause
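
If the backend issues JWT access tokens (an assumption; this guide does not state the token format), the expiry check above can be done by decoding the token payload locally. The sketch below reads the exp claim without verifying the signature, so it is for diagnosis only. The function name is hypothetical.

```python
import base64
import json
import time


def token_seconds_remaining(token, now=None):
    """Decode a JWT payload WITHOUT verifying its signature and return the
    seconds left until `exp` (negative if already expired).

    Returns None if the token carries no `exp` claim.
    """
    payload_b64 = token.split(".")[1]
    # JWTs strip base64 padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    if "exp" not in payload:
        return None
    current = time.time() if now is None else now
    return payload["exp"] - current


# Example: paste a token copied from the browser's network tab.
# token_seconds_remaining("<header>.<payload>.<signature>")
```

A negative result on a freshly issued token suggests that the client and server clocks disagree.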

AI assistant errors (OpenAI)

Symptoms:

500 AI service error or timeout

Checks:

OPENAI_API_KEY defined
Outbound internet access from VM
Rate limits exceeded

Fixes:

Set valid API key or leave empty to disable AI features
Add retry/backoff client-side
Consider disabling AI during pilot if not required
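
The retry/backoff fix can be sketched as a generic wrapper around whatever function calls the AI endpoint. This is an illustrative pattern (exponential backoff with full jitter), not the project's implementation; the function name, defaults, and parameters are assumptions.

```python
import random
import time


def call_with_backoff(func, max_attempts=4, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call `func()` and retry failures with exponential backoff plus jitter.

    Before retry n the wait is a random value between 0 and
    min(max_delay, base_delay * 2**(n-1)) seconds.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))  # jitter spreads out retries


# Example, where ask_assistant is a hypothetical function that calls the
# backend AI endpoint and raises TimeoutError on timeout:
# call_with_backoff(ask_assistant, retry_on=(TimeoutError,))
```

Restricting retry_on to timeouts and rate-limit errors avoids retrying requests that can never succeed, such as those rejected for an invalid API key.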

Volumes/permissions issues

Symptoms:

Backend cannot write uploads or chroma db

Checks:

Volume mounts defined in docker/docker-compose.prod.yml
Container user permissions on /app/uploads and /app/chroma_db

Fixes:

Adjust permissions inside container:
- docker exec -it eqms-backend sh -c "mkdir -p /app/uploads /app/chroma_db && chmod -R 775 /app/uploads /app/chroma_db"
Ensure host filesystem isn’t mounted read-only

Backup/restore problems

Symptoms:

pg_dump fails, file empty, or restore errors

Checks:

Cron job user has write permission on the dump location, and the path used in the cron entry is correct
pg_dump available inside container (official Postgres image includes it)
Database size and disk space

Fixes:

Write dumps to a dedicated directory with sufficient space
Test restore on staging before relying on backups
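
A quick sanity check can catch empty or truncated dumps before a restore is ever attempted. The sketch below applies only to plain-format (SQL text) dumps, which pg_dump ends with a "PostgreSQL database dump complete" comment; custom-format dumps are binary and are better inspected with pg_restore --list. The function name is hypothetical, and the check is a heuristic that does not replace a real restore test.

```python
import os


def dump_looks_complete(path, marker=b"PostgreSQL database dump complete"):
    """Heuristic check for a plain-text (SQL) pg_dump file.

    Returns False if the file is missing or empty, or if the completion
    comment written at the end of plain-format dumps is absent.
    """
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as handle:
        # Only the tail needs reading; the marker sits in the last few lines.
        handle.seek(max(0, os.path.getsize(path) - 4096))
        return marker in handle.read()


# Example (the path is a placeholder for wherever the cron job writes dumps):
# dump_looks_complete("/path/to/dumps/latest.sql")
```

A missing marker usually means pg_dump was interrupted, for example by a full disk.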

Frontend cannot reach backend

Symptoms:

502 responses or "Failed to fetch" errors on frontend pages

Checks:

Frontend points to backend via relative /api or direct host:port
In Next.js, ensure rewrites or environment are correct in production
Compose: frontend depends_on backend with condition: service_healthy

Fixes:

If hosting behind a reverse proxy later, update NEXT_PUBLIC_API_BASE_URL and CORS accordingly
For pilot, ensure frontend requests go to http://<host>:8000
