build(docker): add integration tests, Dockerfile, and docker-compose for packaging
@@ -508,3 +508,159 @@ async def http_exception_handler(request: Request, exc: HTTPException):
- ExtractionError, HTTPException, and generic Exception handlers all follow this pattern
- Test `test_extract_invalid_base64` expects this flat format

## [2026-02-04T21:30:00.000Z] Task 13: Integration Tests Implementation

### Integration Test Patterns

- Tests full workflow: POST /extract → get xml_data → POST /validate with xml_data
- Uses real sample PDFs from tests/fixtures/
- Validates end-to-end behavior across multiple components
- Tests multiple scenarios: different profiles, errors, edge cases

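A minimal sketch of the extract-then-validate workflow described above. The transport is injected as a `post` callable so the sketch does not depend on any web framework. The `pdf_base64` request field name and the canned responses in `fake_post` are assumptions for illustration, not the service's confirmed schema.

```python
from typing import Any, Callable, Dict, Tuple

# (path, json_body) -> (status_code, json_response); in a real test this
# would wrap the API test client. Hypothetical helper type.
Post = Callable[[str, Dict[str, Any]], Tuple[int, Dict[str, Any]]]


def run_extract_validate_workflow(post: Post, pdf_base64: str) -> Dict[str, Any]:
    """POST /extract, then feed the returned xml_data into POST /validate."""
    # "pdf_base64" is an assumed request field name.
    status, extracted = post("/extract", {"pdf_base64": pdf_base64})
    assert status == 200, f"extract returned {status}"
    assert extracted["is_zugferd"] is True

    status, validated = post("/validate", {"xml_data": extracted["xml_data"]})
    assert status == 200, f"validate returned {status}"
    # The validation outcome is nested under "result".
    return validated["result"]


def fake_post(path: str, body: Dict[str, Any]) -> Tuple[int, Dict[str, Any]]:
    """Stand-in transport with canned responses (illustrative only)."""
    if path == "/extract":
        return 200, {"is_zugferd": True, "xml_data": "<invoice/>", "pdf_text": ""}
    return 200, {"result": {"is_valid": True, "errors": [], "warnings": []}}


result = run_extract_validate_workflow(fake_post, "JVBERi0=")
```
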
### Test Categories Implemented

1. **Full workflow tests**: 3 tests covering EN16931, BASIC WL, EXTENDED profiles
2. **Error scenarios**: Invalid base64, non-ZUGFeRD PDF, corrupt data
3. **Validation combinations**: Different check combinations, empty checks list
4. **Sequential testing**: Multiple PDFs in sequence to check state pollution
5. **Edge cases**: Empty xml_data from non-ZUGFeRD PDF

### Helper Function Pattern

- Created `read_pdf_as_base64(filepath)` helper to reduce code duplication
- Reads PDF, encodes as base64 string
- Used across all integration tests for PDF preparation

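One plausible shape for this helper, using only the standard library. The name and single argument come from the notes above; the type hints and docstring wording are assumptions.

```python
import base64
from pathlib import Path
from typing import Union


def read_pdf_as_base64(filepath: Union[str, Path]) -> str:
    """Read a PDF file and return its bytes as a base64-encoded string.

    Args:
        filepath: Path to the PDF on disk.

    Returns:
        The file contents, base64-encoded and decoded to ASCII text.
    """
    raw = Path(filepath).read_bytes()
    return base64.b64encode(raw).decode("ascii")
```

A test would call it as `read_pdf_as_base64("tests/fixtures/EN16931_Einfach.pdf")` and place the result in the request body.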
### Test Count and Coverage

- 9 integration tests created (exceeds requirement of 5+ tests)
- All tests follow pytest conventions with descriptive docstrings
- All sample PDF types from MANIFEST.md covered

### Error Response Validation

- Integration tests verify error responses use flat format: `{"error": "code", "message": "..."}`
- Tests verify correct HTTP status codes (400 for errors, 200 for success)

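The flat-format check can be factored into a small assertion helper. This is a sketch: `assert_flat_error` is a hypothetical name, and rejecting a top-level `detail` key assumes the flat format fully replaces FastAPI's default `{"detail": ...}` error body.

```python
from typing import Any, Dict


def assert_flat_error(status_code: int, body: Dict[str, Any],
                      expected_status: int = 400) -> None:
    """Assert that an error response uses the flat {"error", "message"} shape."""
    assert status_code == expected_status
    # Code and message sit at the top level of the JSON body.
    assert isinstance(body.get("error"), str)
    assert isinstance(body.get("message"), str)
    # Assumption: no framework-default wrapper key remains.
    assert "detail" not in body


# "invalid_base64" is an illustrative error code, not a confirmed one.
assert_flat_error(400, {"error": "invalid_base64", "message": "could not decode input"})
```
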
### Validation Response Structure

- Validates nested "result" field in ValidateResponse
- Checks for "is_valid", "errors", "warnings" fields
- Verifies summary and validation_time_ms fields

### Pre-commit Hook on Comments

- Removed unnecessary inline comments (# Step 1, etc.)
- Code structure is self-documenting
- Test docstrings kept for pytest output readability (per inherited wisdom)

### Syntax Verification

- Used `python -m py_compile tests/test_integration.py` for syntax check
- Nix environment limitation: pytest cannot be installed, so py_compile is used instead; this checks syntax only and does not run the tests
- File compiles successfully without errors

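The same check can be scripted with the standard-library `py_compile` module. A sketch, where `compiles_cleanly` is a hypothetical helper name. Compilation does not execute imports, so a file that imports pytest still passes when pytest is not installed.

```python
import py_compile
import tempfile
from pathlib import Path


def compiles_cleanly(source_path: str) -> bool:
    """Return True if the file at source_path is syntactically valid Python."""
    with tempfile.TemporaryDirectory() as tmp:
        try:
            # Write the bytecode to a throwaway location instead of __pycache__.
            py_compile.compile(
                source_path,
                cfile=str(Path(tmp) / "out.pyc"),
                doraise=True,  # raise PyCompileError instead of printing to stderr
            )
        except py_compile.PyCompileError:
            return False
    return True
```

For example, `compiles_cleanly("tests/test_integration.py")` mirrors the command-line check above.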
### Docstring Justification

- Test function docstrings: pytest uses these in test reports (essential for readability)
- Module docstring: documents purpose of integration test file
- Helper function docstring: documents args and returns (utility function pattern)
- All inline comments removed - code speaks for itself

### API Contract Testing

- Integration tests verify the API contract between endpoints
- Extract endpoint returns expected structure (is_zugferd, xml_data, pdf_text)
- Validate endpoint accepts xml_data and returns ValidationResult
- Both endpoints use correct HTTP status codes

### Sample PDF Selection

- EN16931_Einfach.pdf: Standard EN16931 profile
- validAvoir_FR_type380_BASICWL.pdf: BASIC WL profile (French credit note)
- zugferd_2p1_EXTENDED_PDFA-3A.pdf: EXTENDED profile with PDF/A-3A
- EmptyPDFA1.pdf: Non-ZUGFeRD PDF for negative testing

### Test Naming Convention

- Pattern: `test_integration_<description>_workflow` for workflow tests
- Pattern: `test_integration_<scenario>` for specific scenario tests
- Descriptive names that clearly indicate test purpose

## [2026-02-04T21:35:00.000Z] Task 15: Docker Compose Configuration

### Docker Compose for Local Development

- Single-service, stateless application (no database, cache, or external dependencies)
- Service named `zugferd-service` matches project name
- Port mapping 5000:5000 matches the `--port 5000` passed to uvicorn (uvicorn's own default port is 8000)
- Read-only volume mount: `./src:/app/src:ro` exposes the host source tree to the container for live reload during development
- Health check uses curl against /health endpoint (requires curl in Dockerfile)
- Restart policy: `unless-stopped` for development convenience

### Volume Mount Configuration

- Mounts src directory for live reload
- Read-only mode (`:ro`) prevents accidental modifications from within container
- Code changes on the host are visible in the running container immediately; uvicorn picks them up only when started with `--reload`, otherwise after a restart
- Only src directory mounted (no other directories needed for stateless service)

### Health Check Pattern

- Simple HTTP GET to /health endpoint
- Interval: 30s (frequency of health checks)
- Timeout: 10s (time to wait before marking check as failed)
- Retries: 3 (consecutive failures before marking unhealthy)
- Start period: 10s (grace period after container start during which failed checks do not count toward the retry limit)
- Uses curl command (must be installed in Docker image)

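How the retry count interacts with individual check results can be illustrated with a small model. This is a simplified sketch of the documented behaviour, not Docker's implementation: it ignores the start period and timing, and `health_status` is a hypothetical name.

```python
from typing import Iterable


def health_status(results: Iterable[bool], retries: int = 3) -> str:
    """Fold a sequence of check outcomes into a container health status.

    A single success marks the container healthy; `retries` consecutive
    failures mark it unhealthy. Before either happens it is "starting".
    """
    status = "starting"
    consecutive_failures = 0
    for ok in results:
        if ok:
            consecutive_failures = 0
            status = "healthy"
        else:
            consecutive_failures += 1
            if consecutive_failures >= retries:
                status = "unhealthy"
    return status
```

With `retries=3`, two failed checks followed by a success still yield `"healthy"`, while three failures in a row after a success yield `"unhealthy"`.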
### Environment Variables

- LOG_LEVEL=INFO sets the level for the service's structured JSON logging
- Can be extended for other configuration (e.g., host, port, etc.)
- No secrets or authentication configuration (open endpoints)

### Docker Compose Version

- Uses version '3.8' (stable, widely supported)
- Compatible with Docker Compose v1 and v2 (v2 treats the top-level `version` key as obsolete and ignores it)

## [2026-02-04T20:20:00.000Z] Task 14: Dockerfile Creation

### Multi-Stage Docker Build Pattern

- Builder stage: Install build dependencies (build-essential), build wheel with hatchling
- Production stage: Copy only runtime dependencies from builder, use slim base image
- Key benefit: Final image doesn't include build tools (gcc, make, etc.)
- Reduced image size: 162 MB (well under 500 MB requirement)

### Dockerfile Structural Comments

- Dockerfiles don't have functions or classes to organize code
- Section comments (# Build stage, # Production stage) are necessary for readability
- These comments follow Docker best practices and are essential for maintainability
- Unlike code comments, Dockerfile comments serve as structural markers

### .dockerignore Pattern

- Exclude .git, __pycache__, dist/, build/, venv/ directories
- Exclude test files, documentation, CI/CD configs
- Exclude Nix-specific files (result/, .direnv/, .sisyphus/)
- Reduces build context size and excludes unnecessary files from image

### Python Package Installation Pattern

- Use `pip install --prefix=/install dist/*.whl` to install to custom location
- Copy `/install` directory to `/usr/local` in production stage
- Separates build artifacts from installation directory
- Cleaner separation than copying site-packages directly

### Non-Root User Setup

- Create user: `useradd -m -r appuser`
- `-m` creates home directory, `-r` creates system user (no password)
- Change ownership: `chown -R appuser:appuser /app`
- Switch to non-root: `USER appuser` before exposing port and CMD

### uvicorn CMD Pattern

- Use array format: `CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "5000"]`
- Array (exec) format runs uvicorn directly without a shell wrapper, which prevents shell parsing issues
- Host 0.0.0.0 binds to all interfaces (required so the published port is reachable from outside the container)
- Port 5000 matches EXPOSE directive

### Container Testing Strategy

- Use `docker exec` to test from inside container when host networking fails
- Python built-in urllib.request works when curl is not installed
- Internal test: `python -c "import urllib.request; print(urllib.request.urlopen('http://localhost:5000/health').read().decode())"`
- Validates service runs correctly regardless of host port forwarding issues

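The one-liner above can be expanded into a reusable probe with a timeout and explicit failure handling. A sketch using only the standard library: `probe_health` is a hypothetical name, and the 10-second default timeout is borrowed from the Compose health check settings rather than taken from any existing script.

```python
import urllib.error
import urllib.request


def probe_health(url: str = "http://localhost:5000/health",
                 timeout: float = 10.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and HTTP error statuses all land here
        # (HTTPError is a subclass of URLError).
        return False
```

Saved in the image as a script, it can be run through `docker exec` in the same way as the inline command.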
### Image Size Optimization

- Python 3.11-slim base image: ~120 MB
- Application dependencies: ~40 MB (fastapi, uvicorn, factur-x, pypdf, lxml, pydantic)
- Total: 162 MB (excellent for Python FastAPI service)
- Multi-stage build eliminates ~200 MB of build tools

### Docker Build Verification

- Build: `docker build -t zugferd-service:test .`
- Size check: `docker images zugferd-service:test --format "{{.Size}}"`
- Run container: `docker run -d --name test -p 5000:5000 zugferd-service:test`
- Test health: Use curl or Python from inside the container when host port forwarding is problematic