4.4 KiB
4.4 KiB
Draft: ZUGFeRD-Service Implementation
Requirements (confirmed)
Core Functionality
- Purpose: Python REST API for ZUGFeRD/Factur-X invoice extraction and validation
- Framework: FastAPI (preferred by user)
- Runtime: Python 3.11+
- Deployment: Docker container on NixOS server + native Nix package
API Endpoints
GET /health- Health check endpointPOST /extract- PDF extraction (accepts base64-encoded PDF)POST /validate- Invoice validation (pflichtfelder, betraege, ustid, pdf_abgleich)
Key Dependencies
factur-x>=2.5- ZUGFeRD/Factur-X extractionpypdf>=4.0.0- PDF text extractionfastapi>=0.109.0- API frameworkuvicorn>=0.27.0- ASGI serverpydantic>=2.5.0- Data modelslxml>=5.0.0- XML parsingpython-multipart>=0.0.6- File uploads
Project Structure (user-specified)
zugferd-service/
├── Dockerfile
├── requirements.txt
├── README.md
├── src/
│ ├── __init__.py
│ ├── main.py # FastAPI App + Endpoints
│ ├── extractor.py # ZUGFeRD/PDF Extraktion
│ ├── validator.py # Validierungslogik
│ ├── pdf_parser.py # PDF-Text-Parsing für Abgleich
│ ├── models.py # Pydantic Models
│ └── utils.py # Hilfsfunktionen
├── tests/
│ ├── __init__.py
│ ├── test_extractor.py
│ ├── test_validator.py
│ └── fixtures/
│ ├── sample_zugferd.pdf
│ └── sample_no_zugferd.pdf
└── docker-compose.yml
Research Findings
Nix Packaging (from librarian research)
- Use
buildPythonApplicationfor standalone service pyproject = truewith hatchling/setuptoolspythonRelaxDeps = truefor dependency flexibility- mem0 example pattern: custom server script via
postInstall - Consider flake.nix for modern Nix workflow
factur-x Library (from librarian research)
get_xml_from_pdf()- Core extraction functionget_level()/get_flavor()- Profile detection- Namespaces: rsm, ram, udt for UN/CEFACT CII format
- Profile levels: minimum, basicwl, basic, en16931, extended
UN/ECE Unit Codes
- C62 = Piece, KGM = Kilogram, H87 = Piece (alt)
- Need comprehensive mapping dictionary
Technical Decisions
Python Tooling
- PENDING: Use pyproject.toml (modern) or requirements.txt (legacy)?
- PENDING: Build system: setuptools, hatchling, or poetry-core?
Nix Approach
- PENDING: Flake-based or traditional Nix expressions?
- PENDING: Include NixOS service module?
Testing Strategy
- PENDING: TDD or tests-after?
- PENDING: Test framework: pytest (standard choice)
Scope Boundaries
INCLUDE
- All 3 API endpoints as specified
- All validation checks (pflichtfelder, betraege, ustid, pdf_abgleich)
- Docker multi-stage build
- Nix packaging
- Basic test suite
- README documentation
EXCLUDE
- Online USt-ID validation (only format check)
- Database/persistence (stateless service)
- Authentication/authorization
- Rate limiting
- Metrics/tracing
Open Questions (RESOLVED)
- ✅ Python project structure: pyproject.toml with hatchling
- ✅ Build system: hatchling (modern, Nix-friendly)
- ✅ Nix approach: Flake-based
- ✅ Testing: TDD (test-first) with pytest
- ✅ Sample PDFs: Source from official ZUGFeRD repositories
Metis Gap Analysis (Reviewed)
Gaps Classified as MINOR (Auto-Resolved)
- UN/ECE unit codes: Start with common codes (C62, KGM, H87, MTR, LTR, etc.), expand as needed
- Tolerance: Hardcode 0.01 EUR as specified
- Validation scope: Check required fields exist for declared profile
- Error codes: Implement as specified in user's detailed spec
Gaps Classified as DEFAULTS APPLIED
- Authentication: OPEN (no auth mentioned in spec → stateless public API)
- ZUGFeRD profiles: ALL profiles supported (MINIMUM, BASIC, BASIC WL, EN16931, EXTENDED)
- Deployment: Container-based on NixOS (as per NixOS config section in spec)
- PDF text extraction: REQUIRED for pdf_abgleich check (explicitly in spec)
- File size limit: Handle as error for >10MB (spec mentions this edge case)
Guardrails (Must NOT Have)
- NO authentication middleware
- NO database/persistence
- NO caching layers
- NO rate limiting
- NO metrics endpoints (beyond /health)
- NO CLI interface
- NO web UI
- NO abstraction layers for "future extensibility"