feat(project): initialize ZUGFeRD service with pyproject.toml and directory structure

m3tm3re
2026-02-04 19:19:44 +01:00
parent 48ad42aa19
commit 0db2482bf2
19 changed files with 2114 additions and 0 deletions

42
.gitignore vendored Normal file

@@ -0,0 +1,42 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual environments
venv/
ENV/
env/
# IDE
.vscode/
.idea/
*.swp
*.swo
# Testing
.pytest_cache/
.coverage
htmlcov/
# Sisyphus internal
.sisyphus/boulder.json
.sisyphus/notepads/*/issues.md
.sisyphus/notepads/*/problems.md

8
.sisyphus/boulder.json Normal file

@@ -0,0 +1,8 @@
{
  "active_plan": "/home/m3tam3re/p/AZ/zugferd-service/.sisyphus/plans/zugferd-service.md",
  "started_at": "2026-02-04T18:12:44.865Z",
  "session_ids": [
    "ses_3d634d45bffeW59tJMBkcCfwyd"
  ],
  "plan_name": "zugferd-service"
}


@@ -0,0 +1,130 @@
# Draft: ZUGFeRD-Service Implementation
## Requirements (confirmed)
### Core Functionality
- **Purpose**: Python REST API for ZUGFeRD/Factur-X invoice extraction and validation
- **Framework**: FastAPI (preferred by user)
- **Runtime**: Python 3.11+
- **Deployment**: Docker container on NixOS server + native Nix package
### API Endpoints
1. `GET /health` - Health check endpoint
2. `POST /extract` - PDF extraction (accepts base64-encoded PDF)
3. `POST /validate` - Invoice validation (pflichtfelder, betraege, ustid, pdf_abgleich)
### Key Dependencies
- `factur-x>=2.5` - ZUGFeRD/Factur-X extraction
- `pypdf>=4.0.0` - PDF text extraction
- `fastapi>=0.109.0` - API framework
- `uvicorn>=0.27.0` - ASGI server
- `pydantic>=2.5.0` - Data models
- `lxml>=5.0.0` - XML parsing
- `python-multipart>=0.0.6` - File uploads
### Project Structure (user-specified)
```
zugferd-service/
├── Dockerfile
├── requirements.txt
├── README.md
├── src/
│ ├── __init__.py
│ ├── main.py # FastAPI App + Endpoints
│ ├── extractor.py # ZUGFeRD/PDF extraction
│ ├── validator.py # Validation logic
│ ├── pdf_parser.py # PDF text parsing for cross-checking
│ ├── models.py # Pydantic models
│ └── utils.py # Helper functions
├── tests/
│ ├── __init__.py
│ ├── test_extractor.py
│ ├── test_validator.py
│ └── fixtures/
│ ├── sample_zugferd.pdf
│ └── sample_no_zugferd.pdf
└── docker-compose.yml
```
## Research Findings
### Nix Packaging (from librarian research)
- Use `buildPythonApplication` for standalone service
- `pyproject = true` with hatchling/setuptools
- `pythonRelaxDeps = true` for dependency flexibility
- mem0 example pattern: custom server script via `postInstall`
- Consider flake.nix for modern Nix workflow
### factur-x Library (from librarian research)
- `get_xml_from_pdf()` - Core extraction function
- `get_level()` / `get_flavor()` - Profile detection
- Namespaces: rsm, ram, udt for UN/CEFACT CII format
- Profile levels: minimum, basicwl, basic, en16931, extended
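The three namespace prefixes above can be illustrated with the standard library alone. The following is a minimal sketch, not the service's extractor: it assumes the XML has already been pulled out of the PDF, and both `read_header_fields` and the trimmed `SAMPLE_XML` document are hypothetical names introduced here for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace URIs used by UN/CEFACT Cross Industry Invoice (CII) documents.
CII_NAMESPACES = {
    "rsm": "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100",
    "ram": "urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100",
    "udt": "urn:un:unece:uncefact:data:standard:UnqualifiedDataType:100",
}

# Hypothetical, heavily trimmed sample; a real invoice has many more elements.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<rsm:CrossIndustryInvoice
    xmlns:rsm="urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100"
    xmlns:ram="urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100"
    xmlns:udt="urn:un:unece:uncefact:data:standard:UnqualifiedDataType:100">
  <rsm:ExchangedDocumentContext>
    <ram:GuidelineSpecifiedDocumentContextParameter>
      <ram:ID>urn:cen.eu:en16931:2017</ram:ID>
    </ram:GuidelineSpecifiedDocumentContextParameter>
  </rsm:ExchangedDocumentContext>
  <rsm:ExchangedDocument>
    <ram:ID>RE-2026-0001</ram:ID>
    <ram:IssueDateTime>
      <udt:DateTimeString format="102">20260204</udt:DateTimeString>
    </ram:IssueDateTime>
  </rsm:ExchangedDocument>
</rsm:CrossIndustryInvoice>
"""


def read_header_fields(xml_bytes: bytes) -> dict[str, Optional[str]]:
    """Read a few header fields from a CII document (illustrative helper)."""
    root = ET.fromstring(xml_bytes)

    def text(path: str) -> Optional[str]:
        # findtext returns None when the element is missing.
        return root.findtext(path, namespaces=CII_NAMESPACES)

    return {
        "guideline_id": text(
            "rsm:ExchangedDocumentContext/"
            "ram:GuidelineSpecifiedDocumentContextParameter/ram:ID"
        ),
        "invoice_number": text("rsm:ExchangedDocument/ram:ID"),
        "issue_date": text(
            "rsm:ExchangedDocument/ram:IssueDateTime/udt:DateTimeString"
        ),
    }
```

The guideline ID is the element that declares the profile, so a field like this would likely feed the profile detection mentioned above.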
### UN/ECE Unit Codes
- C62 = One (commonly used for pieces), KGM = Kilogram, H87 = Piece
- Need comprehensive mapping dictionary
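A mapping dictionary of this kind could start out as small as the sketch below. The name `unit_label` and the English labels are assumptions made for illustration; `HUR` and `DAY` are additional UN/ECE Recommendation 20 codes not mentioned in these notes, included only to show how the table grows.

```python
# Starter mapping from UN/ECE Recommendation 20 unit codes to readable labels.
# Labels are illustrative; the service may prefer different wording.
UNIT_CODES: dict[str, str] = {
    "C62": "piece",     # Rec. 20 name is "one"; widely used for countable items
    "H87": "piece",
    "KGM": "kilogram",
    "MTR": "metre",
    "LTR": "litre",
    "HUR": "hour",      # not in the notes above, added as an example
    "DAY": "day",       # not in the notes above, added as an example
}


def unit_label(code: str) -> str:
    """Return a readable label, falling back to the raw code when unknown."""
    return UNIT_CODES.get(code.strip().upper(), code)
```

Falling back to the raw code keeps unknown units visible in the output instead of failing the whole extraction.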
## Technical Decisions
### Python Tooling
- **PENDING**: Use pyproject.toml (modern) or requirements.txt (legacy)?
- **PENDING**: Build system: setuptools, hatchling, or poetry-core?
### Nix Approach
- **PENDING**: Flake-based or traditional Nix expressions?
- **PENDING**: Include NixOS service module?
### Testing Strategy
- **PENDING**: TDD or tests-after?
- **PENDING**: Test framework: pytest (standard choice)
## Scope Boundaries
### INCLUDE
- All 3 API endpoints as specified
- All validation checks (pflichtfelder, betraege, ustid, pdf_abgleich)
- Docker multi-stage build
- Nix packaging
- Basic test suite
- README documentation
### EXCLUDE
- Online USt-ID validation (only format check)
- Database/persistence (stateless service)
- Authentication/authorization
- Rate limiting
- Metrics/tracing
## Open Questions (RESOLVED)
1. **Python project structure**: pyproject.toml with hatchling
2. **Build system**: hatchling (modern, Nix-friendly)
3. **Nix approach**: Flake-based
4. **Testing**: TDD (test-first) with pytest
5. **Sample PDFs**: Source from official ZUGFeRD repositories
## Metis Gap Analysis (Reviewed)
### Gaps Classified as MINOR (Auto-Resolved)
- **UN/ECE unit codes**: Start with common codes (C62, KGM, H87, MTR, LTR, etc.), expand as needed
- **Tolerance**: Hardcode 0.01 EUR as specified
- **Validation scope**: Check required fields exist for declared profile
- **Error codes**: Implement as specified in user's detailed spec
### Gaps Classified as DEFAULTS APPLIED
- **Authentication**: OPEN (no auth mentioned in spec → stateless public API)
- **ZUGFeRD profiles**: ALL profiles supported (MINIMUM, BASIC, BASIC WL, EN16931, EXTENDED)
- **Deployment**: Container-based on NixOS (as per NixOS config section in spec)
- **PDF text extraction**: REQUIRED for pdf_abgleich check (explicitly in spec)
- **File size limit**: Handle as error for >10MB (spec mentions this edge case)
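The file size default can be sketched together with the base64 decoding step of `POST /extract`. This is a minimal sketch under stated assumptions: "10MB" is read as 10 MiB, the limit applies to the decoded bytes, and `PdfPayloadError` and `decode_pdf` are hypothetical names that do not come from the spec.

```python
import base64
import binascii

# Assumption: "10MB" is interpreted as 10 MiB of decoded PDF data.
MAX_PDF_BYTES = 10 * 1024 * 1024


class PdfPayloadError(ValueError):
    """Hypothetical error type for rejected uploads."""


def decode_pdf(pdf_base64: str) -> bytes:
    """Decode a base64 payload and apply basic sanity checks."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        data = base64.b64decode(pdf_base64, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise PdfPayloadError("payload is not valid base64") from exc
    if len(data) > MAX_PDF_BYTES:
        raise PdfPayloadError("PDF exceeds the 10 MB limit")
    if not data.startswith(b"%PDF-"):
        raise PdfPayloadError("payload does not look like a PDF")
    return data
```

Checking the size before handing the bytes to a PDF parser keeps oversized uploads from reaching the more expensive extraction step.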
### Guardrails (Must NOT Have)
- NO authentication middleware
- NO database/persistence
- NO caching layers
- NO rate limiting
- NO metrics endpoints (beyond /health)
- NO CLI interface
- NO web UI
- NO abstraction layers for "future extensibility"


@@ -0,0 +1,29 @@
# Decisions - zugferd-service
This file accumulates architectural and design decisions during execution.
## [2026-02-04T18:12:44.864Z] Session Start
Planning decisions from Prometheus session.
## Tech Stack
- Python 3.11+
- FastAPI for REST API
- Pydantic for data validation
- pytest for testing
- factur-x library for ZUGFeRD extraction
- pypdf for PDF text extraction
## Scope Lock
- NO authentication (stateless service)
- NO database
- NO caching
- NO rate limiting
- 10MB file size limit
- 0.01 EUR tolerance for calculations
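The 0.01 EUR tolerance could be applied as in the sketch below. It assumes amounts are handled as `Decimal` values and that the tolerance is inclusive; `amounts_match` and `check_grand_total` are hypothetical helper names, and the net-plus-tax rule is only one example of an amount check.

```python
from decimal import Decimal

# Tolerance in EUR, taken from the scope lock above.
TOLERANCE = Decimal("0.01")


def amounts_match(
    expected: Decimal, actual: Decimal, tolerance: Decimal = TOLERANCE
) -> bool:
    """Return True when two amounts differ by at most the tolerance."""
    # Assumption: a difference of exactly 0.01 still counts as a match.
    return abs(expected - actual) <= tolerance


def check_grand_total(
    net_total: Decimal, tax_total: Decimal, grand_total: Decimal
) -> bool:
    """Check that net + tax equals the stated grand total within tolerance."""
    return amounts_match(net_total + tax_total, grand_total)
```

Using `Decimal` rather than `float` avoids binary rounding noise that could otherwise push a correct invoice just past the tolerance.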
## Validation Rules
- pflichtfelder: Required fields check
- betraege: Amount calculations check
- ustid: VAT ID format check (no online validation)
- pdf_abgleich: XML vs PDF text comparison
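A format-only `ustid` check might look like the sketch below. The notes do not say which countries must be covered, so the pattern table is an assumption: it has specific rules for German (`DE` plus 9 digits) and Austrian (`ATU` plus 8 digits) IDs and a loose generic fallback for everything else. `is_valid_ustid_format` is a hypothetical name.

```python
import re

# Country-specific patterns; only DE and AT are spelled out in this sketch.
_PATTERNS = {
    "DE": re.compile(r"DE[0-9]{9}"),
    "AT": re.compile(r"ATU[0-9]{8}"),
}

# Assumed fallback: two-letter country prefix plus 2 to 12 alphanumerics.
_GENERIC = re.compile(r"[A-Z]{2}[0-9A-Z]{2,12}")


def is_valid_ustid_format(value: str) -> bool:
    """Check the shape of a VAT ID without any online lookup."""
    # Drop spaces, dots, and hyphens that people add for readability.
    normalized = re.sub(r"[\s.-]", "", value).upper()
    pattern = _PATTERNS.get(normalized[:2], _GENERIC)
    return pattern.fullmatch(normalized) is not None
```

A passing result only means the ID is well-formed; whether it is actually registered would require the online validation that is explicitly out of scope.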


@@ -0,0 +1,7 @@
# Issues - zugferd-service
This file accumulates problems, errors, and gotchas during execution.
## [2026-02-04T18:12:44.864Z] Session Start
Initial session started. No issues yet.


@@ -0,0 +1,44 @@
# Learnings - zugferd-service
This file accumulates conventions, patterns, and learnings during execution.
## [2026-02-04T18:12:44.864Z] Session Start
Initial session for ZUGFeRD-Service implementation.
## Framework Decisions
- FastAPI (user preference)
- Pydantic v2+ for data models
- pytest with pytest-asyncio for testing
- hatchling for build system
## Packaging Decisions
- pyproject.toml (modern Python packaging)
- Docker multi-stage build
- Nix flake-based packaging with buildPythonApplication
## Testing Decisions
- TDD (test-first) approach
- All acceptance criteria must be verifiable without human intervention
## [2026-02-04T19:14:00.000Z] Task 1: Project Scaffold
### hatchling Configuration Pattern
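- The package directory (`src`) does not match the project name, so the `[tool.hatch.build.targets.wheel]` section MUST be added explicitly
- Without it, hatchling cannot determine which files to ship in the wheel
- Config: `packages = ["src"]` ships the `src/` directory as the importable package `src`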
### Nix Environment Considerations
- The Nix store is read-only, so a standard `pip install` into the system environment fails
- Use temporary venv for verification: `python -m venv /tmp/test_env`
- Install to venv, verify imports, then cleanup
### Entry Point Documentation
- Functions referenced in `[project.scripts]` MUST have docstrings
- These are public API entry points (CLI commands)
- Example: `zugferd-service = "src.main:run"` -> run() needs docstring
### Module Docstring Convention
- Module-level docstrings: minimal, one line, describe purpose
- Entry point function docstrings: Args/Returns style for CLI documentation
- Both necessary for scaffolding clarity


@@ -0,0 +1,7 @@
# Problems - zugferd-service
This file accumulates unresolved blockers.
## [2026-02-04T18:12:44.864Z] Session Start
No unresolved problems at this time.

File diff suppressed because it is too large

35
pyproject.toml Normal file

@@ -0,0 +1,35 @@
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "zugferd-service"
version = "1.0.0"
description = "REST API for ZUGFeRD invoice extraction and validation"
requires-python = ">=3.11"
dependencies = [
    "fastapi>=0.109.0",
    "uvicorn>=0.27.0",
    "python-multipart>=0.0.6",
    "factur-x>=2.5",
    "pypdf>=4.0.0",
    "pydantic>=2.5.0",
    "lxml>=5.0.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "httpx>=0.27.0",
]

[project.scripts]
zugferd-service = "src.main:run"

[tool.hatch.build.targets.wheel]
packages = ["src"]

[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]

3
src/__init__.py Normal file

@@ -0,0 +1,3 @@
"""ZUGFeRD Service Package."""
__version__ = "1.0.0"

3
src/extractor.py Normal file

@@ -0,0 +1,3 @@
"""ZUGFeRD extraction module."""
pass

20
src/main.py Normal file

@@ -0,0 +1,20 @@
"""FastAPI application for ZUGFeRD invoice processing."""

import uvicorn
from fastapi import FastAPI

app = FastAPI(
    title="ZUGFeRD Service",
    version="1.0.0",
    description="REST API for ZUGFeRD invoice extraction and validation",
)


def run(host: str = "0.0.0.0", port: int = 5000) -> None:
    """Run the FastAPI application.

    Args:
        host: Host to bind to.
        port: Port to listen on.
    """
    uvicorn.run(app, host=host, port=port)

3
src/models.py Normal file

@@ -0,0 +1,3 @@
"""Pydantic models for ZUGFeRD service."""
pass

3
src/pdf_parser.py Normal file

@@ -0,0 +1,3 @@
"""PDF text parsing module."""
pass

3
src/utils.py Normal file

@@ -0,0 +1,3 @@
"""Utility functions for ZUGFeRD service."""
pass

3
src/validator.py Normal file

@@ -0,0 +1,3 @@
"""Validation module for ZUGFeRD invoices."""
pass

1
tests/__init__.py Normal file

@@ -0,0 +1 @@
"""Test suite for ZUGFeRD service."""

9
tests/conftest.py Normal file

@@ -0,0 +1,9 @@
"""Pytest configuration and shared fixtures."""

import pytest


@pytest.fixture
def sample_pdf_bytes():
    """Fixture providing sample PDF bytes for testing."""
    return b"%PDF-1.4\n%fake pdf content for testing\n%%EOF"

0
tests/fixtures/.gitkeep vendored Normal file