ZUGFeRD-Service: Python REST API for Invoice Extraction and Validation
TL;DR
Quick Summary: Build a stateless FastAPI service that extracts structured data from ZUGFeRD/Factur-X invoices embedded in PDFs, validates invoice correctness, and is packaged for both Docker and NixOS deployment.
Deliverables:
- Complete FastAPI application with 3 endpoints (`/health`, `/extract`, `/validate`)
- Pydantic models for all request/response schemas
- ZUGFeRD extraction pipeline using factur-x library
- 4 validation checks (pflichtfelder, betraege, ustid, pdf_abgleich)
- Docker multi-stage build (production-ready)
- Nix flake packaging (buildPythonApplication)
- Comprehensive test suite (TDD, pytest)
- README documentation with installation guide
Estimated Effort: Medium-Large (20-30 tasks)
Parallel Execution: YES - 6 waves
Critical Path: Project Setup → Models → Extractor → Validator → API → Packaging
Context
Original Request
Build a Python-based REST API service for ZUGFeRD invoice extraction and validation. The service runs as a Docker container on NixOS and should also be packaged as a Nix flake. Key requirements include handling all ZUGFeRD profiles (MINIMUM through EXTENDED), validating invoice amounts and required fields, and comparing XML data against PDF text.
Interview Summary
Key Discussions:
- Framework choice: FastAPI (user preference, excellent for REST APIs)
- Project format: pyproject.toml with hatchling (modern, Nix-friendly)
- Nix packaging: Flake-based approach with buildPythonApplication
- Testing strategy: TDD with pytest, source official sample PDFs
- Authentication: Open endpoints (no auth required)
- File handling: 10MB limit, reject password-protected PDFs
- Calculations: Standard rounding, 0.01 EUR tolerance
Research Findings:
- factur-x library provides `get_xml_from_pdf()`, `get_level()`, and `get_flavor()` for extraction
- Nix packaging follows the mem0 pattern: pyproject=true, pythonRelaxDeps=true
- UN/ECE unit codes require mapping dictionary (C62→Piece, KGM→Kilogram, etc.)
- ZUGFeRD profiles detected via GuidelineSpecifiedDocumentContextParameter in XML
Metis Review
Identified Gaps (addressed):
- Password-protected PDFs → Reject with error
- File size limits → 10MB hard limit
- Decimal precision → Standard rounding
- Authentication → Confirmed open/stateless
Work Objectives
Core Objective
Create a production-ready, stateless REST API that extracts ZUGFeRD/Factur-X invoice data from PDFs, validates invoice correctness against business rules, and provides clear error reporting for invoice processing workflows.
Concrete Deliverables
- `src/main.py` - FastAPI application with all endpoints
- `src/models.py` - Pydantic models for all data structures
- `src/extractor.py` - ZUGFeRD extraction logic
- `src/validator.py` - Validation checks implementation
- `src/pdf_parser.py` - PDF text extraction for cross-validation
- `src/utils.py` - Helper functions (unit codes, decimal handling)
- `pyproject.toml` - Project configuration with hatchling
- `Dockerfile` - Multi-stage production build
- `docker-compose.yml` - Local development setup
- `flake.nix` - Nix flake packaging
- `tests/` - Complete test suite with fixtures
- `README.md` - Installation and usage documentation
Definition of Done
- `nix build .#zugferd-service` completes without errors
- `docker build -t zugferd-service .` produces an image <500MB
- `pytest` runs all tests with 100% pass rate
- `curl http://localhost:5000/health` returns `{"status": "healthy", "version": "1.0.0"}`
- All ZUGFeRD profiles correctly detected from sample PDFs
- All validation checks produce expected errors/warnings
Must Have
- All 3 API endpoints as specified
- Support for all ZUGFeRD profiles (MINIMUM, BASIC, BASIC WL, EN16931, EXTENDED)
- All 4 validation checks (pflichtfelder, betraege, ustid, pdf_abgleich)
- Structured JSON error responses with error codes
- UTF-8 encoding throughout
- Structured JSON logging
Must NOT Have (Guardrails)
- ❌ Authentication middleware (open endpoints)
- ❌ Database or persistence layer (stateless only)
- ❌ Caching layers
- ❌ Rate limiting
- ❌ Metrics endpoints beyond /health
- ❌ CLI interface
- ❌ Web UI or admin dashboard
- ❌ Batch processing or queue system
- ❌ Online USt-ID validation (format check only)
- ❌ Support for ZUGFeRD 1.x (only 2.x)
- ❌ Abstraction layers for "future extensibility"
Verification Strategy
UNIVERSAL RULE: ZERO HUMAN INTERVENTION
ALL tasks in this plan MUST be verifiable WITHOUT any human action. Every verification step is executed by the agent using tools (curl, pytest, nix, docker).
Test Decision
- Infrastructure exists: NO (new project)
- Automated tests: TDD (test-first)
- Framework: pytest with pytest-asyncio
TDD Workflow
Each implementation task follows RED-GREEN-REFACTOR:
- RED: Write failing test first
- GREEN: Implement minimum code to pass
- REFACTOR: Clean up while keeping tests green
Test Infrastructure Setup (Task 0)
# Install pytest and test dependencies
pip install pytest pytest-asyncio httpx
# Verify pytest works
pytest --version
# Run initial test
pytest tests/ -v
Agent-Executed QA Scenarios (MANDATORY)
All verifications use:
- API Testing: `curl` commands with JSON assertions
- Docker Testing: `docker build` and `docker run` commands
- Nix Testing: `nix build` and `nix run` commands
- Unit Testing: `pytest` with specific test file targets
Execution Strategy
Parallel Execution Waves
Wave 1 (Start Immediately):
├── Task 1: Project scaffold (pyproject.toml, directories)
├── Task 2: Download ZUGFeRD sample PDFs
└── Task 3: Create Pydantic models
Wave 2 (After Wave 1):
├── Task 4: Extractor unit tests + implementation
├── Task 5: PDF parser unit tests + implementation
└── Task 6: Utils (unit codes, decimal handling)
Wave 3 (After Wave 2):
├── Task 7: Validator unit tests + implementation
├── Task 8: FastAPI app structure
└── Task 9: Health endpoint
Wave 4 (After Wave 3):
├── Task 10: Extract endpoint
├── Task 11: Validate endpoint
└── Task 12: Error handling middleware
Wave 5 (After Wave 4):
├── Task 13: Integration tests
├── Task 14: Dockerfile
└── Task 15: docker-compose.yml
Wave 6 (After Wave 5):
├── Task 16: Nix flake packaging
├── Task 17: NixOS module example
└── Task 18: README documentation
Critical Path: Task 1 → Task 4 → Task 7 → Task 10 → Task 13 → Task 16
Dependency Matrix
| Task | Depends On | Blocks | Can Parallelize With |
|---|---|---|---|
| 1 | None | 2-18 | None (must be first) |
| 2 | 1 | 4, 5 | 3 |
| 3 | 1 | 4, 5, 7 | 2 |
| 4 | 2, 3 | 7, 10 | 5, 6 |
| 5 | 2 | 7 | 4, 6 |
| 6 | 1 | 4, 7 | 4, 5 |
| 7 | 4, 5, 6 | 11 | 8, 9 |
| 8 | 3 | 9, 10, 11 | 7 |
| 9 | 8 | 10, 11 | 7 |
| 10 | 4, 8, 9 | 13 | 11, 12 |
| 11 | 7, 8, 9 | 13 | 10, 12 |
| 12 | 8 | 13 | 10, 11 |
| 13 | 10, 11, 12 | 14, 16 | None |
| 14 | 13 | 16 | 15 |
| 15 | 13 | None | 14 |
| 16 | 14 | 17, 18 | None |
| 17 | 16 | 18 | None |
| 18 | 16, 17 | None | None |
TODOs
Wave 1: Project Foundation
- 1. Project Scaffold and Configuration
What to do:
- Create project directory structure as specified
- Create `pyproject.toml` with the hatchling build system
- Configure pytest in `pyproject.toml`
- Create `src/__init__.py` and `tests/__init__.py`
- Set the Python version requirement (3.11+)
Must NOT do:
- Do NOT add optional dependencies
- Do NOT create README yet (separate task)
- Do NOT add pre-commit hooks
Recommended Agent Profile:
- Category: `quick` - Reason: Straightforward file creation with established patterns
- Skills: [`git-master`]
  - `git-master`: For proper initial commit after scaffold
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Sequential (must be first)
- Blocks: All subsequent tasks
- Blocked By: None
References:
- Pattern References: mem0 pyproject.toml pattern (from librarian research)
- External References: https://hatch.pypa.io/latest/config/metadata/ (hatchling docs)
Files to Create:
pyproject.toml
src/__init__.py
src/main.py (placeholder)
src/models.py (placeholder)
src/extractor.py (placeholder)
src/validator.py (placeholder)
src/pdf_parser.py (placeholder)
src/utils.py (placeholder)
tests/__init__.py
tests/conftest.py (pytest fixtures)
tests/fixtures/.gitkeep

pyproject.toml Structure:

```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "zugferd-service"
version = "1.0.0"
description = "REST API for ZUGFeRD invoice extraction and validation"
requires-python = ">=3.11"
dependencies = [
    "fastapi>=0.109.0",
    "uvicorn>=0.27.0",
    "python-multipart>=0.0.6",
    "factur-x>=2.5",
    "pypdf>=4.0.0",
    "pydantic>=2.5.0",
    "lxml>=5.0.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-asyncio>=0.23.0",
    "httpx>=0.27.0",
]

[project.scripts]
zugferd-service = "src.main:run"

[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
```

Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: pyproject.toml is valid
  Tool: Bash
  Steps:
    1. python -c "import tomllib; tomllib.load(open('pyproject.toml', 'rb'))"
    2. Assert: Exit code 0
  Expected Result: TOML parses without error
  Evidence: Command output captured

Scenario: Project structure exists
  Tool: Bash
  Steps:
    1. ls -la src/
    2. Assert: main.py, models.py, extractor.py, validator.py, pdf_parser.py, utils.py exist
    3. ls -la tests/
    4. Assert: conftest.py, fixtures/ exist
  Expected Result: All required files present
  Evidence: Directory listing captured

Scenario: Dependencies install correctly
  Tool: Bash
  Steps:
    1. pip install -e ".[dev]"
    2. Assert: Exit code 0
    3. python -c "import fastapi; import facturx; import pypdf"
    4. Assert: Exit code 0
  Expected Result: All dependencies importable
  Evidence: pip output captured

Commit: YES
- Message: `feat(project): initialize ZUGFeRD service with pyproject.toml and directory structure`
- Files: All created files
- Pre-commit: `python -c "import tomllib; tomllib.load(open('pyproject.toml', 'rb'))"`
- 2. Download ZUGFeRD Sample PDFs
What to do:
- Download official ZUGFeRD sample PDFs from FeRD/ZUGFeRD repositories
- Include samples for: MINIMUM, BASIC, BASIC WL, EN16931, EXTENDED profiles
- Include a non-ZUGFeRD PDF for negative testing
- Store in `tests/fixtures/`
- Create a fixture manifest documenting each file
Must NOT do:
- Do NOT create synthetic PDFs
- Do NOT download more than 10 samples (keep focused)
Recommended Agent Profile:
- Category: `quick` - Reason: Download and organize files
- Skills: [] - No special skills needed
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Task 3)
- Blocks: Tasks 4, 5
- Blocked By: Task 1
References:
- External References:
Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: Sample PDFs exist for all profiles
  Tool: Bash
  Steps:
    1. ls tests/fixtures/*.pdf | wc -l
    2. Assert: At least 5 PDF files
    3. file tests/fixtures/*.pdf
    4. Assert: All files identified as "PDF document"
  Expected Result: Multiple valid PDF files in fixtures
  Evidence: File listing captured

Scenario: Fixture manifest documents samples
  Tool: Bash
  Steps:
    1. cat tests/fixtures/MANIFEST.md
    2. Assert: Contains entries for MINIMUM, BASIC, EN16931, EXTENDED
  Expected Result: Manifest describes all test fixtures
  Evidence: Manifest content captured

Commit: YES
- Message: `test(fixtures): add official ZUGFeRD sample PDFs for all profiles`
- Files: `tests/fixtures/*.pdf`, `tests/fixtures/MANIFEST.md`
- 3. Create Pydantic Models
What to do:
- Define all Pydantic models as per API specification
- Models: ExtractRequest, ExtractResponse, ValidateRequest, ValidateResponse
- Nested models: Supplier, Buyer, LineItem, Totals, VatBreakdown, PaymentTerms
- Error models: ErrorResponse, ValidationError
- Add field validators where appropriate (e.g., VAT ID format)
Must NOT do:
- Do NOT add ORM mappings (no database)
- Do NOT add serialization beyond Pydantic defaults
Recommended Agent Profile:
- Category: `quick` - Reason: Straightforward Pydantic model definitions from spec
- Skills: [] - Standard Python work
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Task 2)
- Blocks: Tasks 4, 5, 7
- Blocked By: Task 1
References:
- API Specification: User's detailed spec document (request/response schemas)
- Pattern References: https://docs.pydantic.dev/latest/concepts/models/
Key Models from Spec:
```python
class Supplier(BaseModel):
    name: str
    street: str | None = None
    postal_code: str | None = None
    city: str | None = None
    country: str | None = None
    vat_id: str | None = None
    email: str | None = None

class LineItem(BaseModel):
    position: int
    article_number: str | None = None
    article_number_buyer: str | None = None
    description: str
    quantity: float
    unit: str  # Human-readable, translated from UN/ECE code
    unit_price: float
    line_total: float
    vat_rate: float | None = None
    vat_amount: float | None = None

class ExtractResponse(BaseModel):
    is_zugferd: bool
    zugferd_profil: str | None = None
    xml_raw: str | None = None
    xml_data: XmlData | None = None
    pdf_text: str | None = None
    extraction_meta: ExtractionMeta
```

Acceptance Criteria:
TDD - Write Tests First:
```python
# tests/test_models.py
def test_extract_response_zugferd():
    response = ExtractResponse(
        is_zugferd=True,
        zugferd_profil="EN16931",
        xml_raw="<?xml ...",
        # ... full data
    )
    assert response.is_zugferd is True
    assert response.zugferd_profil == "EN16931"

def test_extract_response_non_zugferd():
    response = ExtractResponse(
        is_zugferd=False,
        pdf_text="Invoice text...",
        extraction_meta=ExtractionMeta(pages=1, extraction_time_ms=50)
    )
    assert response.xml_data is None

def test_line_item_validation():
    item = LineItem(
        position=1,
        description="Widget",
        quantity=10.0,
        unit="Piece",
        unit_price=9.99,
        line_total=99.90
    )
    assert item.line_total == 99.90
```

Agent-Executed QA Scenarios:
Scenario: Models pass all unit tests
  Tool: Bash
  Steps:
    1. pytest tests/test_models.py -v
    2. Assert: All tests pass (exit code 0)
  Expected Result: 100% test pass rate
  Evidence: pytest output captured

Scenario: Models serialize to JSON correctly
  Tool: Bash
  Steps:
    1. python -c "from src.models import ExtractResponse, ExtractionMeta; r = ExtractResponse(is_zugferd=False, extraction_meta=ExtractionMeta(pages=1, extraction_time_ms=50)); print(r.model_dump_json())"
    2. Assert: Valid JSON output containing "is_zugferd": false
  Expected Result: JSON serialization works
  Evidence: JSON output captured

Commit: YES
- Message: `feat(models): add Pydantic models for all API request/response schemas`
- Files: `src/models.py`, `tests/test_models.py`
- Pre-commit: `pytest tests/test_models.py`
Wave 2: Core Extraction Logic
- 4. ZUGFeRD Extractor Implementation (TDD)
What to do:
- Write tests first using sample PDFs from fixtures
- Implement the `extract_zugferd()` function using the factur-x library
- Handle profile detection (get_level, get_flavor)
- Parse XML to structured data matching models
- Handle non-ZUGFeRD PDFs gracefully
- Handle errors (corrupt PDF, password-protected)
Must NOT do:
- Do NOT implement custom XML parsing if factur-x provides it
- Do NOT cache extracted data
Recommended Agent Profile:
- Category: `unspecified-high` - Reason: Core business logic requiring careful implementation
- Skills: [`systematic-debugging`]
  - `systematic-debugging`: For handling edge cases in PDF extraction
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Tasks 5, 6)
- Blocks: Tasks 7, 10
- Blocked By: Tasks 2, 3
References:
- Library Docs: https://github.com/akretion/factur-x
- Pattern References: factur-x get_xml_from_pdf usage (from librarian research)
- Test Fixtures: `tests/fixtures/*.pdf`
Key Functions:
```python
from facturx import get_xml_from_pdf, get_level, get_flavor

def extract_zugferd(pdf_bytes: bytes) -> ExtractResponse:
    """Extract ZUGFeRD data from PDF bytes."""
    # 1. Check file size (<10MB)
    # 2. Try to extract XML using factur-x
    # 3. Detect profile and flavor
    # 4. Parse XML to structured data
    # 5. Extract PDF text for pdf_abgleich
    # 6. Return ExtractResponse
```

Acceptance Criteria:
TDD - Write Tests First:
```python
# tests/test_extractor.py
def test_extract_en16931_profile(sample_en16931_pdf):
    result = extract_zugferd(sample_en16931_pdf)
    assert result.is_zugferd is True
    assert result.zugferd_profil == "EN16931"
    assert result.xml_data is not None
    assert result.xml_data.invoice_number is not None

def test_extract_non_zugferd_pdf(sample_plain_pdf):
    result = extract_zugferd(sample_plain_pdf)
    assert result.is_zugferd is False
    assert result.xml_data is None
    assert result.pdf_text is not None

def test_extract_corrupt_pdf():
    with pytest.raises(ExtractionError) as exc:
        extract_zugferd(b"not a pdf")
    assert exc.value.error_code == "invalid_pdf"

def test_file_size_limit():
    large_pdf = b"x" * (11 * 1024 * 1024)  # 11MB
    with pytest.raises(ExtractionError) as exc:
        extract_zugferd(large_pdf)
    assert exc.value.error_code == "file_too_large"
```

Agent-Executed QA Scenarios:
Scenario: Extractor passes all unit tests
  Tool: Bash
  Steps:
    1. pytest tests/test_extractor.py -v
    2. Assert: All tests pass
  Expected Result: 100% test pass rate
  Evidence: pytest output captured

Scenario: EN16931 profile correctly detected
  Tool: Bash
  Steps:
    1. python -c "
       from src.extractor import extract_zugferd
       with open('tests/fixtures/sample_en16931.pdf', 'rb') as f:
           result = extract_zugferd(f.read())
       print(f'Profile: {result.zugferd_profil}')
       print(f'Invoice: {result.xml_data.invoice_number}')"
    2. Assert: Output contains "Profile: EN16931" or "Profile: en16931"
  Expected Result: Profile correctly identified
  Evidence: Script output captured

Scenario: Non-ZUGFeRD PDF handled gracefully
  Tool: Bash
  Steps:
    1. python -c "
       from src.extractor import extract_zugferd
       with open('tests/fixtures/sample_no_zugferd.pdf', 'rb') as f:
           result = extract_zugferd(f.read())
       assert result.is_zugferd == False
       assert result.pdf_text is not None
       print('OK: Non-ZUGFeRD handled correctly')"
    2. Assert: Output contains "OK:"
  Expected Result: Graceful handling
  Evidence: Script output captured

Commit: YES
- Message: `feat(extractor): implement ZUGFeRD extraction with profile detection`
- Files: `src/extractor.py`, `tests/test_extractor.py`
- Pre-commit: `pytest tests/test_extractor.py`
- 5. PDF Text Parser Implementation (TDD)
What to do:
- Write tests first with expected extraction patterns
- Implement PDF text extraction using pypdf
- Create regex patterns for extracting key values from text
- Extract: invoice_number, invoice_date, amounts (net, gross, vat)
- Return confidence scores for each extraction
Must NOT do:
- Do NOT use OCR (text extraction only)
- Do NOT parse images in PDFs
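Since German invoices write amounts as `1.234,56` (dot as thousands separator, comma as decimal separator), the parser needs a small normalization helper. `parse_german_amount` is a hypothetical name, not from the spec:

```python
def parse_german_amount(raw: str) -> float:
    """Parse a German-formatted amount like '1.234,56' into 1234.56."""
    # Strip thousands dots first, then turn the decimal comma into a dot.
    cleaned = raw.strip().replace(".", "").replace(",", ".")
    return float(cleaned)
```

Note the deliberate limitation: an English-formatted amount like `1,234.56` would be misread, so this helper should only be applied to matches from the German regex patterns.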
Recommended Agent Profile:
- Category: `unspecified-high` - Reason: Regex pattern development requires careful testing
- Skills: [`systematic-debugging`]
  - `systematic-debugging`: For developing and testing regex patterns
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Tasks 4, 6)
- Blocks: Task 7
- Blocked By: Task 2
References:
- User Spec: Regex patterns provided in spec (invoice_number, gross_amount patterns)
- Library Docs: https://pypdf.readthedocs.io/en/stable/
Key Patterns from Spec:
```python
EXTRACTION_PATTERNS = {
    "invoice_number": [
        r"Rechnungs?-?(?:Nr|Nummer)[.:\s]*([A-Z0-9\-]+)",
        r"Invoice\s*(?:No|Number)?[.:\s]*([A-Z0-9\-]+)",
        r"Beleg-?Nr[.:\s]*([A-Z0-9\-]+)"
    ],
    "gross_amount": [
        r"Brutto[:\s]*([0-9.,]+)\s*(?:EUR|€)?",
        r"Gesamtbetrag[:\s]*([0-9.,]+)",
        r"Total[:\s]*([0-9.,]+)\s*(?:EUR|€)?"
    ],
    # ... more patterns
}
```

Acceptance Criteria:
TDD - Write Tests First:
```python
# tests/test_pdf_parser.py
def test_extract_invoice_number():
    text = "Rechnung\nRechnungs-Nr.: RE-2025-001234\nDatum: 04.02.2025"
    result = extract_from_text(text)
    assert result["invoice_number"] == "RE-2025-001234"
    assert result["invoice_number_confidence"] >= 0.9

def test_extract_amounts():
    text = "Netto: 100,00 EUR\nMwSt 19%: 19,00 EUR\nBrutto: 119,00 EUR"
    result = extract_from_text(text)
    assert result["gross_amount"] == 119.00
    assert result["net_amount"] == 100.00

def test_german_number_format():
    text = "Gesamtbetrag: 1.234,56 €"
    result = extract_from_text(text)
    assert result["gross_amount"] == 1234.56
```

Agent-Executed QA Scenarios:
Scenario: PDF parser passes all unit tests
  Tool: Bash
  Steps:
    1. pytest tests/test_pdf_parser.py -v
    2. Assert: All tests pass
  Expected Result: 100% test pass rate
  Evidence: pytest output captured

Scenario: Real PDF text extraction works
  Tool: Bash
  Steps:
    1. python -c "
       from src.pdf_parser import extract_text_from_pdf, extract_from_text
       with open('tests/fixtures/sample_en16931.pdf', 'rb') as f:
           text = extract_text_from_pdf(f.read())
       print(f'Extracted {len(text)} characters')
       result = extract_from_text(text)
       print(f'Invoice: {result.get(\"invoice_number\", \"NOT FOUND\")}')"
    2. Assert: Output shows extracted characters and invoice number
  Expected Result: Text extraction works on real PDFs
  Evidence: Script output captured

Commit: YES
- Message: `feat(pdf-parser): implement PDF text extraction with regex patterns`
- Files: `src/pdf_parser.py`, `tests/test_pdf_parser.py`
- Pre-commit: `pytest tests/test_pdf_parser.py`
- 6. Utility Functions Implementation
What to do:
- Create UN/ECE unit code mapping dictionary
- Implement decimal rounding helper (2 decimal places, standard rounding)
- Implement tolerance comparison function (0.01 EUR)
- Add German date format parser
Must NOT do:
- Do NOT create configurable tolerance (hardcode 0.01)
- Do NOT add unit codes beyond common ones (expand later if needed)
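A minimal sketch of the decimal helpers this task describes. The function names follow the QA scenarios; interpreting "standard rounding" as commercial half-up rounding is an assumption:

```python
from decimal import Decimal, ROUND_HALF_UP

TOLERANCE = Decimal("0.01")  # hardcoded per spec: 0.01 EUR

def round_amount(value: float) -> float:
    """Round to 2 decimal places using half-up ("standard") rounding."""
    # Going through str() avoids binary-float artifacts like 2.675 -> 2.67.
    return float(Decimal(str(value)).quantize(Decimal("0.01"),
                                              rounding=ROUND_HALF_UP))

def amounts_match(a: float, b: float) -> bool:
    """True if two amounts agree within the 0.01 EUR tolerance."""
    return abs(Decimal(str(a)) - Decimal(str(b))) <= TOLERANCE
```

Using `Decimal` for the comparison keeps the tolerance check exact even for values that have no precise binary representation.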
Recommended Agent Profile:
- Category: `quick` - Reason: Simple utility functions
- Skills: [] - Standard Python work
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Tasks 4, 5)
- Blocks: Tasks 4, 7
- Blocked By: Task 1
References:
Unit Code Dictionary:
```python
UNECE_UNIT_CODES = {
    "C62": "Stück",
    "H87": "Stück",
    "KGM": "Kilogramm",
    "GRM": "Gramm",
    "TNE": "Tonne",
    "MTR": "Meter",
    "KMT": "Kilometer",
    "MTK": "Quadratmeter",
    "LTR": "Liter",
    "MLT": "Milliliter",
    "DAY": "Tag",
    "HUR": "Stunde",
    "MON": "Monat",
    "ANN": "Jahr",
    "SET": "Set",
    "PCE": "Stück",
    "EA": "Stück",
}
```

Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: Unit code translation works
  Tool: Bash
  Steps:
    1. python -c "
       from src.utils import translate_unit_code
       assert translate_unit_code('C62') == 'Stück'
       assert translate_unit_code('KGM') == 'Kilogramm'
       assert translate_unit_code('UNKNOWN') == 'UNKNOWN'
       print('OK: Unit codes translate correctly')"
    2. Assert: Output contains "OK:"
  Expected Result: All translations correct
  Evidence: Script output captured

Scenario: Decimal comparison with tolerance
  Tool: Bash
  Steps:
    1. python -c "
       from src.utils import amounts_match
       assert amounts_match(100.00, 100.00) == True
       assert amounts_match(100.00, 100.005) == True   # Within 0.01
       assert amounts_match(100.00, 100.02) == False   # Outside tolerance
       print('OK: Tolerance comparison works')"
    2. Assert: Output contains "OK:"
  Expected Result: Tolerance logic correct
  Evidence: Script output captured

Commit: YES
- Message: `feat(utils): add unit code mapping and decimal utilities`
- Files: `src/utils.py`, `tests/test_utils.py`
- Pre-commit: `pytest tests/test_utils.py`
Wave 3: Validation Logic
- 7. Validator Implementation (TDD)
What to do:
- Write tests first for each validation check
- Implement all 4 validation checks:
  - `pflichtfelder` - Required fields check
  - `betraege` - Amount calculations check
  - `ustid` - VAT ID format check
  - `pdf_abgleich` - XML vs PDF comparison
- Return structured ValidationResult with errors and warnings
Must NOT do:
- Do NOT implement online USt-ID validation
- Do NOT add validation checks beyond spec
Recommended Agent Profile:
- Category: `unspecified-high` - Reason: Core business logic with multiple validation rules
- Skills: [`systematic-debugging`]
  - `systematic-debugging`: For handling edge cases in validation logic
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 3 (with Tasks 8, 9)
- Blocks: Task 11
- Blocked By: Tasks 4, 5, 6
References:
- User Spec: Detailed validation logic tables
- Pattern References: src/models.py ValidationError model
Validation Rules from Spec:
pflichtfelder (Required Fields):
| Field | Severity |
|---|---|
| invoice_number | critical |
| invoice_date | critical |
| supplier.name | critical |
| supplier.vat_id | critical |
| buyer.name | critical |
| totals.net | critical |
| totals.gross | critical |
| totals.vat_total | critical |
| due_date | warning |
| payment_terms.iban | warning |
| line_items (min 1) | critical |

betraege (Calculations):
- line_total ≈ quantity × unit_price (±0.01)
- totals.net ≈ Σ(line_items.line_total) (±0.01)
- vat_breakdown.amount ≈ base × (rate/100) (±0.01)
- totals.vat_total ≈ Σ(vat_breakdown.amount) (±0.01)
- totals.gross ≈ totals.net + totals.vat_total (±0.01)
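The betraege rules above can be sketched as a standalone check. This is illustrative only: it works on plain dicts instead of the project's Pydantic `XmlData` model, and `check_betraege` is a hypothetical name:

```python
TOLERANCE = 0.01  # spec: amounts must agree within 0.01 EUR

def check_betraege(invoice: dict) -> list[str]:
    """Return a list of calculation-mismatch messages (empty = valid)."""
    errors: list[str] = []
    net_sum = 0.0
    for item in invoice["line_items"]:
        expected = item["quantity"] * item["unit_price"]
        if abs(expected - item["line_total"]) > TOLERANCE:
            errors.append(f"line {item['position']}: expected {expected:.2f}")
        net_sum += item["line_total"]
    totals = invoice["totals"]
    if abs(net_sum - totals["net"]) > TOLERANCE:
        errors.append("totals.net does not match sum of line totals")
    if abs(totals["net"] + totals["vat_total"] - totals["gross"]) > TOLERANCE:
        errors.append("totals.gross != totals.net + totals.vat_total")
    return errors
```

The vat_breakdown cross-checks from the list above would slot in the same way; they are omitted here to keep the sketch short.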
ustid (VAT ID Format):
| Country | Regex |
|---|---|
| DE | `^DE[0-9]{9}$` |
| AT | `^ATU[0-9]{8}$` |
| CH | `^CHE[0-9]{9}(MWST\|TVA\|IVA)$` |

Acceptance Criteria:
TDD - Write Tests First:
```python
# tests/test_validator.py
def test_pflichtfelder_missing_invoice_number():
    data = XmlData(invoice_number=None, ...)
    result = validate_pflichtfelder(data)
    assert any(e.field == "invoice_number" for e in result.errors)
    assert any(e.severity == "critical" for e in result.errors)

def test_betraege_calculation_mismatch():
    data = XmlData(
        line_items=[LineItem(quantity=10, unit_price=9.99, line_total=100.00)],  # Wrong!
        ...
    )
    result = validate_betraege(data)
    assert any(e.error_code == "calculation_mismatch" for e in result.errors)
    assert any("99.90" in e.message for e in result.errors)

def test_ustid_valid_german():
    result = validate_ustid("DE123456789")
    assert result.is_valid is True

def test_ustid_invalid_format():
    result = validate_ustid("DE12345")  # Too short
    assert result.is_valid is False
    assert result.error_code == "invalid_format"

def test_pdf_abgleich_mismatch():
    xml_data = XmlData(invoice_number="RE-001", totals=Totals(gross=118.88))
    pdf_values = {"invoice_number": "RE-002", "gross_amount": 118.88}
    result = validate_pdf_abgleich(xml_data, pdf_values)
    assert any(e.field == "invoice_number" for e in result.errors)
```

Agent-Executed QA Scenarios:
Scenario: Validator passes all unit tests
  Tool: Bash
  Steps:
    1. pytest tests/test_validator.py -v
    2. Assert: All tests pass
  Expected Result: 100% test pass rate
  Evidence: pytest output captured

Scenario: All validation checks work together
  Tool: Bash
  Steps:
    1. python -c "
       from src.validator import validate_invoice
       from src.models import XmlData, ValidateRequest
       # Create a request with known issues
       request = ValidateRequest(
           xml_data={'invoice_number': None, ...},
           checks=['pflichtfelder', 'betraege']
       )
       result = validate_invoice(request)
       print(f'Errors: {len(result.errors)}')"
    2. Assert: Script runs without error, shows error count
  Expected Result: Validator processes all checks
  Evidence: Script output captured

Commit: YES
- Message: `feat(validator): implement all validation checks (pflichtfelder, betraege, ustid, pdf_abgleich)`
- Files: `src/validator.py`, `tests/test_validator.py`
- Pre-commit: `pytest tests/test_validator.py`
Wave 3 (continued): API Foundation
- 8. FastAPI Application Structure
What to do:
- Create FastAPI app instance in main.py
- Configure exception handlers for custom errors
- Set up structured JSON logging
- Add CORS middleware (for local development)
- Configure app metadata (title, version, description)
Must NOT do:
- Do NOT add authentication middleware
- Do NOT add rate limiting
Recommended Agent Profile:
- Category: `quick` - Reason: Standard FastAPI setup
- Skills: [] - Standard FastAPI patterns
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 3 (with Tasks 7, 9)
- Blocks: Tasks 9, 10, 11
- Blocked By: Task 3
References:
- FastAPI Docs: https://fastapi.tiangolo.com/
- Pattern References: User spec error handling table
Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: FastAPI app starts
  Tool: Bash
  Steps:
    1. timeout 5 uvicorn src.main:app --port 5001 &
    2. sleep 2
    3. curl -s http://localhost:5001/openapi.json | head -c 100
    4. Assert: Output contains "openapi"
    5. kill %1
  Expected Result: OpenAPI schema accessible
  Evidence: curl output captured

Commit: YES
- Message: `feat(api): create FastAPI application structure with error handling`
- Files: `src/main.py`
- Pre-commit: `python -c "from src.main import app; print(app.title)"`
- 9. Health Endpoint Implementation
What to do:
- Implement the `GET /health` endpoint
- Return the version from pyproject.toml
- Return simple health status
Must NOT do:
- Do NOT add complex health checks (no dependencies to check)
Recommended Agent Profile:
- Category: `quick` - Reason: Simple endpoint
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 3 (with Tasks 7, 8)
- Blocks: Tasks 10, 11
- Blocked By: Task 8
References:
- User Spec: Health endpoint response format
Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: Health endpoint returns correct format
  Tool: Bash
  Steps:
    1. uvicorn src.main:app --port 5002 &
    2. sleep 2
    3. curl -s http://localhost:5002/health
    4. Assert: Response contains {"status": "healthy", "version": "1.0.0"}
    5. kill %1
  Expected Result: Health check passes
  Evidence: curl output captured

Commit: YES (groups with Task 8)
- Message: `feat(api): add health endpoint`
- Files: `src/main.py`
Wave 4: API Endpoints
- 10. Extract Endpoint Implementation (TDD)
What to do:
- Write integration tests for the `/extract` endpoint
- Implement the `POST /extract` endpoint
- Accept base64-encoded PDF in JSON body
- Use extractor module for extraction
- Return structured ExtractResponse
Must NOT do:
- Do NOT accept multipart file uploads (JSON only per spec)
Recommended Agent Profile:
- Category: `unspecified-high` - Reason: Integration of extractor into API endpoint
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 4 (with Tasks 11, 12)
- Blocks: Task 13
- Blocked By: Tasks 4, 8, 9
References:
- User Spec: /extract request/response format
- Pattern References: src/extractor.py
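The base64 handling this endpoint needs can be sketched independently of FastAPI. `ApiError`, `decode_pdf_base64`, and the `%PDF-` magic-byte pre-check are illustrative assumptions; the extractor remains the authority on whether the bytes are a usable PDF:

```python
import base64
import binascii

class ApiError(Exception):
    """Hypothetical error type carrying the spec's error_code values."""
    def __init__(self, error_code: str, status: int = 400):
        super().__init__(error_code)
        self.error_code = error_code
        self.status = status

MAX_SIZE = 10 * 1024 * 1024  # 10 MB hard limit from the spec

def decode_pdf_base64(pdf_base64: str) -> bytes:
    """Decode the request payload, mapping failures to spec error codes."""
    try:
        pdf_bytes = base64.b64decode(pdf_base64, validate=True)
    except (binascii.Error, ValueError):
        raise ApiError("invalid_base64")
    if len(pdf_bytes) > MAX_SIZE:
        raise ApiError("file_too_large")
    if not pdf_bytes.startswith(b"%PDF-"):
        # Cheap pre-check only; corrupt PDFs past this point are still
        # caught by the extractor.
        raise ApiError("invalid_pdf")
    return pdf_bytes
```

With `validate=True`, non-alphabet characters raise instead of being silently dropped, which is what the `invalid_base64` test case relies on.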
Acceptance Criteria:
TDD - Write Tests First:
```python
# tests/test_api.py
@pytest.fixture
def client():
    return TestClient(app)

def test_extract_zugferd_pdf(client, sample_en16931_pdf_base64):
    response = client.post("/extract", json={"pdf_base64": sample_en16931_pdf_base64})
    assert response.status_code == 200
    data = response.json()
    assert data["is_zugferd"] is True
    assert data["zugferd_profil"] is not None

def test_extract_invalid_base64(client):
    response = client.post("/extract", json={"pdf_base64": "not-valid-base64!!!"})
    assert response.status_code == 400
    assert response.json()["error"] == "invalid_base64"

def test_extract_non_pdf(client):
    # Base64 of "Hello World" (not a PDF)
    response = client.post("/extract", json={"pdf_base64": "SGVsbG8gV29ybGQ="})
    assert response.status_code == 400
    assert response.json()["error"] == "invalid_pdf"
```

Agent-Executed QA Scenarios:
Scenario: Extract endpoint integration test
  Tool: Bash
  Steps:
    1. pytest tests/test_api.py::test_extract -v
    2. Assert: All extract tests pass
  Expected Result: Endpoint works correctly
  Evidence: pytest output captured

Scenario: Live extract endpoint test
  Tool: Bash
  Steps:
    1. uvicorn src.main:app --port 5003 &
    2. sleep 2
    3. PDF_BASE64=$(base64 -w 0 tests/fixtures/sample_en16931.pdf)
    4. curl -s -X POST http://localhost:5003/extract \
       -H "Content-Type: application/json" \
       -d "{\"pdf_base64\": \"$PDF_BASE64\"}" | jq '.is_zugferd'
    5. Assert: Output is "true"
    6. kill %1
  Expected Result: ZUGFeRD detected
  Evidence: curl output captured

Commit: YES
- Message: `feat(api): implement /extract endpoint for PDF processing`
- Files: `src/main.py`, `tests/test_api.py`
- Pre-commit: `pytest tests/test_api.py::test_extract`
- 11. Validate Endpoint Implementation (TDD)
What to do:
- Write integration tests for the `/validate` endpoint
- Implement the `POST /validate` endpoint
- Accept xml_data, pdf_text, and checks array
- Use validator module for validation
- Return structured ValidateResponse
Must NOT do:
- Do NOT run checks not in the request's checks array
Recommended Agent Profile:
- Category: `unspecified-high` - Reason: Integration of validator into API endpoint
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 4 (with Tasks 10, 12)
- Blocks: Task 13
- Blocked By: Tasks 7, 8, 9
References:
- User Spec: /validate request/response format
- Pattern References: src/validator.py
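The "run only the requested checks" rule can be sketched as a small dispatch table (names are illustrative, not the project's validator API):

```python
from typing import Callable

def run_checks(xml_data: dict, checks: list[str],
               registry: dict[str, Callable[[dict], list[str]]]) -> dict:
    """Run only the checks named in the request's checks array."""
    errors: list[str] = []
    for name in checks:
        if name not in registry:
            continue  # alternatively: reject unknown check names with a 400
        errors.extend(registry[name](xml_data))
    return {"is_valid": not errors, "errors": errors}
```

A usage sketch with a stand-in check: `run_checks({"invoice_number": None}, ["pflichtfelder"], {"pflichtfelder": check_fn})` returns `is_valid: false` when `check_fn` reports the missing field, and checks not listed in the request never execute.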
Acceptance Criteria:
TDD - Write Tests First:
```python
def test_validate_all_checks(client):
    response = client.post("/validate", json={
        "xml_data": {...},
        "pdf_text": "...",
        "checks": ["pflichtfelder", "betraege", "ustid", "pdf_abgleich"]
    })
    assert response.status_code == 200
    data = response.json()
    assert "is_valid" in data
    assert "errors" in data
    assert "summary" in data

def test_validate_partial_checks(client):
    response = client.post("/validate", json={
        "xml_data": {...},
        "checks": ["pflichtfelder"]
    })
    assert response.status_code == 200
    # Only pflichtfelder check should run
```

Agent-Executed QA Scenarios:
Scenario: Validate endpoint integration test
- Tool: Bash
- Steps:
  1. `pytest tests/test_api.py::test_validate -v`
  2. Assert: all validate tests pass
- Expected Result: Endpoint works correctly
- Evidence: pytest output captured

Scenario: Live validate endpoint with invalid data
- Tool: Bash
- Steps:
  1. `uvicorn src.main:app --port 5004 &`
  2. `sleep 2`
  3. `curl -s -X POST http://localhost:5004/validate -H "Content-Type: application/json" -d '{"xml_data": {"invoice_number": null}, "checks": ["pflichtfelder"]}' | jq '.is_valid'`
  4. Assert: output is `false`
  5. `kill %1`
- Expected Result: Validation detects missing field
- Evidence: curl output captured

Commit: YES
- Message: `feat(api): implement /validate endpoint for invoice validation`
- Files: `src/main.py`, `tests/test_api.py`
- Pre-commit: `pytest tests/test_api.py::test_validate`
12. Error Handling Middleware
What to do:
- Implement exception handlers for all error types
- Map exceptions to HTTP status codes and error responses
- Ensure all errors return JSON format
- Add request logging
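The spec's error table translates naturally into a single exception-to-response mapping. This is a hedged sketch: the custom exception class names are assumptions, and in `src/main.py` each entry would back a FastAPI `@app.exception_handler` returning a `JSONResponse` with this body.

```python
# Illustrative mapping from exception type to (status, error_code).
import binascii

class InvalidPdfError(Exception): pass
class CorruptPdfError(Exception): pass
class FileTooLargeError(Exception): pass
class PasswordProtectedPdfError(Exception): pass

# Order matters: binascii.Error subclasses ValueError, so it must be
# checked before the generic "invalid JSON" ValueError entry.
ERROR_MAP = [
    (binascii.Error, 400, "invalid_base64"),
    (InvalidPdfError, 400, "invalid_pdf"),
    (CorruptPdfError, 400, "corrupt_pdf"),
    (FileTooLargeError, 400, "file_too_large"),
    (PasswordProtectedPdfError, 400, "password_protected_pdf"),
    (ValueError, 400, "invalid_json"),
]

def to_error_response(exc):
    """Map an exception to (status_code, JSON body); never leak a stack trace."""
    for exc_type, status, code in ERROR_MAP:
        if isinstance(exc, exc_type):
            return status, {"error": code}
    # Anything unmapped falls through to a generic 500 with no internals exposed.
    return 500, {"error": "internal_error"}
```

Funneling every failure through one mapping also makes the "no stack traces in production" rule testable: the fallback branch returns only `internal_error`.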
Must NOT do:
- Do NOT expose stack traces in production
Recommended Agent Profile:
- Category: `quick`
- Reason: Standard FastAPI error handling
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 4 (with Tasks 10, 11)
- Blocks: Task 13
- Blocked By: Task 8
References:
- User Spec: Error handling table
Error Mapping from Spec:
| Error | Status | error_code |
|---|---|---|
| Invalid JSON | 400 | `invalid_json` |
| Not a PDF | 400 | `invalid_pdf` |
| PDF corrupt | 400 | `corrupt_pdf` |
| Base64 invalid | 400 | `invalid_base64` |
| File too large | 400 | `file_too_large` |
| Password protected | 400 | `password_protected_pdf` |
| Internal error | 500 | `internal_error` |

Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: Error responses are JSON
- Tool: Bash
- Steps:
  1. `uvicorn src.main:app --port 5005 &`
  2. `sleep 2`
  3. `curl -s -X POST http://localhost:5005/extract -H "Content-Type: application/json" -d '{"pdf_base64": "invalid!!!"}' | jq '.error'`
  4. Assert: output is `invalid_base64`
  5. `kill %1`
- Expected Result: JSON error response
- Evidence: curl output captured

Commit: YES (groups with Tasks 10, 11)
- Message: `feat(api): add comprehensive error handling middleware`
- Files: `src/main.py`
Wave 5: Packaging
13. Integration Tests
What to do:
- Create end-to-end integration tests
- Test full workflow: extract → validate
- Test with all sample PDFs
- Test error scenarios
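One building block the end-to-end tests need is assembling a valid `/extract` request body and exercising the size/format error paths. The sketch below is a hedged illustration; the helper name `build_extract_payload` and the exact error strings are assumptions, not part of the spec.

```python
# Illustrative test helper: encode a PDF for /extract, enforcing the spec's
# 10 MB upload limit and a basic PDF signature check.
import base64
import json

MAX_PDF_BYTES = 10 * 1024 * 1024  # 10 MB upload limit from the spec

def build_extract_payload(pdf_bytes):
    if len(pdf_bytes) > MAX_PDF_BYTES:
        raise ValueError("file_too_large")
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("invalid_pdf")
    return json.dumps({"pdf_base64": base64.b64encode(pdf_bytes).decode("ascii")})
```

Integration tests can then feed this payload to `/extract`, pass the extracted `xml_data` on to `/validate`, and assert on both the happy path and the rejected inputs.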
Must NOT do:
- Do NOT create performance tests
Recommended Agent Profile:
- Category: `unspecified-high`
- Reason: Integration testing requires comprehensive coverage
- Skills: [`systematic-debugging`]
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Sequential (depends on all endpoints)
- Blocks: Tasks 14, 16
- Blocked By: Tasks 10, 11, 12
Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: Full integration test suite passes
- Tool: Bash
- Steps:
  1. `pytest tests/ -v --tb=short`
  2. Assert: all tests pass (exit code 0)
- Expected Result: 100% test pass rate
- Evidence: pytest output captured

Commit: YES
- Message: `test(integration): add end-to-end integration tests`
- Files: `tests/test_integration.py`
- Pre-commit: `pytest tests/`
14. Dockerfile Creation
What to do:
- Create multi-stage Dockerfile as per spec
- Build stage: install dependencies
- Production stage: slim image with app
- Non-root user for security
- Expose port 5000
Must NOT do:
- Do NOT include dev dependencies in production image
Recommended Agent Profile:
- Category: `quick`
- Reason: Standard Dockerfile from spec template
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 5 (with Task 15)
- Blocks: Task 16
- Blocked By: Task 13
References:
- User Spec: Complete Dockerfile template provided
Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: Docker build succeeds
- Tool: Bash
- Steps:
  1. `docker build -t zugferd-service:test .`
  2. Assert: exit code 0
  3. `docker images zugferd-service:test --format "{{.Size}}"`
  4. Assert: size < 500 MB
- Expected Result: Image builds and is reasonably sized
- Evidence: build output captured

Scenario: Container runs and responds
- Tool: Bash
- Steps:
  1. `docker run -d --name zugferd-test -p 5006:5000 zugferd-service:test`
  2. `sleep 3`
  3. `curl -s http://localhost:5006/health | jq '.status'`
  4. Assert: output is `"healthy"`
  5. `docker stop zugferd-test && docker rm zugferd-test`
- Expected Result: Container is functional
- Evidence: curl output captured

Commit: YES
- Message: `build(docker): add multi-stage Dockerfile for production`
- Files: `Dockerfile`
- Pre-commit: `docker build -t zugferd-service:test .`
15. Docker Compose Configuration
What to do:
- Create docker-compose.yml for local development
- Include volume mount for live reload
- Configure environment variables
- Add health check
Must NOT do:
- Do NOT add additional services (no DB, no cache)
Recommended Agent Profile:
- Category: `quick`
- Reason: Simple docker-compose setup
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 5 (with Task 14)
- Blocks: None
- Blocked By: Task 13
Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: Docker Compose starts service
- Tool: Bash
- Steps:
  1. `docker-compose up -d`
  2. `sleep 5`
  3. `curl -s http://localhost:5000/health | jq '.status'`
  4. Assert: output is `"healthy"`
  5. `docker-compose down`
- Expected Result: Compose setup works
- Evidence: curl output captured

Commit: YES
- Message: `build(docker): add docker-compose.yml for local development`
- Files: `docker-compose.yml`
Wave 6: Nix Packaging
16. Nix Flake Packaging
What to do:
- Create flake.nix with buildPythonApplication
- Use pythonRelaxDeps for dependency flexibility
- Include devShell for development
- Test with nix build and nix run
Must NOT do:
- Do NOT create complex overlay structure
Recommended Agent Profile:
- Category: `unspecified-high`
- Reason: Nix packaging requires careful dependency handling
- Skills: []
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Sequential
- Blocks: Tasks 17, 18
- Blocked By: Task 14
References:
- Librarian Research: mem0 Nix packaging pattern
- External References: https://github.com/mem0ai/mem0 Nix package
flake.nix Structure:
```nix
{
  description = "ZUGFeRD REST API Service";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, flake-utils }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        pkgs = nixpkgs.legacyPackages.${system};
        pythonPackages = pkgs.python311Packages;
        zugferd-service = pythonPackages.buildPythonApplication {
          pname = "zugferd-service";
          version = "1.0.0";
          pyproject = true;
          src = ./.;
          pythonRelaxDeps = true;
          build-system = [ pythonPackages.hatchling ];
          dependencies = with pythonPackages; [
            fastapi
            uvicorn
            pydantic
            python-multipart
            # factur-x - may need packaging
            pypdf
            lxml
          ];
          nativeCheckInputs = with pythonPackages; [
            pytestCheckHook
            pytest-asyncio
            httpx
          ];
          passthru = { mainProgram = "zugferd-service"; };
          meta = {
            description = "REST API for ZUGFeRD invoice extraction";
            license = pkgs.lib.licenses.mit;
          };
        };
      in {
        packages.default = zugferd-service;
        packages.zugferd-service = zugferd-service;
        devShells.default = pkgs.mkShell {
          packages = [
            (pkgs.python311.withPackages (ps: with ps; [
              fastapi
              uvicorn
              pydantic
              pypdf
              lxml
              pytest
              pytest-asyncio
              httpx
            ]))
          ];
        };
      }
    );
}
```

Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: Nix flake builds successfully
- Tool: Bash
- Steps:
  1. `nix build .#zugferd-service`
  2. Assert: exit code 0
  3. `ls -la result/bin/`
  4. Assert: `zugferd-service` binary exists
- Expected Result: Nix package builds
- Evidence: build output captured

Scenario: Nix package runs correctly
- Tool: Bash
- Steps:
  1. `nix run .#zugferd-service &`
  2. `sleep 3`
  3. `curl -s http://localhost:5000/health | jq '.status'`
  4. Assert: output is `"healthy"`
  5. `kill %1`
- Expected Result: Nix-built service runs
- Evidence: curl output captured

Scenario: Dev shell provides dependencies
- Tool: Bash
- Steps:
  1. `nix develop -c python -c "import fastapi; import pypdf; print('OK')"`
  2. Assert: output is `OK`
- Expected Result: Dev shell has all deps
- Evidence: command output captured

Commit: YES
- Message: `build(nix): add flake.nix for Nix packaging`
- Files: `flake.nix`
- Pre-commit: `nix flake check`
17. NixOS Service Module Example
What to do:
- Create example NixOS module for deployment
- Include service configuration options
- Add systemd service definition
- Document usage in README
Must NOT do:
- Do NOT create production-ready module (example only)
Recommended Agent Profile:
- Category: `quick`
- Reason: Example module following standard patterns
- Skills: []
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Sequential
- Blocks: Task 18
- Blocked By: Task 16
References:
- User Spec: NixOS container configuration example
- Librarian Research: NixOS service module pattern
Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: NixOS module syntax is valid
- Tool: Bash
- Steps:
  1. `nix-instantiate --eval -E "import ./nix/module.nix"`
  2. Assert: exit code 0 or a valid Nix evaluation result
- Expected Result: Module parses correctly
- Evidence: Nix output captured

Commit: YES
- Message: `docs(nix): add example NixOS service module`
- Files: `nix/module.nix`
18. README Documentation
What to do:
- Create comprehensive README.md
- Include: Overview, Installation, Usage, API Reference
- Add examples for Docker, Nix, and direct Python usage
- Document all endpoints with curl examples
- Include troubleshooting section
Must NOT do:
- Do NOT duplicate API spec (reference it)
Recommended Agent Profile:
- Category: `writing`
- Reason: Documentation writing
- Skills: []
Parallelization:
- Can Run In Parallel: NO
- Parallel Group: Sequential (last task)
- Blocks: None
- Blocked By: Tasks 16, 17
README Structure:
```markdown
# ZUGFeRD-Service

## Overview

## Quick Start
### Docker
### Nix
### Python (Development)

## API Reference
### GET /health
### POST /extract
### POST /validate

## Configuration
## NixOS Deployment
## Development
## Troubleshooting
## License
```

Acceptance Criteria:
Agent-Executed QA Scenarios:
Scenario: README contains all required sections
- Tool: Bash
- Steps:
  1. `grep -c "## " README.md`
  2. Assert: at least 8 sections
  3. `grep "curl" README.md | wc -l`
  4. Assert: at least 3 curl examples
- Expected Result: Comprehensive documentation
- Evidence: grep output captured

Commit: YES
- Message: `docs: add comprehensive README with installation and usage guide`
- Files: `README.md`
Commit Strategy
| After Task | Message | Files | Verification |
|---|---|---|---|
| 1 | feat(project): initialize ZUGFeRD service | `pyproject.toml`, `src/`, `tests/` | toml parse |
| 2 | test(fixtures): add ZUGFeRD sample PDFs | `tests/fixtures/` | file exists |
| 3 | feat(models): add Pydantic models | `src/models.py` | pytest |
| 4 | feat(extractor): implement extraction | `src/extractor.py` | pytest |
| 5 | feat(pdf-parser): implement PDF parsing | `src/pdf_parser.py` | pytest |
| 6 | feat(utils): add utilities | `src/utils.py` | pytest |
| 7 | feat(validator): implement validation | `src/validator.py` | pytest |
| 8-9 | feat(api): add FastAPI app + health | `src/main.py` | curl |
| 10 | feat(api): add /extract endpoint | `src/main.py` | pytest |
| 11 | feat(api): add /validate endpoint | `src/main.py` | pytest |
| 12 | feat(api): add error handling | `src/main.py` | curl |
| 13 | test(integration): add e2e tests | `tests/` | pytest |
| 14 | build(docker): add Dockerfile | `Dockerfile` | docker build |
| 15 | build(docker): add docker-compose | `docker-compose.yml` | compose up |
| 16 | build(nix): add flake.nix | `flake.nix` | nix build |
| 17 | docs(nix): add NixOS module | `nix/module.nix` | nix eval |
| 18 | docs: add README | `README.md` | grep check |
Success Criteria
Verification Commands
```bash
# All tests pass
pytest tests/ -v
# Expected: all tests pass (exit code 0)

# Docker builds and runs
docker build -t zugferd-service .
docker run -p 5000:5000 zugferd-service &
curl http://localhost:5000/health
# Expected: {"status": "healthy", "version": "1.0.0"}

# Nix builds and runs
nix build .#zugferd-service
./result/bin/zugferd-service &
curl http://localhost:5000/health
# Expected: {"status": "healthy", "version": "1.0.0"}

# Extract endpoint works
PDF_BASE64=$(base64 -w 0 tests/fixtures/sample_en16931.pdf)
curl -X POST http://localhost:5000/extract \
  -H "Content-Type: application/json" \
  -d "{\"pdf_base64\": \"$PDF_BASE64\"}" | jq '.is_zugferd'
# Expected: true
```
Final Checklist
- All 18 tasks completed
- All tests pass (pytest)
- Docker image builds (<500MB)
- Docker container runs and responds
- Nix flake builds without errors
- Nix package runs and responds
- All endpoints return expected responses
- README documents all features
- No "Must NOT Have" items present