+ validation schema
256
AGENTS.md
Normal file
@@ -0,0 +1,256 @@
|
||||
# AGENTS.md - Agent Development Guide
|
||||
|
||||
This document provides context and guidelines for coding agents working on the zugferd-service repository.
|
||||
|
||||
## Project Overview
|
||||
|
||||
ZUGFeRD-Service is a REST API for extracting and validating ZUGFeRD/Factur-X invoice data from PDF files. It is built with FastAPI and requires Python 3.11+.
|
||||
|
||||
**Tech Stack:**
|
||||
- FastAPI >= 0.109.0 (web framework)
|
||||
- Uvicorn >= 0.27.0 (ASGI server)
|
||||
- Pydantic >= 2.5.0 (data validation)
|
||||
- factur-x >= 2.5 (ZUGFeRD/Factur-X library)
|
||||
- pypdf >= 4.0.0 (PDF text extraction)
|
||||
- lxml >= 5.0.0 (XML processing)
|
||||
|
||||
## Commands
|
||||
|
||||
### Development
|
||||
```bash
# Install dependencies
pip install -e .

# Run the service (default: 0.0.0.0:5000)
python -m src.main
zugferd-service  # entry point

# With environment variables
HOST=127.0.0.1 PORT=8000 LOG_LEVEL=DEBUG python -m src.main
```
|
||||
|
||||
### Testing
|
||||
```bash
# Run all tests
pytest

# Run specific test file
pytest tests/test_extractor.py

# Run specific test function
pytest tests/test_api.py::test_health_check

# Run with coverage
pytest --cov=src

# Run with verbose output
pytest -v
```
|
||||
|
||||
### Building
|
||||
```bash
# Docker build
docker build -t zugferd-service .

# Nix build
nix build .#zugferd-service

# Nix development shell
nix develop
```
|
||||
|
||||
## Code Style Guidelines
|
||||
|
||||
### Type Hints (Python 3.11+)
|
||||
Use modern union syntax (`|`) instead of `Optional` or `Union`:
|
||||
```python
# Good
field: str | None
numbers: list[int] | None

# Avoid
from typing import Optional, Union
field: Optional[str]
numbers: Union[list[int], None]
```
|
||||
|
||||
All public functions must have type hints:
|
||||
```python
def extract_zugferd(pdf_bytes: bytes) -> ExtractResponse:
    """Extract ZUGFeRD data from PDF bytes.

    Args:
        pdf_bytes: Raw PDF file content

    Returns:
        ExtractResponse with extraction results
    """
```
|
||||
|
||||
### Imports
|
||||
- Group imports: standard library, third-party, local modules
|
||||
- Use `from typing import Any` only when needed
|
||||
- Avoid star imports (`from module import *`)
|
||||
|
||||
```python
# Standard library
import io
import time
from typing import Any

# Third-party
from fastapi import FastAPI
from lxml import etree
from pydantic import BaseModel

# Local modules
from src.models import ExtractResponse
from src.utils import amounts_match
```
|
||||
|
||||
### Naming Conventions
|
||||
- **Classes**: `PascalCase` (e.g., `ExtractionMeta`, `ValidateRequest`)
|
||||
- **Functions/variables**: `snake_case` (e.g., `extract_text_from_pdf`, `pdf_bytes`)
|
||||
- **Constants**: `SCREAMING_SNAKE_CASE` (e.g., `NAMESPACES`, `UNECE_UNIT_CODES`)
|
||||
- **Private**: `_leading_underscore` (e.g., `_parse_internal`)
|
||||
|
||||
### Pydantic Models
|
||||
All models defined in `src/models.py` using Pydantic v2:
|
||||
```python
from pydantic import BaseModel, Field

class Supplier(BaseModel):
    """Supplier/seller information."""

    name: str = Field(description="Supplier name")
    vat_id: str | None = Field(default=None, description="VAT ID")
```
|
||||
- Use `Field()` for all fields with descriptions
|
||||
- Give optional fields `default=None` in `Field()` in addition to the `| None` type hint; in Pydantic v2 the type hint alone does not make a field optional
|
||||
- Use `default_factory=list` for mutable defaults
|
||||
|
||||
### Error Handling
|
||||
**Custom exceptions** for domain-specific errors:
|
||||
```python
class ExtractionError(Exception):
    """Error during PDF extraction."""

    def __init__(self, error_code: str, message: str, details: str = ""):
        self.error_code = error_code
        self.message = message
        self.details = details
        super().__init__(message)
```
|
||||
|
||||
**FastAPI exception handlers** defined in `src/main.py`:
|
||||
- `ExtractionError` → 400 with error code/message
|
||||
- `HTTPException` → preserves status_code
|
||||
- Generic `Exception` → 500 internal error
|
||||
|
||||
**Raise HTTPException for validation errors:**
|
||||
```python
from fastapi import HTTPException

raise HTTPException(
    status_code=400,
    detail={"error": "invalid_base64", "message": "Invalid base64 encoding"},
)
```
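
As a rough illustration of where such an error could originate, the sketch below decodes a base64 payload strictly and enforces the 10MB upload limit. It uses only the standard library; `PayloadError`, `decode_pdf_base64`, `MAX_PDF_BYTES`, and the `file_too_large` code are hypothetical names that do not come from the repository, and reading "10MB" as 10 MiB is an assumption. In the service, the caller would translate the error into the `HTTPException` shown above.

```python
import base64
import binascii

# Assumption: the documented "10MB" limit is read here as 10 MiB.
MAX_PDF_BYTES = 10 * 1024 * 1024


class PayloadError(ValueError):
    """Hypothetical error carrying a detail dict shaped like the API errors."""

    def __init__(self, error: str, message: str) -> None:
        super().__init__(message)
        self.detail = {"error": error, "message": message}


def decode_pdf_base64(encoded: str) -> bytes:
    """Decode a base64 string strictly and enforce the upload size limit."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        pdf_bytes = base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise PayloadError("invalid_base64", "Invalid base64 encoding") from exc
    if len(pdf_bytes) > MAX_PDF_BYTES:
        # "file_too_large" is an illustrative code, not one defined by the service.
        raise PayloadError("file_too_large", "PDF exceeds the upload size limit")
    return pdf_bytes
```

Keeping the decoding step free of FastAPI imports makes it easy to unit-test on its own.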
|
||||
|
||||
### Docstrings
|
||||
Use Google-style docstrings:
|
||||
```python
from lxml import etree

def parse_supplier(xml_root: etree._Element) -> Supplier:
    """Parse supplier information from XML.

    Args:
        xml_root: XML root element

    Returns:
        Supplier model with parsed data
    """
```
|
||||
|
||||
### XML Parsing
|
||||
Use `lxml.etree` with namespace-aware XPath:
|
||||
```python
from lxml import etree

NAMESPACES = {
    "rsm": "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100",
    "ram": "urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100",
}

# Use namespaces in all XPath queries
name = xml_root.xpath(
    "//ram:ApplicableHeaderTradeAgreement/ram:SellerTradeParty/ram:Name/text()",
    namespaces=NAMESPACES,
)
```
|
||||
|
||||
### Logging
|
||||
Structured JSON logging via custom `JSONFormatter` in `src/main.py`:
|
||||
```python
import logging

logger = logging.getLogger(__name__)
logger.info("Extraction completed", extra={"data": {"profile": "EN16931"}})
```
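
The `JSONFormatter` in `src/main.py` is not reproduced in this guide. The following is a minimal sketch of how a formatter of that kind might pick up the `extra={"data": ...}` payload, using only the standard library. The output keys (`level`, `logger`, `message`, `data`) are assumptions and may differ from the real implementation.

```python
import json
import logging


class JSONFormatter(logging.Formatter):
    """Sketch only: render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Keys passed via `extra=` become attributes on the record,
        # so `extra={"data": {...}}` is available as `record.data`.
        data = getattr(record, "data", None)
        if data is not None:
            payload["data"] = data
        return json.dumps(payload, ensure_ascii=False)


# Build a record the same way `logger.info(..., extra=...)` would.
example_record = logging.getLogger("src.extractor").makeRecord(
    "src.extractor",
    logging.INFO,
    "example.py",
    1,
    "Extraction completed",
    (),
    None,
    extra={"data": {"profile": "EN16931"}},
)
example_line = JSONFormatter().format(example_record)
```

Attaching the formatter to a handler with `handler.setFormatter(JSONFormatter())` would then apply it to every record that handler emits.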
|
||||
|
||||
### Testing
|
||||
- Use `pytest` with `pytest-asyncio` for async tests
|
||||
- Use `TestClient` from `fastapi.testclient` for API tests
|
||||
- Define fixtures in `tests/conftest.py`
|
||||
- Test PDFs in `tests/fixtures/`
|
||||
|
||||
```python
import pytest
from fastapi.testclient import TestClient
from src.main import app

@pytest.fixture
def client():
    return TestClient(app)

def test_health_check(client):
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json()["status"] == "healthy"
```
|
||||
|
||||
### File Structure
|
||||
```
src/
├── __init__.py
├── main.py            # FastAPI app, endpoints, exception handlers
├── models.py          # Pydantic models (all requests/responses)
├── extractor.py       # ZUGFeRD XML extraction logic
├── validator.py       # Invoice validation logic
├── pdf_parser.py      # PDF text extraction
└── utils.py           # Utility functions (constants, helpers)

tests/
├── conftest.py        # Pytest fixtures
├── test_api.py        # API endpoint tests
├── test_extractor.py  # Extraction logic tests
├── test_validator.py  # Validation logic tests
└── fixtures/          # Test PDF files
```
|
||||
|
||||
## Validation Checks
|
||||
|
||||
Four validation checks are supported:
|
||||
1. **pflichtfelder** - Required fields present and non-empty
|
||||
2. **betraege** - Amount calculations correct (tolerance: 0.01)
|
||||
3. **ustid** - VAT ID format (DE, AT, CH)
|
||||
4. **pdf_abgleich** - XML vs PDF text comparison
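
To make the `ustid` check more concrete, here is a minimal sketch of a format check built from the patterns stated in `validator_schema.json` (DE + 9 digits, ATU + 8 digits, CHE + 9 digits + MWST/TVA/IVA). The names `VAT_ID_PATTERNS` and `matches_vat_format` are hypothetical. Returning `False` for other countries and matching without any whitespace or punctuation normalization are assumptions, not documented behavior of the service.

```python
import re

# Patterns taken from the vat_id description in validator_schema.json.
VAT_ID_PATTERNS: dict[str, re.Pattern[str]] = {
    "DE": re.compile(r"DE\d{9}"),
    "AT": re.compile(r"ATU\d{8}"),
    "CH": re.compile(r"CHE\d{9}(MWST|TVA|IVA)"),
}


def matches_vat_format(vat_id: str) -> bool:
    """Check a VAT ID against the DE/AT/CH format rules.

    Assumption: IDs from countries without a pattern are reported as
    not matching; the real validator may treat them differently.
    """
    pattern = VAT_ID_PATTERNS.get(vat_id[:2])
    if pattern is None:
        return False
    # fullmatch ensures no trailing or leading characters slip through.
    return pattern.fullmatch(vat_id) is not None
```

The check is purely syntactic: it says nothing about whether the ID is actually registered.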
|
||||
|
||||
## Environment Variables
|
||||
- `HOST` (default: `0.0.0.0`)
|
||||
- `PORT` (default: `5000`)
|
||||
- `LOG_LEVEL` (default: `INFO`)
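
One way these variables and their defaults could be read is sketched below. The `Settings` dataclass and `load_settings` function are hypothetical and only illustrate the documented defaults; the actual startup code in `src/main.py` may be organized differently.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the three documented variables."""

    host: str
    port: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read HOST, PORT, and LOG_LEVEL, falling back to the documented defaults."""
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        # Environment values are strings, so the port is converted explicitly.
        port=int(env.get("PORT", "5000")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing the environment in as a mapping keeps the function testable without modifying `os.environ`.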
|
||||
|
||||
## Additional Notes
|
||||
- Python 3.11+ required
|
||||
- No type suppression (`# type: ignore`) allowed
|
||||
- File size limit: 10MB for PDF uploads
|
||||
- Returns warnings (not errors) for non-critical issues
|
||||
- Uses Decimal rounding with ROUND_HALF_UP for monetary values
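
The rounding rule and the 0.01 tolerance used by the `betraege` check can be illustrated as follows. This is a sketch under assumptions: `round_money` is a hypothetical helper, and the signature of `amounts_match` shown here is a guess, since the real function in `src/utils.py` is not reproduced in this guide.

```python
from decimal import ROUND_HALF_UP, Decimal

CENT = Decimal("0.01")
TOLERANCE = Decimal("0.01")  # tolerance stated for the betraege check


def round_money(value: Decimal) -> Decimal:
    """Round to two decimal places, with halves always rounded up."""
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


def amounts_match(
    expected: Decimal, actual: Decimal, tolerance: Decimal = TOLERANCE
) -> bool:
    """Hypothetical signature: compare two rounded amounts within a tolerance."""
    return abs(round_money(expected) - round_money(actual)) <= tolerance


# Verifying an extracted line total against quantity × unit price.
quantity = Decimal("10")
unit_price = Decimal("100.00")
extracted_line_total = Decimal("1000.00")
line_ok = amounts_match(quantity * unit_price, extracted_line_total)
```

Constructing `Decimal` values from strings rather than floats avoids binary floating-point error: `Decimal("2.675")` rounds to `2.68` under `ROUND_HALF_UP`, whereas the float `2.675` is stored slightly below that value.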
|
||||
13
flake.nix
@@ -6,9 +6,12 @@
|
||||
flake-utils.url = "github:numtide/flake-utils";
|
||||
};
|
||||
|
||||
outputs = { self, nixpkgs, flake-utils }:
|
||||
flake-utils.lib.eachDefaultSystem (system:
|
||||
let
|
||||
outputs = {
|
||||
self,
|
||||
nixpkgs,
|
||||
flake-utils,
|
||||
}:
|
||||
flake-utils.lib.eachDefaultSystem (system: let
|
||||
pkgs = nixpkgs.legacyPackages.${system};
|
||||
pythonPackages = pkgs.python311Packages;
|
||||
|
||||
@@ -33,8 +36,7 @@
|
||||
pythonRelaxDeps = true;
|
||||
|
||||
meta = {
|
||||
description =
|
||||
"Python library to generate and read Factur-X invoices";
|
||||
description = "Python library to generate and read Factur-X invoices";
|
||||
license = pkgs.lib.licenses.mit;
|
||||
};
|
||||
};
|
||||
@@ -89,6 +91,7 @@
|
||||
pytest
|
||||
pytest-asyncio
|
||||
httpx
|
||||
factur-x
|
||||
# factur-x - install via pip if not available
|
||||
]))
|
||||
pkgs.python311Packages.pip
|
||||
|
||||
358
validator_schema.json
Normal file
@@ -0,0 +1,358 @@
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"$id": "https://example.com/zugferd-validator-schema.json",
|
||||
"title": "ZUGFeRD Invoice Validator Schema",
|
||||
"description": "JSON schema for /validate endpoint of ZUGFeRD service. Use this to structure OCR-extracted invoice data for validation.\n\n⚠️ CRITICAL INSTRUCTIONS FOR LLM:\n\n1. YOUR ROLE IS EXTRACTOR, NOT CALCULATOR\n2. ONLY extract values that EXPLICITLY appear in the document\n3. DO NOT perform ANY arithmetic calculations\n4. DO NOT infer or fabricate ANY values\n5. If a value is not visible in the document, set it to null (for optional fields) or indicate it's missing\n6. The validation service WILL verify all calculations - your job is only to extract what's written\n\nExample of WRONG behavior: quantity=10, unit_price=100.0 → line_total=1000.0 (CALCULATED - WRONG!)\nExample of CORRECT behavior: line_total=1000.0 appears on invoice → line_total=1000.0 (EXTRACTED - CORRECT!)\n\nIf the document shows:\n- \"10 × 100.00 = 1,000.00\" → Extract: quantity=10.0, unit_price=100.0, line_total=1000.0\n- \"10 × 100.00\" (no total shown) → Extract: quantity=10.0, unit_price=100.0, line_total=null (DO NOT calculate!)\n\nThe validator will check required fields, amount calculations, VAT ID formats, and PDF consistency.",
|
||||
"type": "object",
|
||||
"required": ["xml_data", "checks"],
|
||||
"properties": {
|
||||
"xml_data": {
|
||||
"type": "object",
|
||||
"description": "⚠️ EXTRACT ONLY - Extract all data EXACTLY as it appears in the invoice document. NEVER calculate, infer, or fabricate ANY value. The validation service will perform all necessary calculations to verify the data. If a value is not visible in the document, do NOT add it.",
|
||||
"required": [
|
||||
"invoice_number",
|
||||
"invoice_date",
|
||||
"supplier",
|
||||
"buyer",
|
||||
"line_items",
|
||||
"totals"
|
||||
],
|
||||
"properties": {
|
||||
"invoice_number": {
|
||||
"type": "string",
|
||||
"description": "Invoice reference number (e.g., RE-2024-001, INV-12345)",
|
||||
"examples": ["RE-2024-001", "INV-12345", "2024/001"]
|
||||
},
|
||||
"invoice_date": {
|
||||
"type": "string",
|
||||
"description": "Invoice issue date in ISO 8601 format (YYYY-MM-DD)",
|
||||
"pattern": "^\\d{4}-\\d{2}-\\d{2}$",
|
||||
"examples": ["2024-02-01", "2025-01-15"]
|
||||
},
|
||||
"due_date": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Payment due date in ISO 8601 format (YYYY-MM-DD). Optional but recommended.",
|
||||
"pattern": "^\\d{4}-\\d{2}-\\d{2}$",
|
||||
"examples": ["2024-02-28", "2025-02-15"]
|
||||
},
|
||||
"supplier": {
|
||||
"type": "object",
|
||||
"description": "Supplier/seller information (the entity sending the invoice)",
|
||||
"required": ["name"],
|
||||
"properties": {
|
||||
"name": {
|
||||
"type": "string",
|
||||
"description": "Legal company name of the supplier (required)",
|
||||
"examples": ["Acme Corp GmbH", "Test Supplier AG", "Example Services Ltd."]
|
||||
},
|
||||
"street": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Street address",
|
||||
"examples": ["Main Street 123", "Musterstraße 45"]
|
||||
},
|
||||
"postal_code": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Postal/ZIP code",
|
||||
"examples": ["12345", "D-12345", "10115"]
|
||||
},
|
||||
"city": {
|
||||
"type": ["string", "null"],
|
||||
"description": "City name",
|
||||
"examples": ["Berlin", "München", "Hamburg"]
|
||||
},
|
||||
"country": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Two-letter ISO country code",
|
||||
"pattern": "^[A-Z]{2}$",
|
||||
"examples": ["DE", "AT", "CH", "FR"]
|
||||
},
|
||||
"vat_id": {
|
||||
"type": ["string", "null"],
|
||||
"description": "VAT identification number. Format is validated: DE + 9 digits (Germany), ATU + 8 digits (Austria), CHE + 9 digits + MWST/TVA/IVA (Switzerland).",
|
||||
"examples": [
|
||||
"DE123456789",
|
||||
"ATU12345678",
|
||||
"CHE123456789MWST"
|
||||
]
|
||||
},
|
||||
"email": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Email address for billing inquiries",
|
||||
"format": "email",
|
||||
"examples": ["billing@acme-corp.de", "rechnung@example.com"]
|
||||
}
|
||||
}
|
||||
},
|
||||
"buyer": {
|
||||
"type": "object",
|
||||
"description": "Buyer/customer information (the entity receiving the invoice)",
|
||||
"required": ["name"],
|
||||
"properties": {
|
||||
"name": {
|
||||
"type": "string",
|
||||
"description": "Legal company name of the buyer (required)",
|
||||
"examples": ["Customer GmbH", "Buyer Company AG", "Client Ltd."]
|
||||
},
|
||||
"street": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Street address",
|
||||
"examples": ["Market Square 5", "Kundenweg 10"]
|
||||
},
|
||||
"postal_code": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Postal/ZIP code",
|
||||
"examples": ["54321", "D-54321", "80331"]
|
||||
},
|
||||
"city": {
|
||||
"type": ["string", "null"],
|
||||
"description": "City name",
|
||||
"examples": ["Hamburg", "Frankfurt", "München"]
|
||||
},
|
||||
"country": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Two-letter ISO country code",
|
||||
"pattern": "^[A-Z]{2}$",
|
||||
"examples": ["DE", "AT", "CH", "NL"]
|
||||
},
|
||||
"vat_id": {
|
||||
"type": ["string", "null"],
|
||||
"description": "VAT identification number. Same format validation as supplier.",
|
||||
"examples": [
|
||||
"DE987654321",
|
||||
"ATU87654321",
|
||||
"CHE987654321TVA"
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
"line_items": {
|
||||
"type": "array",
|
||||
"description": "List of line items (products or services billed). Extract each line exactly as it appears on the invoice. The validation service will verify calculations. DO NOT perform any arithmetic yourself.",
|
||||
"minItems": 1,
|
||||
"items": {
|
||||
"type": "object",
|
||||
"required": ["position", "description", "quantity", "unit", "unit_price", "line_total"],
|
||||
"properties": {
|
||||
"position": {
|
||||
"type": "integer",
|
||||
"description": "Sequential line number (1, 2, 3, ...)",
|
||||
"minimum": 1,
|
||||
"examples": [1, 2, 3]
|
||||
},
|
||||
"article_number": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Supplier's article/SKU number",
|
||||
"examples": ["ART-001", "SKU-12345", "PROD-999"]
|
||||
},
|
||||
"article_number_buyer": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Buyer's internal article number (if different from supplier's)",
|
||||
"examples": ["BUY-001", "MAT-456"]
|
||||
},
|
||||
"description": {
|
||||
"type": "string",
|
||||
"description": "Item description (required). Should clearly describe the product or service.",
|
||||
"examples": ["Consulting Services", "Office Chair", "Software License - Annual"]
|
||||
},
|
||||
"quantity": {
|
||||
"type": "number",
|
||||
"description": "Quantity ordered/delivered (required, must be > 0)",
|
||||
"minimum": 0.01,
|
||||
"examples": [1.0, 10.0, 2.5]
|
||||
},
|
||||
"unit": {
|
||||
"type": "string",
|
||||
"description": "Unit of measurement (required). Use human-readable name, not code.",
|
||||
"examples": ["Stück", "Stunde", "Kilogramm", "Meter", "Set"]
|
||||
},
|
||||
"unit_price": {
|
||||
"type": "number",
|
||||
"description": "Price per unit (required, must be > 0)",
|
||||
"minimum": 0.01,
|
||||
"examples": [100.0, 49.99, 12.50]
|
||||
},
|
||||
"line_total": {
|
||||
"type": ["number", "null"],
|
||||
"description": "⚠️ EXTRACT ONLY - Do NOT calculate. Extract the line total if it EXPLICITLY appears on the invoice (e.g., \"10 × 100.00 = 1,000.00\" shows 1,000.00). If the document shows quantity and unit_price but NO line total, set this to null. The validation service will verify the calculation. NEVER compute quantity × unit_price yourself.",
|
||||
"minimum": 0.01,
|
||||
"examples": [100.0, 499.90, 31.25]
|
||||
},
|
||||
"vat_rate": {
|
||||
"type": ["number", "null"],
|
||||
"description": "VAT rate percentage for this line (optional). Extract if shown on the line item (e.g., \"19%\" or \"7% USt\").",
|
||||
"examples": [19.0, 7.0, 0.0]
|
||||
},
|
||||
"vat_amount": {
|
||||
"type": ["number", "null"],
|
||||
"description": "⚠️ EXTRACT ONLY - Do NOT calculate. Extract the VAT amount for this line if it EXPLICITLY appears on the invoice. If not shown, set to null. NEVER compute line_total × (vat_rate / 100) yourself.",
|
||||
"minimum": 0,
|
||||
"examples": [19.0, 6.93, 0.0]
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"totals": {
|
||||
"type": "object",
|
||||
"description": "⚠️ EXTRACT ONLY - Extract totals from the invoice's totals section (usually at bottom of document). DO NOT calculate from line items. Look for: \"Netto:\", \"Zwischensumme\", \"MwSt\", \"USt\", \"Brutto\", \"Rechnungsbetrag\" labels and extract the values shown. The validation service will verify all calculations match.",
|
||||
"required": ["line_total_sum", "net", "vat_total", "gross"],
|
||||
"properties": {
|
||||
"line_total_sum": {
|
||||
"type": "number",
|
||||
"description": "⚠️ EXTRACT ONLY - Look for \"Zwischensumme\", \"Summe Positionen\", \"Subtotal\" or similar label. Extract the value shown. DO NOT sum line items yourself.",
|
||||
"examples": [1000.0, 2499.90, 150.0]
|
||||
},
|
||||
"net": {
|
||||
"type": "number",
|
||||
"description": "⚠️ EXTRACT ONLY - Look for \"Netto\", \"Net amount\", \"Summe netto\" label and extract that value. DO NOT calculate.",
|
||||
"examples": [1000.0, 2499.90, 150.0]
|
||||
},
|
||||
"vat_total": {
|
||||
"type": "number",
|
||||
"description": "⚠️ EXTRACT ONLY - Look for \"MwSt\", \"USt\", \"VAT total\", \"Steuer\" label and extract that value. DO NOT calculate.",
|
||||
"examples": [190.0, 449.98, 28.5]
|
||||
},
|
||||
"gross": {
|
||||
"type": "number",
|
||||
"description": "⚠️ EXTRACT ONLY - Look for \"Brutto\", \"Rechnungsbetrag\", \"Total\", \"Gesamtbetrag\" label and extract that value. DO NOT add net + vat_total yourself.",
|
||||
"examples": [1190.0, 2949.88, 178.5]
|
||||
},
|
||||
"vat_breakdown": {
|
||||
"type": "array",
|
||||
"description": "⚠️ EXTRACT ONLY - Look for a VAT breakdown table (e.g., \"19%\" row showing base and amount). Only extract rows that EXPLICITLY appear in the document. If no VAT breakdown table exists, set to empty array []. DO NOT calculate any values yourself.",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"required": ["rate", "base", "amount"],
|
||||
"properties": {
|
||||
"rate": {
|
||||
"type": "number",
|
||||
"description": "Extract the VAT rate shown (e.g., \"19%\", \"7%\")",
|
||||
"examples": [19.0, 7.0, 0.0]
|
||||
},
|
||||
"base": {
|
||||
"type": "number",
|
||||
"description": "Extract the tax base amount shown for this rate",
|
||||
"examples": [1000.0, 499.90, 150.0]
|
||||
},
|
||||
"amount": {
|
||||
"type": "number",
|
||||
"description": "Extract the VAT amount shown for this rate. DO NOT calculate base × rate / 100.",
|
||||
"examples": [190.0, 34.99, 0.0]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"currency": {
|
||||
"type": "string",
|
||||
"description": "Three-letter ISO currency code (default: EUR)",
|
||||
"default": "EUR",
|
||||
"pattern": "^[A-Z]{3}$",
|
||||
"examples": ["EUR", "USD", "GBP"]
|
||||
},
|
||||
"payment_terms": {
|
||||
"type": ["object", "null"],
|
||||
"description": "Payment information (optional)",
|
||||
"properties": {
|
||||
"iban": {
|
||||
"type": ["string", "null"],
|
||||
"description": "International Bank Account Number",
|
||||
"examples": ["DE89370400440532013000", "AT611904300234573201"]
|
||||
},
|
||||
"bic": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Bank Identifier Code / SWIFT",
|
||||
"examples": ["COBADEFFXXX", "BKAUATWW"]
|
||||
},
|
||||
"account_holder": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Name of the account holder",
|
||||
"examples": ["Acme Corp GmbH", "Test Supplier AG"]
|
||||
}
|
||||
}
|
||||
},
|
||||
"notes": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Additional notes or comments from the invoice",
|
||||
"examples": ["Payment due within 30 days", "Thank you for your business"]
|
||||
}
|
||||
}
|
||||
},
|
||||
"pdf_text": {
|
||||
"type": ["string", "null"],
|
||||
"description": "Raw text extracted from the PDF document. Used for 'pdf_abgleich' validation check to compare XML data with PDF text.",
|
||||
"examples": ["Invoice RE-2024-001\nTotal: 1190.00 EUR\n..."]
|
||||
},
|
||||
"checks": {
|
||||
"type": "array",
|
||||
"description": "List of validation checks to run. Each check validates different aspects of the invoice data.",
|
||||
"minItems": 1,
|
||||
"items": {
|
||||
"type": "string",
|
||||
"enum": ["pflichtfelder", "betraege", "ustid", "pdf_abgleich"],
|
||||
"description": "Validation check type. Choose one or more:\n- pflichtfelder: Validates all required fields are present and non-empty (invoice_number, invoice_date, supplier.name, supplier.vat_id, buyer.name, totals fields, line_items)\n- betraege: Validates all monetary calculations are correct (line totals, VAT breakdown, totals)\n- ustid: Validates VAT ID format for DE, AT, CH\n- pdf_abgleich: Compares XML data against PDF text (requires pdf_text)"
|
||||
}
|
||||
}
|
||||
},
|
||||
"examples": [
|
||||
{
|
||||
"xml_data": {
|
||||
"invoice_number": "RE-2024-001",
|
||||
"invoice_date": "2024-02-01",
|
||||
"due_date": "2024-02-28",
|
||||
"supplier": {
|
||||
"name": "Acme Corp GmbH",
|
||||
"street": "Main Street 123",
|
||||
"postal_code": "12345",
|
||||
"city": "Berlin",
|
||||
"country": "DE",
|
||||
"vat_id": "DE123456789",
|
||||
"email": "billing@acme-corp.de"
|
||||
},
|
||||
"buyer": {
|
||||
"name": "Customer GmbH",
|
||||
"street": "Market Square 5",
|
||||
"postal_code": "54321",
|
||||
"city": "Hamburg",
|
||||
"country": "DE",
|
||||
"vat_id": "DE987654321"
|
||||
},
|
||||
"line_items": [
|
||||
{
|
||||
"position": 1,
|
||||
"article_number": "ART-001",
|
||||
"description": "Consulting Services",
|
||||
"quantity": 10.0,
|
||||
"unit": "Stunde",
|
||||
"unit_price": 100.0,
|
||||
"line_total": 1000.0,
|
||||
"vat_rate": 19.0,
|
||||
"vat_amount": 190.0
|
||||
}
|
||||
],
|
||||
"totals": {
|
||||
"line_total_sum": 1000.0,
|
||||
"net": 1000.0,
|
||||
"vat_total": 190.0,
|
||||
"gross": 1190.0,
|
||||
"vat_breakdown": [
|
||||
{
|
||||
"rate": 19.0,
|
||||
"base": 1000.0,
|
||||
"amount": 190.0
|
||||
}
|
||||
]
|
||||
},
|
||||
"currency": "EUR",
|
||||
"payment_terms": {
|
||||
"iban": "DE89370400440532013000",
|
||||
"bic": "COBADEFFXXX",
|
||||
"account_holder": "Acme Corp GmbH"
|
||||
},
|
||||
"notes": "Payment due within 30 days"
|
||||
},
|
||||
"pdf_text": "Invoice RE-2024-001\nAcme Corp GmbH\n...\nTotal: 1190.00 EUR",
|
||||
"checks": ["pflichtfelder", "betraege", "ustid", "pdf_abgleich"]
|
||||
}
|
||||
]
|
||||
}
|
||||