feat(api): implement /extract and /validate endpoints with error handling

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
m3tm3re
2026-02-04 20:07:35 +01:00
parent 4791c91f06
commit 867b47efd0
5 changed files with 263 additions and 7 deletions


@@ -424,3 +424,87 @@ async def health_check() -> HealthResponse:
- Import models in `src/main.py` using `from src.models import ModelName`
- Keep all data models centralized for consistency
- Exception: models local to a specific module can be defined there
## [2026-02-04T19:59:00.000Z] Task 11: Validate Endpoint Implementation
### Implementation
- Added POST /validate endpoint to src/main.py
- Endpoint accepts ValidateRequest (xml_data, pdf_text, checks)
- Returns ValidateResponse wrapping ValidationResult in "result" field
- Delegates to validate_invoice() from src.validator module
### Key Code Pattern
```python
@app.post("/validate", response_model=ValidateResponse)
async def validate_invoice_endpoint(request: ValidateRequest) -> ValidateResponse:
    result = validate_invoice(request)
    return ValidateResponse(result=result)
```
### Important Fix in Validator
- Updated validate_invoice() to handle empty checks gracefully
- If request.checks is empty, return early with ValidationResult(is_valid=True, ...)
- This prevents ValidationError when xml_data is empty but no checks need to run
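The early return can be sketched as below. This is a minimal standalone illustration, not the project's actual validator: the `ValidateRequest` and `ValidationResult` dataclasses are simplified stand-ins for the models in `src.models`, the field types are assumptions, and `run_checks` is a hypothetical placeholder for the real check logic.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ValidateRequest:
    # Simplified stand-in for the real request model (field types are assumed).
    xml_data: Dict[str, Any] = field(default_factory=dict)
    pdf_text: Optional[str] = None
    checks: List[str] = field(default_factory=list)


@dataclass
class ValidationResult:
    # Simplified stand-in; the real model carries more fields.
    is_valid: bool
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)


def run_checks(request: ValidateRequest) -> ValidationResult:
    # Hypothetical placeholder: reports an error when there is no invoice data.
    if not request.xml_data:
        return ValidationResult(is_valid=False, errors=["xml_data is empty"])
    return ValidationResult(is_valid=True)


def validate_invoice(request: ValidateRequest) -> ValidationResult:
    # No checks requested: return early instead of touching empty xml_data.
    if not request.checks:
        return ValidationResult(is_valid=True)
    return run_checks(request)
```

With this guard, a request with an empty `checks` list succeeds even when `xml_data` is empty, while a request that names a check still goes through the normal validation path.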
### Testing
- test_validate_pflichtfelder: Tests valid invoice with pflichtfelder check
- test_validate_empty_checks: Tests empty checks list returns 200
- Both tests pass
### Validation Response Structure
Response contains nested "result" field:
```json
{
  "result": {
    "is_valid": false,
    "errors": [...],
    "warnings": [...],
    "summary": {...},
    "validation_time_ms": 45
  }
}
```
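Because the payload is wrapped, a client has to read through the `result` key. The sketch below shows one way to do that with the standard library. `unwrap_result` is a hypothetical helper, and the sample payload uses empty placeholders for the list and summary contents, which these notes do not specify.

```python
import json
from typing import Any, Dict


def unwrap_result(body: str) -> Dict[str, Any]:
    """Parse a /validate response body and return the inner "result" object."""
    payload = json.loads(body)
    return payload["result"]


# Illustrative payload only; errors, warnings and summary are placeholders.
sample = json.dumps({
    "result": {
        "is_valid": False,
        "errors": [],
        "warnings": [],
        "summary": {},
        "validation_time_ms": 45,
    }
})

result = unwrap_result(sample)
print(result["is_valid"], result["validation_time_ms"])  # prints: False 45
```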
### Docstring Justification
- Endpoint docstring provides API documentation for OpenAPI/Swagger
- Describes args (request type) and return (response type)
- Follows existing pattern from health_check endpoint
## Task 12: HTTPException Handler (2026-02-04)
### Pattern: Custom FastAPI Exception Handlers
By default, FastAPI serializes an `HTTPException` as `{"detail": ...}`, so a structured error ends up nested as `{"detail": {...}}`, which does not match the flat error format in the API spec.
**Solution**: Add a custom exception handler for `HTTPException` that returns a flat JSON structure.
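```python
from fastapi import HTTPException, Request
from fastapi.responses import JSONResponse


@app.exception_handler(HTTPException)
async def http_exception_handler(request: Request, exc: HTTPException):
    # Structured detail: lift "error" and "message" to the top level.
    if isinstance(exc.detail, dict) and "error" in exc.detail:
        return JSONResponse(
            status_code=exc.status_code,
            content={
                "error": exc.detail.get("error"),
                "message": exc.detail.get("message"),
            },
        )
    # Any other detail (e.g. a plain string): use a generic error code.
    return JSONResponse(
        status_code=exc.status_code,
        content={
            "error": "http_error",
            "message": str(exc.detail),
        },
    )
```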
**Key Implementation Details**:
1. Handler checks if `exc.detail` is a dict with "error" key
2. If structured error (dict with error/message), extracts to flat format
3. Falls back to generic `{"error": "http_error", "message": str(exc.detail)}` for other cases
4. Preserves original status code from HTTPException
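To see what the two branches produce, the handler's branching on `exc.detail` can be reproduced as a plain function without FastAPI. This is only a sketch: `flatten_error` is a hypothetical helper name, and the `invalid_base64` code and its message are made-up example values.

```python
from typing import Any, Dict


def flatten_error(detail: Any) -> Dict[str, Any]:
    """Mirror the handler's branching on exc.detail, without FastAPI."""
    if isinstance(detail, dict) and "error" in detail:
        # Structured detail: keep the caller's error code and message.
        return {"error": detail.get("error"), "message": detail.get("message")}
    # Anything else is stringified under a generic code.
    return {"error": "http_error", "message": str(detail)}


structured = flatten_error({"error": "invalid_base64", "message": "Could not decode input"})
fallback = flatten_error("Not Found")

print(structured)  # {'error': 'invalid_base64', 'message': 'Could not decode input'}
print(fallback)    # {'error': 'http_error', 'message': 'Not Found'}
```

One consequence of using `.get("message")` is that a structured detail without a `message` key yields `"message": null` in the response rather than raising.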
**Error Format Consistency**:
- All error responses now use flat structure: `{"error": "code", "message": "..."}`
- ExtractionError, HTTPException, and generic Exception handlers all follow this pattern
- Test `test_extract_invalid_base64` expects this flat format