Backend Architecture Documentation
Overview
The backend has been refactored from a monolithic structure into a modular, service-oriented architecture that emphasizes:
- Separation of Concerns: Clear boundaries between layers (API, Service, Repository, Model)
- Dependency Injection: Dynamic service resolution and configuration
- Extensibility: Plugin-based system for PDF variants and providers
- Maintainability: Organized code structure that follows the single responsibility principle
- Scalability: Stateless services with proper connection pooling
Directory Structure
backend/
├── src/
│ ├── api/ # API Layer
│ │ ├── routes/ # FastAPI routers
│ │ ├── middleware/ # Custom middleware
│ │ └── dependencies/ # Dependency injection helpers
│ │
│ ├── services/ # Business Logic Layer
│ │ ├── base.py # Base service classes
│ │ ├── application.py # Application business logic
│ │ ├── pdf.py # PDF processing service
│ │ └── auth.py # Authentication service
│ │
│ ├── repositories/ # Data Access Layer
│ │ ├── base.py # Base repository pattern
│ │ ├── application.py # Application repository
│ │ └── attachment.py # Attachment repository
│ │
│ ├── models/ # Database Models
│ │ ├── base.py # Base model with mixins
│ │ └── application.py # Application entities
│ │
│ ├── providers/ # Dynamic Providers
│ │ ├── pdf_qsm.py # QSM PDF variant provider
│ │ └── pdf_vsm.py # VSM PDF variant provider
│ │
│ ├── config/ # Configuration Management
│ │ └── settings.py # Centralized settings with Pydantic
│ │
│ ├── core/ # Core Infrastructure
│ │ ├── container.py # Dependency injection container
│ │ └── database.py # Database management
│ │
│ └── utils/ # Utility Functions
│ └── helpers.py # Common utilities
Architecture Layers
1. API Layer (api/)
Responsibility: HTTP request/response handling, validation, routing
- Routes: Modular FastAPI routers for different domains
- Middleware: Cross-cutting concerns (rate limiting, logging, error handling)
- Dependencies: FastAPI dependency injection functions
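# Example: api/routes/applications.py
from fastapi import APIRouter, Depends

router = APIRouter()

@router.post("/", response_model=ApplicationResponse)
async def create_application(
    data: ApplicationCreate,
    service: ApplicationService = Depends(get_application_service),
):
    # CRUDService.create is declared as a plain (non-async) method, so it is not awaited.
    # data.dict() is the Pydantic v1 spelling; Pydantic v2 prefers data.model_dump().
    return service.create(data.dict())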
2. Service Layer (services/)
Responsibility: Business logic, orchestration, validation rules
- Encapsulates all business rules and workflows
- Coordinates between repositories and external services
- Handles complex validations and transformations
- Stateless and testable
# Example: services/application.py
class ApplicationService(CRUDService[Application]):
    def submit_application(self, id: int) -> Application:
        # Business logic for submission: load, validate, then persist the new status
        app = self.repository.get_or_404(id)
        self._validate_submission(app)
        app.status = ApplicationStatus.SUBMITTED
        return self.repository.update(app)
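The source does not show what _validate_submission checks. As a minimal sketch only, assuming a hypothetical DRAFT status and two illustrative rules (only drafts may be submitted, and the payload must not be empty), the guard might look like this. The dataclass stands in for the ORM model so the example runs without a database.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict


class ApplicationStatus(Enum):
    DRAFT = "draft"          # hypothetical status, not named in the source
    SUBMITTED = "submitted"


class BusinessRuleViolation(Exception):
    """Stand-in for the exception of the same name in the error hierarchy."""


@dataclass
class Application:
    """Plain dataclass standing in for the SQLAlchemy model."""
    status: ApplicationStatus = ApplicationStatus.DRAFT
    payload: Dict[str, Any] = field(default_factory=dict)


def validate_submission(app: Application) -> None:
    # Assumed rules: only drafts may be submitted, and the payload must not be empty.
    if app.status is not ApplicationStatus.DRAFT:
        raise BusinessRuleViolation(f"cannot submit from status {app.status.value}")
    if not app.payload:
        raise BusinessRuleViolation("payload is empty")


def submit(app: Application) -> Application:
    validate_submission(app)
    app.status = ApplicationStatus.SUBMITTED
    return app
```

Keeping the rule check in its own function means it can be unit tested without touching the repository.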
3. Repository Layer (repositories/)
Responsibility: Data access abstraction, CRUD operations
- Implements repository pattern for database access
- Provides clean abstraction over SQLAlchemy
- Handles query building and optimization
- Transaction management
# Example: repositories/application.py
from typing import List

class ApplicationRepository(BaseRepository[Application]):
    def find_by_status(self, status: ApplicationStatus) -> List[Application]:
        # Return every application whose status matches
        return self.query().filter(
            Application.status == status
        ).all()
4. Model Layer (models/)
Responsibility: Data structure definition, ORM mapping
- SQLAlchemy models with proper relationships
- Base classes with common functionality (timestamps, soft delete)
- Model mixins for reusable behavior
- Business entity representation
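# Example: models/application.py
from sqlalchemy import JSON, Column, String
from sqlalchemy import Enum as SQLEnum

class Application(ExtendedBaseModel):
    __tablename__ = "applications"

    pa_id = Column(String(64), unique=True, index=True)  # public application ID
    status = Column(SQLEnum(ApplicationStatus))
    payload = Column(JSON)  # form data as submitted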
Key Components
Dependency Injection Container
The system uses a custom dependency injection container for managing service lifecycles:
# core/container.py
from typing import Type

class Container:
    def register_service(self, name: str, service_class: Type[BaseService]) -> None:
        # Register service with automatic dependency resolution
        ...

    def get_service(self, name: str) -> BaseService:
        # Retrieve service instance with dependencies injected
        ...
Benefits:
- Loose coupling between components
- Easy testing with mock services
- Dynamic service configuration
- Singleton pattern support
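The container's internals are not shown in the source. The following is a toy sketch of how name-based registration, automatic dependency resolution, and singleton caching could fit together. It assumes that a constructor parameter name matches the name a dependency was registered under, which is a convention invented for this example, and it does not detect circular dependencies.

```python
import inspect
from typing import Any, Dict, Type


class Container:
    """Toy container: resolves constructor parameters by registered name."""

    def __init__(self) -> None:
        self._classes: Dict[str, Type[Any]] = {}
        self._instances: Dict[str, Any] = {}

    def register_service(self, name: str, service_class: Type[Any]) -> None:
        self._classes[name] = service_class

    def get_service(self, name: str) -> Any:
        # Singleton behaviour: build once, then reuse the cached instance.
        if name not in self._instances:
            cls = self._classes[name]
            params = inspect.signature(cls.__init__).parameters
            # Inject every parameter whose name is itself a registered service.
            deps = {p: self.get_service(p) for p in params if p in self._classes}
            self._instances[name] = cls(**deps)
        return self._instances[name]


class ApplicationRepository:
    pass


class ApplicationService:
    def __init__(self, application_repository: ApplicationRepository) -> None:
        self.repository = application_repository


container = Container()
container.register_service("application_repository", ApplicationRepository)
container.register_service("application_service", ApplicationService)
service = container.get_service("application_service")
```

In a test, registering a fake class under "application_repository" before the first get_service call swaps the dependency without touching ApplicationService.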
Configuration Management
Centralized configuration using Pydantic Settings:
# config/settings.py
class Settings(BaseSettings):
    database: DatabaseSettings
    security: SecuritySettings
    rate_limit: RateLimitSettings
    storage: StorageSettings
    pdf: PDFSettings
    app: ApplicationSettings
Features:
- Environment variable support
- Type validation
- Default values
- Configuration file support (JSON/YAML)
- Dynamic override capability
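To illustrate the environment-variable support, defaults, and type validation listed above without depending on Pydantic, here is a dependency-free sketch using a standard-library dataclass. It is a stand-in for the idea, not the project's Settings class. The variable names come from the Environment Variables section, while the field names and the from_env helper are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DatabaseSettings:
    host: str = "localhost"
    port: int = 3306
    name: str = "stupa"

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "DatabaseSettings":
        # Missing variables fall back to the defaults above.
        # int() rejects non-numeric ports, a crude stand-in for type validation.
        return cls(
            host=env.get("MYSQL_HOST", cls.host),
            port=int(env.get("MYSQL_PORT", cls.port)),
            name=env.get("MYSQL_DB", cls.name),
        )
```

Passing a mapping explicitly, for example DatabaseSettings.from_env({"MYSQL_PORT": "3307"}), makes the loader easy to test without mutating the process environment.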
Provider Pattern for PDF Variants
Extensible system for handling different PDF types:
# providers/pdf_qsm.py
from typing import Dict

class QSMProvider(PDFVariantProvider):
    def parse_pdf_fields(self, fields: Dict) -> Dict:
        # QSM-specific parsing logic
        ...

    def map_payload_to_fields(self, payload: Dict) -> Dict:
        # QSM-specific field mapping
        ...
Advantages:
- Easy to add new PDF variants
- Variant-specific validation rules
- Dynamic provider registration
- Clean separation of variant logic
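Dynamic provider registration can be sketched with a small registry and a class decorator. The register_provider and get_provider helpers and the field-name mapping are hypothetical and chosen only for illustration. The source does not describe how providers are actually registered.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class PDFVariantProvider(ABC):
    @abstractmethod
    def map_payload_to_fields(self, payload: Dict) -> Dict:
        """Translate an application payload into PDF form field values."""


_PROVIDERS: Dict[str, Type[PDFVariantProvider]] = {}


def register_provider(variant: str) -> Callable[[Type[PDFVariantProvider]], Type[PDFVariantProvider]]:
    # Class decorator that records the provider under its variant name.
    def decorator(cls: Type[PDFVariantProvider]) -> Type[PDFVariantProvider]:
        _PROVIDERS[variant] = cls
        return cls
    return decorator


def get_provider(variant: str) -> PDFVariantProvider:
    if variant not in _PROVIDERS:
        raise ValueError(f"unknown PDF variant: {variant}")
    return _PROVIDERS[variant]()


@register_provider("qsm")
class QSMProvider(PDFVariantProvider):
    def map_payload_to_fields(self, payload: Dict) -> Dict:
        # Hypothetical mapping: prefix each payload key with the variant name.
        return {f"qsm_{key}": value for key, value in payload.items()}
```

With this shape, adding a new variant means adding one decorated class. No calling code needs to change.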
Database Architecture
Base Model Classes
# models/base.py
from sqlalchemy import Boolean, Column, DateTime, String

class BaseModel:
    # Common fields and methods
    ...

class TimestampMixin:
    created_at = Column(DateTime)
    updated_at = Column(DateTime)

class SoftDeleteMixin:
    is_deleted = Column(Boolean)
    deleted_at = Column(DateTime)

class AuditMixin:
    created_by = Column(String)
    updated_by = Column(String)
Connection Management
- Connection pooling with configurable size
- Automatic retry on connection failure
- Session scoping for transaction management
- Health check utilities
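The source does not say how the automatic retry is implemented. One common approach is exponential backoff, sketched below. The function name, the default values, and the choice to retry only on ConnectionError are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The sleep parameter is injectable so tests can record the delays instead of actually waiting.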
Service Patterns
CRUD Service Base
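from typing import Any, Dict, Generic, Optional, TypeVar

T = TypeVar("T")  # the model type a concrete service manages

class CRUDService(BaseService, Generic[T]):
    def create(self, data: Dict) -> T: ...
    def update(self, id: Any, data: Dict) -> T: ...
    def delete(self, id: Any, soft: bool = True) -> bool: ...
    def get(self, id: Any) -> Optional[T]: ...
    def list(self, filters: Dict, page: int, page_size: int) -> Dict: ...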
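The list method returns a Dict, but the source does not specify its shape. A plausible sketch of a paginated result follows. The key names (items, total, page, page_size, pages) and the 1-based page numbering are assumptions.

```python
from typing import Any, Dict, Sequence


def paginate(items: Sequence[Any], page: int = 1, page_size: int = 20) -> Dict[str, Any]:
    """Slice an in-memory sequence into one page plus paging metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    total = len(items)
    return {
        "items": list(items[start:start + page_size]),
        "total": total,
        "page": page,
        "page_size": page_size,
        "pages": -(-total // page_size),  # ceiling division
    }
```

A real repository would push the offset and limit into the SQL query rather than slicing in memory. The sketch only illustrates the response shape.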
Error Handling
Hierarchical exception system:
ServiceException
├── ValidationError
├── BusinessRuleViolation
├── ResourceNotFoundError
└── ResourceConflictError
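The hierarchy above can be expressed as plain exception classes. In this sketch each class carries an HTTP status code so that a global error handler can translate it into a response. The specific codes and the to_http_status helper are illustrative assumptions, not taken from the source.

```python
class ServiceException(Exception):
    """Root of the hierarchy; the status codes below are illustrative assumptions."""
    status_code = 500


class ValidationError(ServiceException):
    status_code = 422


class BusinessRuleViolation(ServiceException):
    status_code = 400


class ResourceNotFoundError(ServiceException):
    status_code = 404


class ResourceConflictError(ServiceException):
    status_code = 409


def to_http_status(exc: Exception) -> int:
    # Any ServiceException subclass carries its own code; everything else is a 500.
    return exc.status_code if isinstance(exc, ServiceException) else 500
```

Because every class derives from ServiceException, a single handler registered for the base class covers the whole tree.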
Transaction Management
with service.handle_errors("operation"):
    with repository.transaction():
        # Perform multiple operations
        # Automatic rollback on error
        ...
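The two context managers above could be built with contextlib. The sketch below is only an approximation of the behaviour described. FakeSession is an in-memory stand-in for a database session, and the wrapping of unexpected errors into ServiceException is an assumption about what handle_errors does.

```python
from contextlib import contextmanager
from typing import Iterator, List


class ServiceException(Exception):
    pass


class FakeSession:
    """In-memory stand-in for a database session; records what was committed."""

    def __init__(self) -> None:
        self.pending: List[str] = []
        self.committed: List[str] = []

    def add(self, item: str) -> None:
        self.pending.append(item)

    def commit(self) -> None:
        self.committed.extend(self.pending)
        self.pending.clear()

    def rollback(self) -> None:
        self.pending.clear()


@contextmanager
def transaction(session: FakeSession) -> Iterator[FakeSession]:
    try:
        yield session
        session.commit()    # reached only if the block raised nothing
    except Exception:
        session.rollback()  # discard pending work, then re-raise
        raise


@contextmanager
def handle_errors(operation: str) -> Iterator[None]:
    try:
        yield
    except ServiceException:
        raise  # already a domain error: pass it through unchanged
    except Exception as exc:
        raise ServiceException(f"{operation} failed: {exc}") from exc
```

Nesting transaction inside handle_errors means the rollback happens first and the caller then sees a single, uniform exception type.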
API Design
RESTful Endpoints
POST /api/applications # Create application
GET /api/applications # List applications
GET /api/applications/{id} # Get application
PUT /api/applications/{id} # Update application
DELETE /api/applications/{id} # Delete application
POST /api/applications/{id}/submit # Submit application
POST /api/applications/{id}/review # Review application
GET /api/applications/{id}/pdf # Generate PDF
Request/Response Models
Using Pydantic for validation:
from datetime import datetime
from typing import Any, Dict

from pydantic import BaseModel

class ApplicationCreate(BaseModel):
    variant: ApplicationType
    payload: Dict[str, Any]

class ApplicationResponse(BaseModel):
    id: int
    pa_id: str
    status: ApplicationStatus
    created_at: datetime
Middleware Stack
- CORS Middleware: Cross-origin resource sharing
- Rate Limit Middleware: Request throttling
- Logging Middleware: Request/response logging
- Error Handler Middleware: Global error handling
- Authentication Middleware: JWT/API key validation
Security Features
- JWT-based authentication
- API key support
- Rate limiting per IP/key
- SQL injection prevention via ORM
- Input sanitization
- Audit logging
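Rate limiting per IP or API key can be sketched as a fixed-window counter. The source does not name the algorithm in use, so the fixed one-minute window is an assumption, as are the class and method names. The per-minute limit mirrors the RATE_IP_PER_MIN and RATE_KEY_PER_MIN variables.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most limit_per_min requests per key in each one-minute window."""

    def __init__(self, limit_per_min: int, clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit_per_min
        self.clock = clock
        self._counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, key: str) -> bool:
        window = int(self.clock() // 60)  # index of the current minute
        bucket = (key, window)
        if self._counts[bucket] >= self.limit:
            return False
        self._counts[bucket] += 1
        return True
```

This version keeps counters in process memory and never evicts old windows. A multi-instance deployment would need a shared store such as Redis, plus expiry of stale buckets.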
Performance Optimizations
- Database connection pooling
- Lazy loading relationships
- Query optimization with indexes
- Caching support (Redis)
- Async request handling
- PDF generation caching
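PDF generation caching could key the rendered bytes on a hash of the variant and payload, so that identical requests skip rendering. The sketch below uses an in-process dictionary for simplicity. The PDFCache name, the render callback signature, and the key scheme are assumptions, and a Redis-backed store would follow the same pattern.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class PDFCache:
    def __init__(self, render: Callable[[str, Dict[str, Any]], bytes]) -> None:
        self._render = render
        self._store: Dict[str, bytes] = {}

    @staticmethod
    def key(variant: str, payload: Dict[str, Any]) -> str:
        # sort_keys makes the key independent of dictionary insertion order.
        canonical = json.dumps({"variant": variant, "payload": payload}, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, variant: str, payload: Dict[str, Any]) -> bytes:
        cache_key = self.key(variant, payload)
        if cache_key not in self._store:
            # Cache miss: render once and remember the result.
            self._store[cache_key] = self._render(variant, payload)
        return self._store[cache_key]
```

The cache must be invalidated, or the template version added to the key, whenever a PDF template changes. Otherwise stale output would be served.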
Testing Strategy
Unit Tests
- Service logic testing
- Repository method testing
- Model validation testing
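As an example of service logic testing, the repository can be replaced with a mock so the test needs no database. The ApplicationService below is a simplified stand-in for the real class, with string statuses instead of the enum. The test function name is hypothetical.

```python
from unittest.mock import Mock


class ApplicationService:
    """Simplified stand-in for the real service, reduced to one method."""

    def __init__(self, repository) -> None:
        self.repository = repository

    def submit_application(self, id: int):
        app = self.repository.get_or_404(id)
        app.status = "submitted"
        return self.repository.update(app)


def test_submit_application_updates_status() -> None:
    app = Mock(status="draft")
    repository = Mock()
    repository.get_or_404.return_value = app
    repository.update.side_effect = lambda obj: obj  # echo back what was saved

    result = ApplicationService(repository).submit_application(1)

    assert result.status == "submitted"
    repository.get_or_404.assert_called_once_with(1)
    repository.update.assert_called_once_with(app)
```

A test runner such as pytest would collect the function by its test_ prefix. It can also be called directly.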
Integration Tests
- API endpoint testing
- Database transaction testing
- PDF processing testing
End-to-End Tests
- Complete workflow testing
- Multi-service interaction testing
Deployment Considerations
Environment Variables
# Database
MYSQL_HOST=localhost
MYSQL_PORT=3306
MYSQL_DB=stupa
MYSQL_USER=user
MYSQL_PASSWORD=password
# Security
MASTER_KEY=secret_key
JWT_SECRET_KEY=jwt_secret
# Rate Limiting
RATE_IP_PER_MIN=60
RATE_KEY_PER_MIN=30
# PDF Templates
QSM_TEMPLATE=assets/qsm.pdf
VSM_TEMPLATE=assets/vsm.pdf
Docker Support
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
Scaling Considerations
- Stateless services for horizontal scaling
- Support for database read replicas
- Cache layer for frequently accessed data
- Async processing for heavy operations
- Ready for message queue integration
Migration Path
From Old to New Architecture
- Phase 1: Setup new structure alongside old code
- Phase 2: Migrate database models
- Phase 3: Implement service layer
- Phase 4: Create API routes
- Phase 5: Migrate business logic
- Phase 6: Remove old code
Database Migrations
Using Alembic for version control:
alembic init migrations
alembic revision --autogenerate -m "Initial migration"
alembic upgrade head
Monitoring & Observability
- Structured logging with context
- Prometheus metrics integration
- Health check endpoints
- Performance profiling hooks
- Error tracking integration ready
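Structured logging with context can be approximated with the standard logging module and a JSON formatter. The source does not name a logging library, so this is one possible shape. The JsonFormatter class and the convention of passing a context dictionary through the extra argument are assumptions.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, merging in any attached context."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Fields passed via logger.info(..., extra={"context": {...}}) land here.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("applications")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
```

A call such as logger.info("submitted", extra={"context": {"pa_id": "abc"}}) then emits a single JSON line that log aggregators can index by field.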
Future Enhancements
- GraphQL Support: Alternative API interface
- WebSocket Support: Real-time updates
- Event Sourcing: Audit trail and history
- Microservices: Service decomposition
- API Gateway: Advanced routing and auth
- Message Queue: Async task processing
- Search Engine: Elasticsearch integration
- Machine Learning: PDF field prediction
Conclusion
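This refactored architecture provides:
- Maintainability: Clear layer boundaries and single-responsibility modules
- Scalability: Stateless services that can scale horizontally
- Testability: Isolated components that can be tested with mock services
- Extensibility: Provider-based design for new PDF variants
- Performance: Connection pooling, caching, and async request handling
- Security: JWT authentication, rate limiting, and audit logging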
The modular design allows teams to work independently on different components while maintaining system integrity through well-defined interfaces.