# Backend Architecture Documentation

## Overview

The backend has been refactored from a monolithic structure into a modular, service-oriented architecture that emphasizes:

- **Separation of Concerns**: Clear boundaries between layers (API, Service, Repository, Model)
- **Dependency Injection**: Dynamic service resolution and configuration
- **Extensibility**: Plugin-based system for PDF variants and providers
- **Maintainability**: Organized code structure with single responsibility principle
- **Scalability**: Stateless services with proper connection pooling

## Directory Structure

```
backend/
├── src/
│   ├── api/                    # API Layer
│   │   ├── routes/             # FastAPI routers
│   │   ├── middleware/         # Custom middleware
│   │   └── dependencies/       # Dependency injection helpers
│   │
│   ├── services/               # Business Logic Layer
│   │   ├── base.py             # Base service classes
│   │   ├── application.py      # Application business logic
│   │   ├── pdf.py              # PDF processing service
│   │   └── auth.py             # Authentication service
│   │
│   ├── repositories/           # Data Access Layer
│   │   ├── base.py             # Base repository pattern
│   │   ├── application.py      # Application repository
│   │   └── attachment.py       # Attachment repository
│   │
│   ├── models/                 # Database Models
│   │   ├── base.py             # Base model with mixins
│   │   └── application.py      # Application entities
│   │
│   ├── providers/              # Dynamic Providers
│   │   ├── pdf_qsm.py          # QSM PDF variant provider
│   │   └── pdf_vsm.py          # VSM PDF variant provider
│   │
│   ├── config/                 # Configuration Management
│   │   └── settings.py         # Centralized settings with Pydantic
│   │
│   ├── core/                   # Core Infrastructure
│   │   ├── container.py        # Dependency injection container
│   │   └── database.py         # Database management
│   │
│   └── utils/                  # Utility Functions
│       └── helpers.py          # Common utilities
```

## Architecture Layers

### 1. API Layer (`api/`)
**Responsibility**: HTTP request/response handling, validation, routing

- **Routes**: Modular FastAPI routers for different domains
- **Middleware**: Cross-cutting concerns (rate limiting, logging, error handling)
- **Dependencies**: FastAPI dependency injection functions

```python
# Example: api/routes/applications.py
@router.post("/", response_model=ApplicationResponse)
async def create_application(
    data: ApplicationCreate,
    service: ApplicationService = Depends(get_application_service)
):
    # CRUDService.create is synchronous (see "CRUD Service Base"), so it is not awaited
    return service.create(data.dict())
```

### 2. Service Layer (`services/`)
**Responsibility**: Business logic, orchestration, validation rules

- Encapsulates all business rules and workflows
- Coordinates between repositories and external services
- Handles complex validations and transformations
- Stateless and testable

```python
# Example: services/application.py
class ApplicationService(CRUDService[Application]):
    def submit_application(self, id: int) -> Application:
        # Business logic for submission
        app = self.repository.get_or_404(id)
        self._validate_submission(app)
        app.status = ApplicationStatus.SUBMITTED
        return self.repository.update(app)
```

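The submission flow above can be sketched without a database. This is a minimal, standalone illustration: the `DRAFT` status, the empty-payload rule, and the plain `Application` dataclass are assumptions made for the example, not the project's actual validation rules.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict


class ApplicationStatus(Enum):
    DRAFT = "draft"          # hypothetical member, assumed for this sketch
    SUBMITTED = "submitted"


class BusinessRuleViolation(Exception):
    """Raised when an operation would break a business rule."""


@dataclass
class Application:
    id: int
    status: ApplicationStatus = ApplicationStatus.DRAFT
    payload: Dict[str, Any] = field(default_factory=dict)


def validate_submission(app: Application) -> None:
    """Reject applications that are not drafts or that have an empty payload."""
    if app.status is not ApplicationStatus.DRAFT:
        raise BusinessRuleViolation(f"cannot submit from status {app.status.value}")
    if not app.payload:
        raise BusinessRuleViolation("payload must not be empty")


def submit(app: Application) -> Application:
    # Validate first, then perform the state change.
    validate_submission(app)
    app.status = ApplicationStatus.SUBMITTED
    return app


app = submit(Application(id=1, payload={"title": "Workshop"}))
```

Keeping the rule check in its own function means it can be unit tested without touching the repository.
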
### 3. Repository Layer (`repositories/`)
**Responsibility**: Data access abstraction, CRUD operations

- Implements repository pattern for database access
- Provides clean abstraction over SQLAlchemy
- Handles query building and optimization
- Transaction management

```python
# Example: repositories/application.py
class ApplicationRepository(BaseRepository[Application]):
    def find_by_status(self, status: ApplicationStatus) -> List[Application]:
        return self.query().filter(
            Application.status == status
        ).all()
```

### 4. Model Layer (`models/`)
**Responsibility**: Data structure definition, ORM mapping

- SQLAlchemy models with proper relationships
- Base classes with common functionality (timestamps, soft delete)
- Model mixins for reusable behavior
- Business entity representation

```python
# Example: models/application.py
class Application(ExtendedBaseModel):
    __tablename__ = "applications"

    pa_id = Column(String(64), unique=True, index=True)
    status = Column(SQLEnum(ApplicationStatus))
    payload = Column(JSON)
```

## Key Components

### Dependency Injection Container

The system uses a custom dependency injection container for managing service lifecycles:

```python
# core/container.py
class Container:
    def register_service(self, name: str, service_class: Type[BaseService]):
        # Register service with automatic dependency resolution
        ...

    def get_service(self, name: str) -> BaseService:
        # Retrieve service instance with dependencies injected
        ...
```

**Benefits:**
- Loose coupling between components
- Easy testing with mock services
- Dynamic service configuration
- Singleton pattern support

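To make the registration and lookup behaviour concrete, here is a minimal sketch of such a container. It is an illustration under assumptions, not the project's `core/container.py`: it registers factory callables rather than service classes, and `FakeRepository` and the service names are hypothetical.

```python
from typing import Any, Callable, Dict


class Container:
    """Tiny DI container: factories are registered by name, instances are cached."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[["Container"], Any]] = {}
        self._instances: Dict[str, Any] = {}

    def register_service(self, name: str, factory: Callable[["Container"], Any]) -> None:
        # The factory receives the container so it can resolve its own dependencies.
        self._factories[name] = factory
        self._instances.pop(name, None)  # re-registering drops any cached instance

    def get_service(self, name: str) -> Any:
        # Build the service lazily on first access, then reuse it (singleton).
        if name not in self._instances:
            if name not in self._factories:
                raise KeyError(f"service not registered: {name}")
            self._instances[name] = self._factories[name](self)
        return self._instances[name]


class FakeRepository:
    """Hypothetical stand-in for a real repository."""

    def get(self, id: int) -> dict:
        return {"id": id, "status": "draft"}


class ApplicationService:
    def __init__(self, repository: FakeRepository) -> None:
        self.repository = repository


container = Container()
container.register_service("application_repository", lambda c: FakeRepository())
container.register_service(
    "application_service",
    lambda c: ApplicationService(c.get_service("application_repository")),
)
service = container.get_service("application_service")
```

In tests, registering a mock factory under the same name before the first `get_service` call swaps the dependency without touching the service code.
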
### Configuration Management

Centralized configuration using Pydantic Settings:

```python
# config/settings.py
class Settings(BaseSettings):
    database: DatabaseSettings
    security: SecuritySettings
    rate_limit: RateLimitSettings
    storage: StorageSettings
    pdf: PDFSettings
    app: ApplicationSettings
```

**Features:**
- Environment variable support
- Type validation
- Default values
- Configuration file support (JSON/YAML)
- Dynamic override capability

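The idea of grouped, typed settings with environment overrides and defaults can be shown with the standard library alone. The sketch below deliberately uses dataclasses instead of Pydantic so that it runs without extra packages; the field names are assumptions, while the variable names are taken from the Environment Variables section of this document.

```python
import os
from dataclasses import dataclass, field
from typing import Mapping


@dataclass(frozen=True)
class DatabaseSettings:
    host: str = "localhost"
    port: int = 3306
    name: str = "stupa"


@dataclass(frozen=True)
class RateLimitSettings:
    ip_per_min: int = 60
    key_per_min: int = 30


@dataclass(frozen=True)
class Settings:
    database: DatabaseSettings = field(default_factory=DatabaseSettings)
    rate_limit: RateLimitSettings = field(default_factory=RateLimitSettings)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    return Settings(
        database=DatabaseSettings(
            host=env.get("MYSQL_HOST", "localhost"),
            port=int(env.get("MYSQL_PORT", "3306")),  # int() raises ValueError on bad input
            name=env.get("MYSQL_DB", "stupa"),
        ),
        rate_limit=RateLimitSettings(
            ip_per_min=int(env.get("RATE_IP_PER_MIN", "60")),
            key_per_min=int(env.get("RATE_KEY_PER_MIN", "30")),
        ),
    )


# Passing a plain dict instead of os.environ keeps the example deterministic.
settings = load_settings({"MYSQL_HOST": "db.internal", "MYSQL_PORT": "3307"})
```
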
### Provider Pattern for PDF Variants

Extensible system for handling different PDF types:

```python
# providers/pdf_qsm.py
class QSMProvider(PDFVariantProvider):
    def parse_pdf_fields(self, fields: Dict) -> Dict:
        # QSM-specific parsing logic
        ...

    def map_payload_to_fields(self, payload: Dict) -> Dict:
        # QSM-specific field mapping
        ...
```

**Advantages:**
- Easy to add new PDF variants
- Variant-specific validation rules
- Dynamic provider registration
- Clean separation of variant logic

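Dynamic provider registration could look roughly like the following. This is a sketch under assumptions: the `register_provider` decorator, the `get_provider` lookup, and the `qsm_` field prefix are hypothetical and only illustrate the registry idea.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class PDFVariantProvider(ABC):
    @abstractmethod
    def map_payload_to_fields(self, payload: Dict) -> Dict:
        """Translate an application payload into PDF form-field values."""


_PROVIDERS: Dict[str, Type[PDFVariantProvider]] = {}


def register_provider(variant: str):
    """Class decorator that registers a provider under a variant name."""
    def decorator(cls: Type[PDFVariantProvider]) -> Type[PDFVariantProvider]:
        _PROVIDERS[variant.upper()] = cls
        return cls
    return decorator


def get_provider(variant: str) -> PDFVariantProvider:
    """Look up a provider by variant name (case-insensitive) and instantiate it."""
    try:
        return _PROVIDERS[variant.upper()]()
    except KeyError:
        raise ValueError(f"unknown PDF variant: {variant}") from None


@register_provider("QSM")
class QSMProvider(PDFVariantProvider):
    def map_payload_to_fields(self, payload: Dict) -> Dict:
        # Hypothetical mapping: prefix every key the way a QSM form might name its fields.
        return {f"qsm_{key}": str(value) for key, value in payload.items()}


fields = get_provider("qsm").map_payload_to_fields({"title": "Workshop", "amount": 250})
```

Adding a new variant then only requires a new decorated class; callers keep using `get_provider`.
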
## Database Architecture

### Base Model Classes

```python
# models/base.py
class BaseModel:
    # Common fields and methods
    ...

class TimestampMixin:
    created_at = Column(DateTime)
    updated_at = Column(DateTime)

class SoftDeleteMixin:
    is_deleted = Column(Boolean)
    deleted_at = Column(DateTime)

class AuditMixin:
    created_by = Column(String)
    updated_by = Column(String)
```

### Connection Management

- Connection pooling with configurable size
- Automatic retry on connection failure
- Session scoping for transaction management
- Health check utilities

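Automatic retry on connection failure is commonly implemented with exponential backoff. The helper below is a generic sketch, not the project's `core/database.py`: `connect_with_retry`, its parameters, and the fake connector are assumptions for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, doubling the delay after each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


# Demo with a fake connector that fails twice before succeeding.
calls = {"count": 0}
delays = []


def flaky_connect() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("database not reachable yet")
    return "connection"


# delays.append replaces time.sleep so the demo records delays instead of waiting.
result = connect_with_retry(flaky_connect, sleep=delays.append)
```
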
## Service Patterns

### CRUD Service Base

```python
from typing import Any, Dict, Generic, Optional, TypeVar

T = TypeVar("T")  # entity type managed by the service

class CRUDService(BaseService, Generic[T]):
    def create(self, data: Dict) -> T: ...
    def update(self, id: Any, data: Dict) -> T: ...
    def delete(self, id: Any, soft: bool = True) -> bool: ...
    def get(self, id: Any) -> Optional[T]: ...
    def list(self, filters: Dict, page: int, page_size: int) -> Dict: ...
```

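The `list` method returns a dictionary rather than a bare list, which suggests a paginated envelope. One possible shape is sketched below over in-memory rows; the key names (`items`, `total`, `pages`) and the exact-match filter semantics are assumptions, not the project's response format.

```python
from typing import Any, Dict, List


def paginate(
    items: List[Dict[str, Any]],
    filters: Dict[str, Any],
    page: int,
    page_size: int,
) -> Dict[str, Any]:
    """Filter by exact field match, then return one page plus paging metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    matched = [item for item in items if all(item.get(k) == v for k, v in filters.items())]
    start = (page - 1) * page_size
    total_pages = (len(matched) + page_size - 1) // page_size  # ceiling division
    return {
        "items": matched[start:start + page_size],
        "total": len(matched),
        "page": page,
        "page_size": page_size,
        "pages": total_pages,
    }


# Seven rows; the odd ids are "submitted", the even ids are "draft".
rows = [{"id": i, "status": "submitted" if i % 2 else "draft"} for i in range(1, 8)]
result = paginate(rows, {"status": "submitted"}, page=2, page_size=3)
```
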
### Error Handling

Hierarchical exception system:

```
ServiceException
├── ValidationError
├── BusinessRuleViolation
├── ResourceNotFoundError
└── ResourceConflictError
```

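Expressed as Python classes, the hierarchy might look like this. The HTTP status codes and the `to_http_error` helper are assumptions added to show why a common base class is useful; the document does not specify how exceptions map to responses.

```python
class ServiceException(Exception):
    """Base class for every error raised by the service layer."""
    status_code = 500


class ValidationError(ServiceException):
    status_code = 422


class BusinessRuleViolation(ServiceException):
    status_code = 400


class ResourceNotFoundError(ServiceException):
    status_code = 404


class ResourceConflictError(ServiceException):
    status_code = 409


def to_http_error(exc: Exception) -> dict:
    """Translate any exception into a response-shaped dict."""
    if isinstance(exc, ServiceException):
        return {"status": exc.status_code, "error": type(exc).__name__, "detail": str(exc)}
    # Unknown errors are reported generically so internals do not leak.
    return {"status": 500, "error": "InternalServerError", "detail": "unexpected error"}


not_found = to_http_error(ResourceNotFoundError("application 42 not found"))
unexpected = to_http_error(RuntimeError("disk full"))
```

A single handler can catch `ServiceException` and still distinguish the subclasses through their attributes.
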
### Transaction Management

```python
with service.handle_errors("operation"):
    with repository.transaction():
        # Perform multiple operations
        # Automatic rollback on error
        ...
```

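The commit-or-rollback behaviour of a `transaction()` context manager can be demonstrated with the standard library's `sqlite3` module. This is a self-contained sketch of the pattern, not the repository's implementation, which works through SQLAlchemy sessions; the table layout is made up for the demo.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn: sqlite3.Connection):
    """Commit if the block succeeds, roll back if it raises."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE applications (id INTEGER PRIMARY KEY, status TEXT)")

# First transaction succeeds and is committed.
with transaction(conn):
    conn.execute("INSERT INTO applications (status) VALUES ('draft')")

# Second transaction fails after the insert, so the insert is rolled back.
try:
    with transaction(conn):
        conn.execute("INSERT INTO applications (status) VALUES ('submitted')")
        raise RuntimeError("simulated failure after the insert")
except RuntimeError:
    pass

count = conn.execute("SELECT COUNT(*) FROM applications").fetchone()[0]
```

Only the row from the first block remains, because the exception in the second block triggers the rollback before it propagates.
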
## API Design

### RESTful Endpoints

```
POST   /api/applications                # Create application
GET    /api/applications                # List applications
GET    /api/applications/{id}           # Get application
PUT    /api/applications/{id}           # Update application
DELETE /api/applications/{id}           # Delete application

POST   /api/applications/{id}/submit    # Submit application
POST   /api/applications/{id}/review    # Review application
GET    /api/applications/{id}/pdf       # Generate PDF
```

### Request/Response Models

Using Pydantic for validation:

```python
class ApplicationCreate(BaseModel):
    variant: ApplicationType
    payload: Dict[str, Any]

class ApplicationResponse(BaseModel):
    id: int
    pa_id: str
    status: ApplicationStatus
    created_at: datetime
```

## Middleware Stack

1. **CORS Middleware**: Cross-origin resource sharing
2. **Rate Limit Middleware**: Request throttling
3. **Logging Middleware**: Request/response logging
4. **Error Handler Middleware**: Global error handling
5. **Authentication Middleware**: JWT/API key validation

## Security Features

- JWT-based authentication
- API key support
- Rate limiting per IP/key
- SQL injection prevention via ORM
- Input sanitization
- Audit logging

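Per-IP or per-key rate limiting can be approximated with a fixed-window counter. The class below is an in-memory sketch under assumptions: the name `FixedWindowRateLimiter` and the injectable clock are hypothetical, and the algorithm the rate limit middleware actually uses is not stated in this document.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per key in each time window."""

    def __init__(
        self,
        limit: int,
        window_seconds: int = 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # key -> (window index, number of requests seen in that window)
        self._counters: DefaultDict[str, Tuple[int, int]] = defaultdict(lambda: (0, 0))

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._counters[key]
        if seen_window != window:
            count = 0  # a new window started, reset the counter
        if count >= self.limit:
            self._counters[key] = (window, count)
            return False
        self._counters[key] = (window, count + 1)
        return True


# A fake clock makes the demo deterministic.
now = [0.0]
limiter = FixedWindowRateLimiter(limit=2, clock=lambda: now[0])
first_window = [limiter.allow("203.0.113.5") for _ in range(3)]
now[0] = 61.0  # move into the next window
next_window = limiter.allow("203.0.113.5")
```

This sketch never evicts idle keys and keeps its state in one process; a deployment with several workers would need a shared store for the counters.
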
## Performance Optimizations

- Database connection pooling
- Lazy loading relationships
- Query optimization with indexes
- Caching support (Redis)
- Async request handling
- PDF generation caching

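PDF generation caching depends on a stable cache key. One way to derive it is to hash a canonical JSON form of the variant and payload, as in the sketch below. The `PDFCache` class, the in-memory dictionary standing in for Redis, and `fake_render` are all assumptions for illustration.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class PDFCache:
    """Cache rendered PDFs keyed by a hash of the variant and payload."""

    def __init__(self, render: Callable[[str, Dict[str, Any]], bytes]) -> None:
        self._render = render
        self._store: Dict[str, bytes] = {}  # stand-in for an external cache

    @staticmethod
    def cache_key(variant: str, payload: Dict[str, Any]) -> str:
        # sort_keys makes the key independent of dict insertion order.
        canonical = json.dumps({"variant": variant, "payload": payload}, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, variant: str, payload: Dict[str, Any]) -> bytes:
        key = self.cache_key(variant, payload)
        if key not in self._store:
            self._store[key] = self._render(variant, payload)
        return self._store[key]


render_calls = []


def fake_render(variant: str, payload: Dict[str, Any]) -> bytes:
    # Hypothetical renderer that records each call instead of producing a real PDF.
    render_calls.append(variant)
    return f"{variant}:{sorted(payload)}".encode("utf-8")


cache = PDFCache(fake_render)
first = cache.get("QSM", {"title": "Workshop", "amount": 250})
second = cache.get("QSM", {"amount": 250, "title": "Workshop"})  # same data, different key order
```
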
## Testing Strategy

### Unit Tests
- Service logic testing
- Repository method testing
- Model validation testing

### Integration Tests
- API endpoint testing
- Database transaction testing
- PDF processing testing

### End-to-End Tests
- Complete workflow testing
- Multi-service interaction testing

## Deployment Considerations

### Environment Variables

```env
# Database
MYSQL_HOST=localhost
MYSQL_PORT=3306
MYSQL_DB=stupa
MYSQL_USER=user
MYSQL_PASSWORD=password

# Security
MASTER_KEY=secret_key
JWT_SECRET_KEY=jwt_secret

# Rate Limiting
RATE_IP_PER_MIN=60
RATE_KEY_PER_MIN=30

# PDF Templates
QSM_TEMPLATE=assets/qsm.pdf
VSM_TEMPLATE=assets/vsm.pdf
```

### Docker Support

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

### Scaling Considerations

- Stateless services for horizontal scaling
- Database read replicas support
- Cache layer for frequently accessed data
- Async processing for heavy operations
- Message queue integration ready

## Migration Path

### From Old to New Architecture

1. **Phase 1**: Set up the new structure alongside the old code
2. **Phase 2**: Migrate database models
3. **Phase 3**: Implement service layer
4. **Phase 4**: Create API routes
5. **Phase 5**: Migrate business logic
6. **Phase 6**: Remove old code

### Database Migrations

Using Alembic for version control:

```bash
alembic init migrations
alembic revision --autogenerate -m "Initial migration"
alembic upgrade head
```

## Monitoring & Observability

- Structured logging with context
- Prometheus metrics integration
- Health check endpoints
- Performance profiling hooks
- Error tracking integration ready

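Structured logging with context can be built on the standard `logging` module by emitting one JSON object per record. The formatter below is a minimal sketch; the context field names `request_id` and `application_id` and the logger name are assumptions, not the project's logging schema.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including extra context fields."""

    CONTEXT_FIELDS = ("request_id", "application_id")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for name in self.CONTEXT_FIELDS:
            if hasattr(record, name):
                entry[name] = getattr(record, name)
        return json.dumps(entry)


# Log into a string buffer so the output can be inspected in the demo.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("backend.applications")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

logger.info("application submitted", extra={"request_id": "req-1", "application_id": 42})
logged = json.loads(stream.getvalue())
```
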
## Future Enhancements

1. **GraphQL Support**: Alternative API interface
2. **WebSocket Support**: Real-time updates
3. **Event Sourcing**: Audit trail and history
4. **Microservices**: Service decomposition
5. **API Gateway**: Advanced routing and auth
6. **Message Queue**: Async task processing
7. **Search Engine**: Elasticsearch integration
8. **Machine Learning**: PDF field prediction

## Conclusion

This refactored architecture provides:
- **Maintainability**: Clear structure and separation
- **Scalability**: Ready for growth
- **Testability**: Isolated components
- **Extensibility**: Plugin-based design
- **Performance**: Optimized patterns
- **Security**: Built-in best practices

The modular design allows teams to work independently on different components while maintaining system integrity through well-defined interfaces.