# PDF Integration Guide

This document describes how the STUPA PDF API integrates with LaTeX templates to generate and process PDF forms.

## Overview

The STUPA PDF API system works with two types of funding application forms:
- **QSM** (Qualitätssicherungsmittel) - Quality assurance funding
- **VSM** (Verfasste Studierendenschaft Mittel) - Student body funding

## LaTeX Templates

### Repository Structure

The LaTeX templates are maintained in a separate Git repository and integrated as a submodule:
- Repository: `git@git.beimgraben.net:frederik/PA_Vorlage.git`
- Location: `/backend/latex-templates/`

### Branch Organization

Different form types are maintained in separate branches:
- `v1.2/QSM` - QSM application forms
- `v1.2/VSM` - VSM application forms
- `v1.2/VGL` - VGL forms (deprecated, not used)

### Working with Templates

The project uses Git worktrees to access multiple template versions simultaneously:

```bash
# Templates are checked out to:
/backend/latex-qsm/    # QSM templates (branch: v1.2/QSM)
/backend/latex-vsm/    # VSM templates (branch: v1.2/VSM)
```
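
As a rough illustration of the layout above, the following Python sketch builds the `git worktree add` commands that would check out each branch next to the submodule. The `WORKTREES` table and the `worktree_commands` helper are hypothetical names; the project's actual setup is performed by `scripts/build-pdfs.sh`, whose exact commands may differ.

```python
from pathlib import PurePosixPath

# Branch and checkout directory per form type, taken from the listing above.
WORKTREES = {
    "QSM": ("v1.2/QSM", "backend/latex-qsm"),
    "VSM": ("v1.2/VSM", "backend/latex-vsm"),
}


def worktree_commands(submodule="backend/latex-templates"):
    """Return one `git worktree add` command (as an argv list) per form type."""
    commands = []
    for branch, target in WORKTREES.values():
        # `git -C <dir>` runs the command inside the template submodule,
        # so the target path is expressed relative to that directory.
        relative_target = PurePosixPath("..") / PurePosixPath(target).name
        commands.append(
            ["git", "-C", submodule, "worktree", "add", str(relative_target), branch]
        )
    return commands


if __name__ == "__main__":
    for argv in worktree_commands():
        print(" ".join(argv))
```

Printing the commands instead of running them keeps the sketch free of side effects; passing each list to `subprocess.run` would perform the checkout.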

## PDF Generation Workflow

### 1. Template Compilation

LaTeX templates are compiled to PDF using XeLaTeX:

```bash
# Build PDFs from LaTeX sources
./scripts/build-pdfs.sh
```

This script:
- Sets up Git worktrees for QSM and VSM branches
- Compiles LaTeX to PDF using `latexmk -xelatex`
- Copies generated PDFs to `/backend/assets/`
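
The compile and copy steps listed above could look roughly like the following Python sketch. It is not the content of `build-pdfs.sh`; `BUILD_TARGETS`, `latexmk_command`, and `build_all` are hypothetical names, and the sketch assumes each worktree produces `Main.pdf` from `Main.tex`.

```python
import shutil
import subprocess
from pathlib import Path

# Worktree directory -> name of the PDF copied into the assets folder.
# Assumption: every worktree compiles Main.tex into Main.pdf.
BUILD_TARGETS = {
    "backend/latex-qsm": "qsm.pdf",
    "backend/latex-vsm": "vsm.pdf",
}


def latexmk_command(main_file="Main.tex"):
    """Command line that compiles one template with XeLaTeX via latexmk."""
    return ["latexmk", "-xelatex", "-interaction=nonstopmode", main_file]


def build_all(repo_root=".", assets_dir="backend/assets", run=subprocess.run):
    """Compile every template and copy the result into the assets folder."""
    root = Path(repo_root)
    built = []
    for worktree, pdf_name in BUILD_TARGETS.items():
        source_dir = root / worktree
        # check=True makes subprocess.run raise if the LaTeX build fails.
        run(latexmk_command(), cwd=source_dir, check=True)
        destination = root / assets_dir / pdf_name
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source_dir / "Main.pdf", destination)
        built.append(destination)
    return built
```

The `run` parameter exists only so the compile step can be replaced in tests; calling `build_all()` from the repository root uses `subprocess.run` directly.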

### 2. Form Field Mapping

The system uses pre-compiled PDFs with form fields:
- `/backend/assets/qsm.pdf` - QSM form template
- `/backend/assets/vsm.pdf` - VSM form template

These PDFs contain named form fields that correspond to the application data structure.

### 3. PDF Processing Pipeline

```
User uploads PDF → Parse form data → Store in database → Generate filled PDF
```

## LaTeX Template Structure

### Main Components

```
Main.tex                 # Main document file
Content/
├── 01_content.tex       # Form content and fields
└── 99_glossary.tex      # Glossary definitions
HSRTReport/              # Custom document class
TeX/
├── Preamble.tex         # Package imports and settings
└── Settings/            # Configuration files
```

### Form Fields

LaTeX form fields are defined using custom commands:

```latex
\CustomTextFieldDefault{pa-project-name}{}{Projektname}{width=\linewidth}
\CustomChoiceMenuDefault{pa-course}{}{width=\linewidth,default=-}{-,INF,ESB,LS,TEC,TEX,NXT}
\CheckBox[name=pa-qsm-studierende,width=1em,height=1em]{}
```

Field naming convention: `pa-` prefix followed by the field identifier.
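
A small check can catch field names that break this convention before a template is rebuilt. The sketch below is an assumption about what counts as a valid identifier (lowercase words or digits joined by hyphens, as in the examples above); the guide itself only specifies the `pa-` prefix, and `is_valid_field_name` is a hypothetical helper.

```python
import re

# Assumed rule: "pa-" followed by lowercase words or numbers joined by hyphens.
FIELD_NAME_PATTERN = re.compile(r"pa-[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_field_name(name):
    """Return True if `name` follows the assumed `pa-` naming convention."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["pa-project-name", "pa-qsm-studierende", "project-name", "pa-"]:
        print(candidate, is_valid_field_name(candidate))
```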

## Docker Integration

### Build Requirements

The Docker image includes all necessary LaTeX packages:

```dockerfile
RUN apt-get install -y \
    texlive-full \
    texlive-xetex \
    texlive-lang-german \
    texlive-fonts-extra \
    latexmk \
    git
```

### Environment Variables

```env
# PDF template paths
QSM_TEMPLATE=/app/assets/qsm.pdf
VSM_TEMPLATE=/app/assets/vsm.pdf

# LaTeX source paths
LATEX_QSM_PATH=/app/latex-qsm
LATEX_VSM_PATH=/app/latex-vsm
```
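
On the application side, these variables might be read as in the following sketch. `PdfSettings` and `load_pdf_settings` are hypothetical names, and falling back to the paths listed above when a variable is unset is an assumption rather than documented behaviour.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PdfSettings:
    qsm_template: Path
    vsm_template: Path
    latex_qsm_path: Path
    latex_vsm_path: Path


def load_pdf_settings(env=None):
    """Build settings from the environment, falling back to the listed paths."""
    env = os.environ if env is None else env
    return PdfSettings(
        qsm_template=Path(env.get("QSM_TEMPLATE", "/app/assets/qsm.pdf")),
        vsm_template=Path(env.get("VSM_TEMPLATE", "/app/assets/vsm.pdf")),
        latex_qsm_path=Path(env.get("LATEX_QSM_PATH", "/app/latex-qsm")),
        latex_vsm_path=Path(env.get("LATEX_VSM_PATH", "/app/latex-vsm")),
    )


if __name__ == "__main__":
    print(load_pdf_settings())
```

Accepting an optional mapping instead of always reading `os.environ` makes the loader easy to exercise in tests without touching the real environment.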

## Development Workflow

### 1. Modifying Templates

To modify PDF templates:

1. Navigate to the appropriate worktree:
   ```bash
   cd backend/latex-qsm  # or latex-vsm
   ```

2. Edit the LaTeX files

3. Build the PDF (run from the directory that contains `scripts/`, not from inside the worktree):
   ```bash
   ./scripts/build-pdfs.sh
   ```

4. Test the new PDF with the application

### 2. Adding New Form Fields

1. Add the field definition in LaTeX:
   ```latex
   \CustomTextFieldDefault{pa-new-field}{}{Field Label}{width=\linewidth}
   ```

2. Update the field mapping in `pdf_field_mapping.py`

3. Add corresponding database fields if needed

4. Rebuild the PDF template

### 3. Testing PDF Generation

```bash
# Test PDF generation in Docker
docker compose exec api python -c "
from pdf_filler import fill_pdf
# Test code here
"
```

## Field Mapping Reference

### Common Fields (Both QSM and VSM)

| LaTeX Field | Database Field | Description |
|-------------|----------------|-------------|
| `pa-applicant-type` | `applicantType` | Person or Institution |
| `pa-institution` | `institution` | Institution name |
| `pa-first-name` | `firstName` | Applicant first name |
| `pa-last-name` | `lastName` | Applicant last name |
| `pa-email` | `email` | Contact email |
| `pa-phone` | `phone` | Phone number |
| `pa-project-name` | `name` | Project name |
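
Expressed in code, the table above amounts to a lookup from LaTeX field names to database field names. The sketch below is illustrative only: `COMMON_FIELD_MAP` and `to_database_record` are hypothetical names, and the real structure of `pdf_field_mapping.py` may differ.

```python
# LaTeX form field name -> database field name, copied from the table above.
COMMON_FIELD_MAP = {
    "pa-applicant-type": "applicantType",
    "pa-institution": "institution",
    "pa-first-name": "firstName",
    "pa-last-name": "lastName",
    "pa-email": "email",
    "pa-phone": "phone",
    "pa-project-name": "name",
}


def to_database_record(form_values):
    """Translate parsed PDF form values into database field names.

    Field names without a mapping are returned separately so they can be
    reported instead of being dropped silently.
    """
    record, unmapped = {}, []
    for field_name, value in form_values.items():
        if field_name in COMMON_FIELD_MAP:
            record[COMMON_FIELD_MAP[field_name]] = value
        else:
            unmapped.append(field_name)
    return record, unmapped


if __name__ == "__main__":
    print(to_database_record({"pa-first-name": "Ada", "pa-unknown": "?"}))
```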

### QSM-Specific Fields

| LaTeX Field | Database Field | Description |
|-------------|----------------|-------------|
| `pa-qsm-*` | Various | QSM-specific checkboxes |
| `pa-cost-*` | `costs[].name/amountEur` | Cost positions |
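
The `pa-cost-*` row maps a family of flat form fields onto a list of cost positions. The sketch below shows one way such a grouping could work. The concrete field names `pa-cost-<n>-name` and `pa-cost-<n>-amount-euro` are assumptions made for illustration (the guide only gives the `pa-cost-*` pattern), as is the handling of a decimal comma.

```python
import re

# Hypothetical field layout: pa-cost-<n>-name and pa-cost-<n>-amount-euro.
COST_FIELD = re.compile(r"pa-cost-(\d+)-(name|amount-euro)")


def collect_costs(form_values):
    """Group flat cost fields into a list of {"name", "amountEur"} entries."""
    rows = {}
    for field_name, value in form_values.items():
        match = COST_FIELD.fullmatch(field_name)
        if match is None:
            continue  # not a cost field
        index, part = int(match.group(1)), match.group(2)
        rows.setdefault(index, {})[part] = (value or "").strip()

    costs = []
    for index in sorted(rows):
        name = rows[index].get("name", "")
        if not name:
            continue  # skip rows the applicant left empty
        raw_amount = rows[index].get("amount-euro", "") or "0"
        # Accept a decimal comma ("12,50"); thousands separators are not handled.
        costs.append({"name": name, "amountEur": float(raw_amount.replace(",", "."))})
    return costs


if __name__ == "__main__":
    print(collect_costs({"pa-cost-1-name": "Druck", "pa-cost-1-amount-euro": "100"}))
```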

### VSM-Specific Fields

| LaTeX Field | Database Field | Description |
|-------------|----------------|-------------|
| `pa-vsm-*` | Various | VSM-specific fields |
| `pa-financing-*` | `financing.*` | Financing options |

## Troubleshooting

### Common Issues

1. **PDF build fails with XeLaTeX error**
   - Ensure all LaTeX dependencies are installed
   - Check for syntax errors in .tex files
   - Verify fonts are available

2. **Form fields not filling**
   - Check field names match between LaTeX and mapping
   - Verify PDF has form fields (use PDF reader)
   - Check data types match expected format

3. **Git worktree errors**
   - Remove existing worktrees: `git worktree prune`
   - Re-run setup script
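
For the "form fields not filling" case, comparing the field names found in the PDF with the names used in the mapping usually narrows the problem down quickly. A minimal sketch, with `compare_field_names` as a hypothetical helper that takes both name lists as plain iterables:

```python
def compare_field_names(pdf_field_names, mapped_field_names):
    """Report names that exist on only one side of the PDF/mapping pair."""
    pdf_names, mapped_names = set(pdf_field_names), set(mapped_field_names)
    return {
        # Mapped but absent from the PDF: these values can never be filled.
        "missing_in_pdf": sorted(mapped_names - pdf_names),
        # Present in the PDF but unmapped: these fields stay empty.
        "missing_in_mapping": sorted(pdf_names - mapped_names),
    }


if __name__ == "__main__":
    report = compare_field_names(
        ["pa-email", "pa-phone", "pa-new-field"],
        ["pa-email", "pa-phone", "pa-first-name"],
    )
    print(report)
```

An empty list on both sides means the names agree, so the cause is more likely a data type or value format mismatch.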

### Debugging Commands

```bash
# List form fields in PDF
docker compose exec api python -c "
import PyPDF2
with open('/app/assets/qsm.pdf', 'rb') as f:
    pdf = PyPDF2.PdfReader(f)
    # get_fields() also covers checkboxes and choice menus;
    # get_form_text_fields() would list text fields only
    fields = pdf.get_fields() or {}
    for name, field in fields.items():
        value = field.get('/V')
        print(f'{name}: {value}')
"

# Check LaTeX compilation log
cd backend/latex-qsm
cat Main.log
```

## Best Practices

1. **Version Control**
   - Keep LaTeX templates in sync with main repo
   - Tag releases when updating PDF templates
   - Document field changes in commit messages

2. **Testing**
   - Test both empty and filled PDFs
   - Verify all form fields are accessible
   - Check PDF compatibility across readers

3. **Performance**
   - Pre-compile PDFs rather than generating on demand
   - Cache compiled PDFs in Docker image
   - Minimize LaTeX package dependencies

## Future Enhancements

1. **Dynamic PDF Generation**
   - Generate PDFs on-demand from LaTeX
   - Support custom form layouts
   - Template versioning system

2. **Field Validation**
   - Implement LaTeX-side validation
   - Sync validation rules with frontend
   - Generate field documentation from LaTeX

3. **Multi-language Support**
   - Internationalize LaTeX templates
   - Support multiple PDF languages
   - Dynamic language selection