# PDF Integration Guide

This document describes how the STUPA PDF API integrates with LaTeX templates to generate and process PDF forms.

## Overview

The STUPA PDF API works with two types of funding application forms:

- **QSM** (Qualitätssicherungsmittel) - Quality assurance funding
- **VSM** (Verfasste Studierendenschaft Mittel) - Student body funding

## LaTeX Templates

### Repository Structure

The LaTeX templates are maintained in a separate Git repository and integrated as a submodule:

- Repository: `git@git.beimgraben.net:frederik/PA_Vorlage.git`
- Location: `/backend/latex-templates/`

### Branch Organization

Each form type is maintained in its own branch:

- `v1.2/QSM` - QSM application forms
- `v1.2/VSM` - VSM application forms
- `v1.2/VGL` - VGL forms (deprecated, not used)

### Working with Templates

The project uses Git worktrees to access multiple template versions simultaneously:

```bash
# Templates are checked out to:
/backend/latex-qsm/  # QSM templates (branch: v1.2/QSM)
/backend/latex-vsm/  # VSM templates (branch: v1.2/VSM)
```

## PDF Generation Workflow

### 1. Template Compilation

LaTeX templates are compiled to PDF using XeLaTeX:

```bash
# Build PDFs from LaTeX sources
./scripts/build-pdfs.sh
```

This script:

- Sets up Git worktrees for the QSM and VSM branches
- Compiles LaTeX to PDF using `latexmk -xelatex`
- Copies the generated PDFs to `/backend/assets/`

### 2. Form Field Mapping

The system uses pre-compiled PDFs with form fields:

- `/backend/assets/qsm.pdf` - QSM form template
- `/backend/assets/vsm.pdf` - VSM form template

These PDFs contain named form fields that correspond to the application data structure.

### 3. PDF Processing Pipeline

```
User uploads PDF → Parse form data → Store in database → Generate filled PDF
```

## LaTeX Template Structure

### Main Components

```
Main.tex                # Main document file
Content/
├── 01_content.tex      # Form content and fields
└── 99_glossary.tex     # Glossary definitions
HSRTReport/             # Custom document class
TeX/
├── Preamble.tex        # Package imports and settings
└── Settings/           # Configuration files
```

### Form Fields

LaTeX form fields are defined using custom commands:

```latex
\CustomTextFieldDefault{pa-project-name}{}{Projektname}{width=\linewidth}
\CustomChoiceMenuDefault{pa-course}{}{width=\linewidth,default=-}{-,INF,ESB,LS,TEC,TEX,NXT}
\CheckBox[name=pa-qsm-studierende,width=1em,height=1em]{}
```

Field naming convention: `pa-` prefix followed by the field identifier.

## Docker Integration

### Build Requirements

The Docker image includes all necessary LaTeX packages:

```dockerfile
RUN apt-get install -y \
    texlive-full \
    texlive-xetex \
    texlive-lang-german \
    texlive-fonts-extra \
    latexmk \
    git
```

### Environment Variables

```env
# PDF template paths
QSM_TEMPLATE=/app/assets/qsm.pdf
VSM_TEMPLATE=/app/assets/vsm.pdf

# LaTeX source paths
LATEX_QSM_PATH=/app/latex-qsm
LATEX_VSM_PATH=/app/latex-vsm
```

## Development Workflow

### 1. Modifying Templates

To modify PDF templates:

1. Navigate to the appropriate worktree:
   ```bash
   cd backend/latex-qsm  # or latex-vsm
   ```
2. Edit the LaTeX files
3. Build the PDF:
   ```bash
   ./scripts/build-pdfs.sh
   ```
4. Test the new PDF with the application

### 2. Adding New Form Fields

1. Add the field definition in LaTeX:
   ```latex
   \CustomTextFieldDefault{pa-new-field}{}{Field Label}{width=\linewidth}
   ```
2. Update the field mapping in `pdf_field_mapping.py`
3. Add corresponding database fields if needed
4. Rebuild the PDF template

### 3. Testing PDF Generation

```bash
# Test PDF generation in Docker
docker compose exec api python -c "
from pdf_filler import fill_pdf
# Test code here
"
```

## Field Mapping Reference

### Common Fields (Both QSM and VSM)

| LaTeX Field | Database Field | Description |
|-------------|----------------|-------------|
| `pa-applicant-type` | `applicantType` | Person or Institution |
| `pa-institution` | `institution` | Institution name |
| `pa-first-name` | `firstName` | Applicant first name |
| `pa-last-name` | `lastName` | Applicant last name |
| `pa-email` | `email` | Contact email |
| `pa-phone` | `phone` | Phone number |
| `pa-project-name` | `name` | Project name |

### QSM-Specific Fields

| LaTeX Field | Database Field | Description |
|-------------|----------------|-------------|
| `pa-qsm-*` | Various | QSM-specific checkboxes |
| `pa-cost-*` | `costs[].name/amountEur` | Cost positions |

### VSM-Specific Fields

| LaTeX Field | Database Field | Description |
|-------------|----------------|-------------|
| `pa-vsm-*` | Various | VSM-specific fields |
| `pa-financing-*` | `financing.*` | Financing options |

## Troubleshooting

### Common Issues

1. **PDF build fails with XeLaTeX error**
   - Ensure all LaTeX dependencies are installed
   - Check for syntax errors in the `.tex` files
   - Verify that the required fonts are available

2. **Form fields not filling**
   - Check that field names match between the LaTeX source and the mapping
   - Verify that the PDF contains form fields (open it in a PDF reader)
   - Check that data types match the expected format

3. **Git worktree errors**
   - Clean up stale worktree entries: `git worktree prune`
   - Re-run the setup script

### Debugging Commands

```bash
# List text form fields in the PDF
# (get_form_text_fields() covers text fields only; use get_fields() to include checkboxes and choice menus)
docker compose exec api python -c "
import PyPDF2
with open('/app/assets/qsm.pdf', 'rb') as f:
    pdf = PyPDF2.PdfReader(f)
    fields = pdf.get_form_text_fields()
    for name, value in fields.items():
        print(f'{name}: {value}')
"

# Check LaTeX compilation log
cd backend/latex-qsm
cat Main.log
```

## Best Practices

1. **Version Control**
   - Keep LaTeX templates in sync with the main repo
   - Tag releases when updating PDF templates
   - Document field changes in commit messages

2. **Testing**
   - Test both empty and filled PDFs
   - Verify all form fields are accessible
   - Check PDF compatibility across readers

3. **Performance**
   - Pre-compile PDFs rather than generating on demand
   - Cache compiled PDFs in the Docker image
   - Minimize LaTeX package dependencies

## Future Enhancements

1. **Dynamic PDF Generation**
   - Generate PDFs on demand from LaTeX
   - Support custom form layouts
   - Template versioning system

2. **Field Validation**
   - Implement LaTeX-side validation
   - Sync validation rules with the frontend
   - Generate field documentation from LaTeX

3. **Multi-language Support**
   - Internationalize LaTeX templates
   - Support multiple PDF languages
   - Dynamic language selection
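
## Appendix: Field Mapping Sketch

The Field Mapping Reference above pairs each `pa-` form field with a database field. The sketch below shows one way such a lookup could work for the common fields only. It is illustrative: `COMMON_FIELD_MAP` and `map_pdf_fields` are hypothetical names, and the actual contents of `pdf_field_mapping.py` may be organized quite differently. The QSM- and VSM-specific wildcard fields are left out because their exact names are not listed in this guide.

```python
# Hypothetical sketch; not the project's actual pdf_field_mapping.py.
from typing import Dict, List, Optional, Tuple

# PDF form field name -> database field, copied from the "Common Fields" table.
COMMON_FIELD_MAP: Dict[str, str] = {
    "pa-applicant-type": "applicantType",
    "pa-institution": "institution",
    "pa-first-name": "firstName",
    "pa-last-name": "lastName",
    "pa-email": "email",
    "pa-phone": "phone",
    "pa-project-name": "name",
}


def map_pdf_fields(
    pdf_fields: Dict[str, Optional[str]],
) -> Tuple[Dict[str, Optional[str]], List[str]]:
    """Translate extracted PDF form values into database field names.

    Returns the mapped record plus a sorted list of PDF field names that have
    no entry in COMMON_FIELD_MAP, which makes name mismatches easy to spot.
    """
    record: Dict[str, Optional[str]] = {}
    unmapped: List[str] = []
    for pdf_name, value in pdf_fields.items():
        db_name = COMMON_FIELD_MAP.get(pdf_name)
        if db_name is None:
            unmapped.append(pdf_name)
        else:
            record[db_name] = value
    return record, sorted(unmapped)


if __name__ == "__main__":
    # Sample input shaped like the dict returned by PdfReader.get_form_text_fields().
    sample = {
        "pa-first-name": "Ada",
        "pa-project-name": "Workshop",
        "pa-unknown": "x",
    }
    record, unmapped = map_pdf_fields(sample)
    print("record:", record)
    print("unmapped:", unmapped)
```

Returning the unmapped names alongside the record is a deliberate choice here: when form fields are not filling, a non-empty list points directly at names that differ between the LaTeX source and the mapping.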