Add complete Mail Fine-Tuning Web-App for macOS Apple Silicon

Implemented a full-stack web application for fine-tuning LLMs on email data, optimized for Apple Silicon (M4 Pro with 24GB RAM). Features: - Mail import with drag & drop support (.mbox, .eml, .txt) - Automated mail cleaning and preprocessing - Interactive labeling interface with keyboard shortcuts - Training data export to JSONL format - MLX-based LoRA fine-tuning with live updates - Model evaluation and comparison interface - Server-Sent Events for real-time training progress - Dark theme UI optimized for extended use Technical Stack: - Backend: FastAPI with SQLite database - Frontend: Vanilla HTML/CSS/JavaScript (no external dependencies) - ML Framework: MLX for Apple Silicon optimization - Models: Support for Mistral 7B and Llama 3 8B via MLX Components: - data_manager.py: SQLite operations for mail storage and labeling - mail_parser.py: Parser for multiple mail formats with cleaning - training.py: MLX training wrapper with LoRA support - inference.py: Model loading and inference for evaluation - main.py: FastAPI backend with REST API and SSE - Frontend: Complete UI with all features Documentation: - Comprehensive README with installation and usage guide - Quick-start guide for rapid setup - Example mails for testing - Troubleshooting and best practices Ready for local deployment and fine-tuning workflows.
2025-12-03 07:35:35 +00:00
commit 1456995462
20 changed files with 3884 additions and 0 deletions
@@ -0,0 +1,36 @@
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+venv/
+env/
+ENV/
+
+# Data
+data/*.db
+data/*.jsonl
+data/temp/
+
+# Models
+models/*
+!models/.gitkeep
+
+# Training outputs
+output/*
+!output/.gitkeep
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# OS
+.DS_Store
+Thumbs.db
+
+# Logs
+*.log
@@ -0,0 +1,209 @@
+# Quick Start Guide
+
+Schnellstart-Anleitung für die Mail Fine-Tuning App.
+
+## 1. Installation (5 Minuten)
+
+```bash
+# 1. Virtual Environment erstellen
+python3 -m venv venv
+source venv/bin/activate
+
+# 2. Dependencies installieren
+pip install -r requirements.txt
+
+# 3. Modell herunterladen (ca. 4GB, dauert je nach Internetverbindung)
+huggingface-cli download mlx-community/Mistral-7B-Instruct-v0.3-4bit \
+    --local-dir models/Mistral-7B-Instruct-v0.3-4bit
+```
+
+## 2. Server starten
+
+```bash
+./start.sh
+```
+
+Oder manuell:
+
+```bash
+source venv/bin/activate
+cd backend
+python main.py
+```
+
+App öffnen: **http://localhost:8000**
+
+## 3. Erste Schritte (10 Minuten)
+
+### Schritt 1: Test-Mails erstellen
+
+Erstelle eine Datei `test.txt` mit einer Beispiel-Mail:
+
+```
+Subject: Projekt Update
+From: max@example.com
+To: team@example.com
+
+Hallo Team,
+
+das neue Feature ist fertig und bereit für Testing.
+Ich habe die API-Integration abgeschlossen und alle Tests laufen durch.
+
+Bitte reviewt den Code bis Freitag.
+
+Grüße
+Max
+```
+
+### Schritt 2: Mails importieren
+
+1. Öffne http://localhost:8000
+2. Ziehe `test.txt` in den Upload-Bereich
+3. Mail erscheint in der Liste
+
+### Schritt 3: Erste Mail labeln
+
+1. Klicke auf "Labeling" in der Sidebar
+2. Wähle **Aufgabentyp**: "Zusammenfassen"
+3. Gib **erwarteten Output** ein:
+   ```
+   Max hat das neue Feature fertiggestellt und alle Tests sind erfolgreich.
+   Das Team soll den Code bis Freitag reviewen.
+   ```
+4. Klicke "Speichern" (oder drücke `S`)
+
+### Schritt 4: Mehr Mails labeln
+
+- Erstelle mindestens **20-50 Beispiel-Mails**
+- Nutze verschiedene Typen:
+  - Zusammenfassen
+  - Antwort schreiben
+  - Action Items extrahieren
+- Nutze Shortcuts: `N` (Nächste), `S` (Speichern)
+
+### Schritt 5: Statistiken prüfen
+
+1. Gehe zu "Export & Stats"
+2. Prüfe:
+   - Mind. 50 gelabelte Mails? ✅
+   - Gute Verteilung der Task-Types? ✅
+
+### Schritt 6: Training starten
+
+1. Gehe zu "Training"
+2. Wähle dein Modell aus
+3. Nutze Standard-Einstellungen:
+   - Learning Rate: 1e-5
+   - Epochs: 3
+   - Batch Size: 4
+   - LoRA Rank: 8
+4. Klicke "Training starten"
+5. Beobachte Live-Updates
+
+⏱️ **Training dauert**: Ca. 5-10 Minuten bei 50 Beispielen
+
+### Schritt 7: Modell testen
+
+1. Gehe zu "Evaluation"
+2. Klicke "Test-Beispiel laden"
+3. Klicke "Vergleich starten"
+4. Vergleiche Base- vs. Fine-tuned-Ausgabe
+
+## Tipps
+
+### Gute Trainingsdaten
+
+✅ **DO**:
+- Mindestens 50 Beispiele
+- Konsistenter Output-Stil
+- Diverse Mail-Typen
+- Klare, eindeutige Labels
+
+❌ **DON'T**:
+- Zu wenige Beispiele (<20)
+- Widersprüchliche Labels
+- Nur sehr ähnliche Mails
+- Zu lange Outputs (>500 Wörter)
+
+### Training-Parameter
+
+Für **erste Versuche**:
+- Learning Rate: **1e-5**
+- Epochs: **3**
+- Batch Size: **4**
+- LoRA Rank: **8**
+
+Bei **Overfitting** (Val Loss steigt):
+- Learning Rate: **5e-6** (niedriger)
+- Epochs: **2** (weniger)
+
+Bei **Underfitting** (beide Losses hoch):
+- Epochs: **5** (mehr)
+- LoRA Rank: **16** (höher)
+- Mehr Daten sammeln!
+
+### Keyboard Shortcuts
+
+Im Labeling-Interface:
+- `N` - Nächste Mail
+- `S` - Speichern
+- `K` - Skip (Überspringen)
+
+## Troubleshooting
+
+### Server startet nicht
+
+```bash
+# Prüfe Python-Version (mind. 3.10)
+python3 --version
+
+# Prüfe ob Port 8000 frei ist
+lsof -i :8000
+
+# Nutze anderen Port
+uvicorn main:app --port 8001
+```
+
+### Modell nicht gefunden
+
+```bash
+# Prüfe ob Modell existiert
+ls -la models/
+
+# Download nochmal versuchen
+huggingface-cli download mlx-community/Mistral-7B-Instruct-v0.3-4bit \
+    --local-dir models/Mistral-7B-Instruct-v0.3-4bit
+```
+
+### Out of Memory
+
+Reduziere Batch Size:
+1. Gehe zu "Training"
+2. Setze Batch Size auf **2** oder **1**
+
+### Training sehr langsam
+
+- Nutze 4-bit quantisierte Modelle
+- Reduziere Batch Size
+- Schließe andere Programme
+
+## Nächste Schritte
+
+Nach erfolgreichem ersten Training:
+
+1. **Mehr Daten sammeln**: 100+ Beispiele für bessere Ergebnisse
+2. **Parameter tunen**: Experimentiere mit Learning Rate und Epochs
+3. **Verschiedene Tasks**: Probiere alle Task-Types aus
+4. **Evaluation**: Teste ausgiebig mit neuen Mails
+
+## Ressourcen
+
+- Vollständige Doku: [README.md](README.md)
+- MLX Doku: https://ml-explore.github.io/mlx/
+- MLX-LM: https://github.com/ml-explore/mlx-examples
+
+---
+
+**Viel Erfolg! 🚀**
+
+Bei Fragen schaue ins vollständige README oder die API-Dokumentation.
@@ -0,0 +1,326 @@
+# Mail Fine-Tuning Web-App für macOS (Apple Silicon)
+
+Eine vollständige lokale Web-Anwendung für das Fine-Tuning von LLMs auf Mail-Daten, optimiert für Apple Silicon (M4 Pro mit 24GB RAM).
+
+## Features
+
+- 📥 **Mail Import**: Drag & Drop Upload von .mbox, .eml, .txt Dateien mit automatischer Bereinigung
+- 🏷️ **Labeling Interface**: Komfortable UI zum manuellen Labeln von Mails
+- 📊 **Export & Statistiken**: JSONL Export für Training mit detaillierten Statistiken
+- 🤖 **Modell-Management**: Verwaltung von MLX-Modellen
+- 🎯 **Training**: LoRA Fine-Tuning mit Live-Updates und Visualisierung
+- 🧪 **Evaluation**: Chat-Interface mit Vergleichsmodus (Base vs. Fine-tuned)
+
+## Technologie-Stack
+
+- **Backend**: Python (FastAPI)
+- **Frontend**: HTML/CSS/JavaScript (Vanilla, keine Dependencies)
+- **ML Framework**: MLX (Apple Silicon optimiert)
+- **Database**: SQLite
+- **Empfohlene Modelle**: Mistral 7B, Llama 3 8B (via MLX)
+
+## Projektstruktur
+
+```
+mail-finetuning/
+├── backend/
+│   ├── main.py              # FastAPI App
+│   ├── mail_parser.py       # Mail Import & Bereinigung
+│   ├── data_manager.py      # SQLite Operationen
+│   ├── training.py          # MLX Training Wrapper
+│   └── inference.py         # Modell-Inferenz
+├── frontend/
+│   ├── index.html
+│   ├── style.css
+│   └── app.js
+├── data/
+│   ├── mails.db             # SQLite Datenbank
+│   ├── train.jsonl
+│   └── val.jsonl
+├── models/                  # Heruntergeladene Modelle
+├── output/                  # Trainierte Adapter
+└── requirements.txt
+```
+
+## Installation
+
+### Voraussetzungen
+
+- macOS mit Apple Silicon (M1/M2/M3/M4)
+- Python 3.10 oder höher
+- mindestens 16GB RAM (24GB empfohlen)
+
+### 1. Repository Setup
+
+```bash
+cd training
+```
+
+### 2. Virtual Environment erstellen
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+```
+
+### 3. Dependencies installieren
+
+```bash
+pip install -r requirements.txt
+```
+
+### 4. Modell herunterladen
+
+Wähle ein MLX-optimiertes Modell von Hugging Face:
+
+```bash
+# Mistral 7B (4-bit quantisiert, ~4GB)
+huggingface-cli download mlx-community/Mistral-7B-Instruct-v0.3-4bit \
+    --local-dir models/Mistral-7B-Instruct-v0.3-4bit
+
+# ODER Llama 3 8B (4-bit quantisiert, ~5GB)
+huggingface-cli download mlx-community/Meta-Llama-3-8B-Instruct-4bit \
+    --local-dir models/Meta-Llama-3-8B-Instruct-4bit
+```
+
+**Hinweis**: Die 4-bit Versionen sind für 24GB RAM optimal. Für mehr RAM können auch größere Versionen genutzt werden.
+
+## Nutzung
+
+### 1. Server starten
+
+```bash
+cd backend
+python main.py
+```
+
+Die App ist dann verfügbar unter: **http://localhost:8000**
+
+### 2. Workflow
+
+#### Schritt 1: Mails importieren
+
+1. Gehe zu "Mail Import"
+2. Ziehe .eml, .mbox oder .txt Dateien per Drag & Drop in den Upload-Bereich
+3. Die Mails werden automatisch geparst und bereinigt
+
+#### Schritt 2: Mails labeln
+
+1. Wechsle zu "Labeling"
+2. Für jede Mail:
+   - Wähle den **Aufgabentyp** (Zusammenfassen, Antwort schreiben, etc.)
+   - Gib den **erwarteten Output** ein
+   - Klicke "Speichern" oder nutze Shortcut `S`
+3. Nutze Shortcuts: `N` (Nächste), `S` (Speichern), `K` (Skip)
+
+**Tipp**: Mindestens 50 gelabelte Beispiele für gutes Fine-Tuning!
+
+#### Schritt 3: Daten exportieren
+
+1. Gehe zu "Export & Stats"
+2. Prüfe die Statistiken (mind. 50 gelabelte Mails empfohlen)
+3. Klicke "JSONL generieren"
+4. Optional: Download der JSONL-Dateien zur Archivierung
+
+#### Schritt 4: Training starten
+
+1. Wechsle zu "Training"
+2. Konfiguriere Parameter:
+   - **Modell**: Wähle heruntergeladenes Modell
+   - **Learning Rate**: Standard 1e-5 (bei Overfitting niedriger)
+   - **Epochs**: 3-5 für erste Versuche
+   - **Batch Size**: 4 (bei 24GB RAM sicher)
+   - **LoRA Rank**: 8-16 (höher = mehr Kapazität, mehr RAM)
+3. Klicke "Training starten"
+4. Beobachte Live-Updates:
+   - Training/Validation Loss
+   - Fortschritt und ETA
+   - Speichernutzung
+
+**Warnung bei Overfitting**: Wenn Validation Loss steigt während Training Loss sinkt, Training abbrechen!
+
+#### Schritt 5: Modell testen
+
+1. Gehe zu "Evaluation"
+2. Wähle Task-Type und gib Mail-Text ein
+3. Klicke "Vergleich starten"
+4. Sieh dir die Ausgaben von Base- und Fine-tuned-Modell an
+
+### 3. Export des fertigen Modells
+
+Nach erfolgreichem Training liegen die LoRA-Adapter in `output/run_[timestamp]/adapters.npz`.
+
+Um das Modell zu nutzen:
+
+```python
+from mlx_lm import load
+
+model = load(
+    "models/Mistral-7B-Instruct-v0.3-4bit",
+    adapter_path="output/run_1234567890/adapters.npz"
+)
+```
+
+## API Endpoints
+
+### Mails
+
+- `POST /api/mails/upload` - Mails hochladen
+- `GET /api/mails` - Alle Mails abrufen
+- `GET /api/mails/{id}` - Einzelne Mail
+- `PUT /api/mails/{id}` - Mail aktualisieren (Labeling)
+- `DELETE /api/mails/{id}` - Mail löschen
+
+### Export
+
+- `GET /api/export/stats` - Statistiken
+- `POST /api/export/jsonl` - Training-Daten generieren
+- `GET /api/export/download/{train|val}` - JSONL herunterladen
+
+### Modelle
+
+- `GET /api/models` - Verfügbare Modelle
+- `POST /api/models/download` - Modell herunterladen (Placeholder)
+
+### Training
+
+- `POST /api/training/start` - Training starten
+- `POST /api/training/stop` - Training stoppen
+- `GET /api/training/status` - Status abrufen
+- `GET /api/training/stream` - SSE Stream für Live-Updates
+
+### Inference
+
+- `POST /api/inference/load` - Modell laden
+- `GET /api/inference/loaded` - Geladene Modelle
+- `POST /api/inference/generate` - Text generieren
+- `POST /api/inference/compare` - Modell-Vergleich
+- `GET /api/inference/test-prompts` - Test-Prompts
+
+## Tipps & Best Practices
+
+### Datenqualität
+
+- **Mindestens 50 Beispiele** pro Task-Type
+- **Einheitlicher Output-Stil**: Achte auf konsistente Formatierung
+- **Diverse Beispiele**: Verschiedene Mail-Längen und Stile
+- **Klare Labels**: Vermeide mehrdeutige oder widersprüchliche Labels
+
+### Training
+
+- **Learning Rate**:
+  - 1e-5 für die meisten Fälle
+  - 5e-6 bei Overfitting
+  - 1e-4 bei sehr kleinem Datensatz (Vorsicht!)
+
+- **Epochs**:
+  - 3 Epochs für Start
+  - Mehr Epochs wenn Loss noch sinkt
+  - Weniger wenn Overfitting auftritt
+
+- **LoRA Rank**:
+  - 8 für einfache Tasks
+  - 16-32 für komplexe Tasks
+  - Höher = mehr Kapazität aber mehr RAM
+
+### Overfitting erkennen
+
+Zeichen von Overfitting:
+- ✅ Training Loss sinkt kontinuierlich
+- ❌ Validation Loss steigt oder stagniert
+- ❌ Modell "memoriert" exakte Trainingsbeispiele
+
+Lösungen:
+- Mehr Daten sammeln
+- Kleinere Learning Rate
+- Weniger Epochs
+- Niedrigere LoRA Rank
+
+## Troubleshooting
+
+### "Out of Memory" Fehler
+
+- Reduziere Batch Size (4 → 2 → 1)
+- Nutze kleineres Modell (4-bit quantisiert)
+- Schließe andere Programme
+
+### Training sehr langsam
+
+- Prüfe ob Metal Performance Shaders aktiv sind
+- Nutze 4-bit quantisierte Modelle
+- Reduziere max_seq_length (Standard: 2048)
+
+### Modell gibt schlechte Ergebnisse
+
+- Mehr/bessere Trainingsdaten
+- Längeres Training (mehr Epochs)
+- Höhere LoRA Rank
+- Prüfe Prompt-Format
+
+## Wichtige Hinweise
+
+### MLX Training Loop
+
+**WICHTIG**: Die aktuelle Implementierung in `training.py` enthält eine **simulierte Training Loop**. Für produktiven Einsatz muss diese durch echtes MLX Training ersetzt werden:
+
+```python
+# Beispiel für echtes MLX Training mit mlx-lm
+from mlx_lm.tuner import train
+
+train(
+    model_path=str(model_path),
+    data_path=str(train_file),
+    val_data_path=str(val_file),
+    adapter_file=str(output_path / 'adapters.npz'),
+    iters=total_steps,
+    learning_rate=config.learning_rate,
+    batch_size=config.batch_size,
+    # ... weitere Parameter
+)
+```
+
+Siehe [mlx-lm Dokumentation](https://github.com/ml-explore/mlx-examples/tree/main/llms) für Details.
+
+### Inference
+
+Die Inference-Implementation in `inference.py` nutzt `mlx_lm.generate()`. Stelle sicher, dass das richtige Prompt-Format für dein Modell genutzt wird (z.B. ChatML, Llama-Format, etc.).
+
+## Entwicklung
+
+### Debug-Modus
+
+```bash
+uvicorn main:app --reload --log-level debug
+```
+
+### Tests (TODO)
+
+```bash
+pytest tests/
+```
+
+## Lizenz
+
+MIT License
+
+## Support
+
+Bei Problemen:
+1. Prüfe die Browser Console (F12) für Frontend-Fehler
+2. Prüfe die Server-Logs für Backend-Fehler
+3. Stelle sicher, dass alle Dependencies installiert sind
+4. Prüfe, dass MLX korrekt auf Apple Silicon läuft
+
+## Roadmap
+
+- [ ] Echte MLX Training Loop implementieren
+- [ ] Automatisches Checkpoint-Management
+- [ ] Model Merging (Base + Adapter zusammenführen)
+- [ ] Export für Deployment
+- [ ] Batch-Inference
+- [ ] Tests
+- [ ] Docker Support
+
+---
+
+**Viel Erfolg beim Fine-Tuning! 🚀**
@@ -0,0 +1,286 @@
+"""
+Data Manager für Mail Fine-Tuning App
+Verwaltet SQLite Datenbank für Mails und Labels
+"""
+
+import sqlite3
+import json
+from datetime import datetime
+from typing import List, Dict, Optional
+from pathlib import Path
+
+
+class DataManager:
+    def __init__(self, db_path: str = "data/mails.db"):
+        self.db_path = Path(db_path)
+        self.db_path.parent.mkdir(parents=True, exist_ok=True)
+        self.init_db()
+
+    def init_db(self):
+        """Initialisiert die Datenbank mit dem Schema"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+
+        cursor.execute("""
+            CREATE TABLE IF NOT EXISTS mails (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                subject TEXT,
+                sender TEXT,
+                recipient TEXT,
+                date TEXT,
+                body TEXT NOT NULL,
+                original_format TEXT,
+                task_type TEXT DEFAULT 'unlabeled',
+                expected_output TEXT,
+                status TEXT DEFAULT 'unlabeled',
+                created_at TEXT DEFAULT CURRENT_TIMESTAMP,
+                updated_at TEXT DEFAULT CURRENT_TIMESTAMP
+            )
+        """)
+
+        cursor.execute("""
+            CREATE TABLE IF NOT EXISTS training_runs (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                model_name TEXT NOT NULL,
+                start_time TEXT,
+                end_time TEXT,
+                config TEXT,
+                status TEXT,
+                final_train_loss REAL,
+                final_val_loss REAL,
+                checkpoint_path TEXT
+            )
+        """)
+
+        conn.commit()
+        conn.close()
+
+    def add_mail(self, subject: str, sender: str, recipient: str,
+                 date: str, body: str, original_format: str) -> int:
+        """Fügt eine neue Mail hinzu"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+
+        cursor.execute("""
+            INSERT INTO mails (subject, sender, recipient, date, body, original_format)
+            VALUES (?, ?, ?, ?, ?, ?)
+        """, (subject, sender, recipient, date, body, original_format))
+
+        mail_id = cursor.lastrowid
+        conn.commit()
+        conn.close()
+
+        return mail_id
+
+    def get_all_mails(self, status_filter: Optional[str] = None) -> List[Dict]:
+        """Holt alle Mails, optional gefiltert nach Status"""
+        conn = sqlite3.connect(self.db_path)
+        conn.row_factory = sqlite3.Row
+        cursor = conn.cursor()
+
+        if status_filter:
+            cursor.execute("SELECT * FROM mails WHERE status = ? ORDER BY id", (status_filter,))
+        else:
+            cursor.execute("SELECT * FROM mails ORDER BY id")
+
+        rows = cursor.fetchall()
+        mails = [dict(row) for row in rows]
+
+        conn.close()
+        return mails
+
+    def get_mail(self, mail_id: int) -> Optional[Dict]:
+        """Holt eine einzelne Mail"""
+        conn = sqlite3.connect(self.db_path)
+        conn.row_factory = sqlite3.Row
+        cursor = conn.cursor()
+
+        cursor.execute("SELECT * FROM mails WHERE id = ?", (mail_id,))
+        row = cursor.fetchone()
+
+        conn.close()
+        return dict(row) if row else None
+
+    def update_mail(self, mail_id: int, task_type: Optional[str] = None,
+                   expected_output: Optional[str] = None,
+                   status: Optional[str] = None,
+                   body: Optional[str] = None) -> bool:
+        """Aktualisiert eine Mail (Labeling)"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+
+        updates = []
+        params = []
+
+        if task_type is not None:
+            updates.append("task_type = ?")
+            params.append(task_type)
+
+        if expected_output is not None:
+            updates.append("expected_output = ?")
+            params.append(expected_output)
+
+        if status is not None:
+            updates.append("status = ?")
+            params.append(status)
+
+        if body is not None:
+            updates.append("body = ?")
+            params.append(body)
+
+        if not updates:
+            conn.close()
+            return False
+
+        updates.append("updated_at = ?")
+        params.append(datetime.now().isoformat())
+        params.append(mail_id)
+
+        query = f"UPDATE mails SET {', '.join(updates)} WHERE id = ?"
+        cursor.execute(query, params)
+
+        success = cursor.rowcount > 0
+        conn.commit()
+        conn.close()
+
+        return success
+
+    def delete_mail(self, mail_id: int) -> bool:
+        """Löscht eine Mail"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+
+        cursor.execute("DELETE FROM mails WHERE id = ?", (mail_id,))
+        success = cursor.rowcount > 0
+
+        conn.commit()
+        conn.close()
+
+        return success
+
+    def get_statistics(self) -> Dict:
+        """Berechnet Statistiken über die Daten"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+
+        # Gesamt-Anzahl
+        cursor.execute("SELECT COUNT(*) FROM mails")
+        total = cursor.fetchone()[0]
+
+        # Nach Status
+        cursor.execute("""
+            SELECT status, COUNT(*) as count
+            FROM mails
+            GROUP BY status
+        """)
+        status_counts = {row[0]: row[1] for row in cursor.fetchall()}
+
+        # Nach Task-Type
+        cursor.execute("""
+            SELECT task_type, COUNT(*) as count
+            FROM mails
+            WHERE status = 'labeled'
+            GROUP BY task_type
+        """)
+        task_counts = {row[0]: row[1] for row in cursor.fetchall()}
+
+        # Durchschnittliche Längen (nur gelabelte)
+        cursor.execute("""
+            SELECT
+                AVG(LENGTH(body)) as avg_input_length,
+                AVG(LENGTH(expected_output)) as avg_output_length
+            FROM mails
+            WHERE status = 'labeled'
+        """)
+        lengths = cursor.fetchone()
+
+        conn.close()
+
+        labeled_count = status_counts.get('labeled', 0)
+
+        return {
+            'total': total,
+            'labeled': labeled_count,
+            'unlabeled': status_counts.get('unlabeled', 0),
+            'skipped': status_counts.get('skip', 0),
+            'task_distribution': task_counts,
+            'avg_input_length': round(lengths[0]) if lengths[0] else 0,
+            'avg_output_length': round(lengths[1]) if lengths[1] else 0,
+            'sufficient_data': labeled_count >= 50
+        }
+
+    def export_training_data(self, train_split: float = 0.9) -> tuple[List[Dict], List[Dict]]:
+        """Exportiert gelabelte Daten für Training"""
+        import random
+
+        conn = sqlite3.connect(self.db_path)
+        conn.row_factory = sqlite3.Row
+        cursor = conn.cursor()
+
+        cursor.execute("""
+            SELECT body, task_type, expected_output
+            FROM mails
+            WHERE status = 'labeled' AND expected_output IS NOT NULL
+            ORDER BY RANDOM()
+        """)
+
+        rows = cursor.fetchall()
+        conn.close()
+
+        if not rows:
+            return [], []
+
+        data = [dict(row) for row in rows]
+
+        # Shuffle
+        random.shuffle(data)
+
+        # Split
+        split_idx = int(len(data) * train_split)
+        train_data = data[:split_idx]
+        val_data = data[split_idx:]
+
+        return train_data, val_data
+
+    def save_training_run(self, model_name: str, config: Dict,
+                         checkpoint_path: str) -> int:
+        """Speichert einen Training-Run"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+
+        cursor.execute("""
+            INSERT INTO training_runs
+            (model_name, start_time, config, status, checkpoint_path)
+            VALUES (?, ?, ?, ?, ?)
+        """, (
+            model_name,
+            datetime.now().isoformat(),
+            json.dumps(config),
+            'running',
+            checkpoint_path
+        ))
+
+        run_id = cursor.lastrowid
+        conn.commit()
+        conn.close()
+
+        return run_id
+
+    def update_training_run(self, run_id: int, status: str,
+                          train_loss: Optional[float] = None,
+                          val_loss: Optional[float] = None):
+        """Aktualisiert einen Training-Run"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+
+        cursor.execute("""
+            UPDATE training_runs
+            SET status = ?,
+                end_time = ?,
+                final_train_loss = COALESCE(?, final_train_loss),
+                final_val_loss = COALESCE(?, final_val_loss)
+            WHERE id = ?
+        """, (status, datetime.now().isoformat(), train_loss, val_loss, run_id))
+
+        conn.commit()
+        conn.close()
@@ -0,0 +1,209 @@
+"""
+Inference Module für Modell-Evaluation
+Lädt Base- und Fine-tuned Models für Vergleiche
+"""
+
+from pathlib import Path
+from typing import Optional, Dict
+import threading
+
+
+class ModelInference:
+    """Handhabt Modell-Inferenz für Base und Fine-tuned Models"""
+
+    def __init__(self, models_dir: str = "models", output_dir: str = "output"):
+        self.models_dir = Path(models_dir)
+        self.output_dir = Path(output_dir)
+
+        self.base_model = None
+        self.finetuned_model = None
+        self.model_lock = threading.Lock()
+
+    def load_base_model(self, model_name: str) -> bool:
+        """Lädt das Basis-Modell"""
+        try:
+            # Import MLX nur bei Bedarf
+            from mlx_lm import load
+
+            model_path = self.models_dir / model_name
+
+            if not model_path.exists():
+                return False
+
+            with self.model_lock:
+                self.base_model = load(str(model_path))
+
+            return True
+
+        except Exception as e:
+            print(f"Error loading base model: {e}")
+            return False
+
+    def load_finetuned_model(self, model_name: str, adapter_path: str) -> bool:
+        """Lädt das Fine-tuned Modell (Base + LoRA Adapter)"""
+        try:
+            from mlx_lm import load
+
+            model_path = self.models_dir / model_name
+            adapter_file = Path(adapter_path)
+
+            if not model_path.exists() or not adapter_file.exists():
+                return False
+
+            with self.model_lock:
+                # Lade Base Model mit Adapter
+                self.finetuned_model = load(
+                    str(model_path),
+                    adapter_path=str(adapter_file)
+                )
+
+            return True
+
+        except Exception as e:
+            print(f"Error loading finetuned model: {e}")
+            return False
+
+    def generate(self, prompt: str, model_type: str = 'base',
+                max_tokens: int = 512, temperature: float = 0.7) -> str:
+        """
+        Generiert Text mit dem gewählten Modell
+
+        Args:
+            prompt: Input prompt
+            model_type: 'base' oder 'finetuned'
+            max_tokens: Maximale Anzahl Tokens
+            temperature: Sampling temperature
+
+        Returns:
+            Generierter Text
+        """
+        try:
+            from mlx_lm import generate as mlx_generate
+
+            model = self.base_model if model_type == 'base' else self.finetuned_model
+
+            if model is None:
+                return f"Error: {model_type} model not loaded"
+
+            with self.model_lock:
+                # MLX-LM generate
+                result = mlx_generate(
+                    model,
+                    prompt=prompt,
+                    max_tokens=max_tokens,
+                    temp=temperature
+                )
+
+            return result
+
+        except Exception as e:
+            return f"Error during generation: {str(e)}"
+
+    def generate_comparison(self, prompt: str, max_tokens: int = 512,
+                          temperature: float = 0.7) -> Dict[str, str]:
+        """
+        Generiert mit beiden Modellen für Vergleich
+
+        Returns:
+            Dict mit 'base' und 'finetuned' Outputs
+        """
+        result = {
+            'base': None,
+            'finetuned': None
+        }
+
+        if self.base_model:
+            result['base'] = self.generate(
+                prompt, 'base', max_tokens, temperature
+            )
+
+        if self.finetuned_model:
+            result['finetuned'] = self.generate(
+                prompt, 'finetuned', max_tokens, temperature
+            )
+
+        return result
+
+    def format_mail_prompt(self, task_type: str, mail_body: str) -> str:
+        """Formatiert einen Prompt basierend auf Task-Type"""
+
+        task_prompts = {
+            'Zusammenfassen': 'Fasse folgende E-Mail zusammen:',
+            'Antwort schreiben': 'Schreibe eine Antwort auf folgende E-Mail:',
+            'Kategorisieren': 'Kategorisiere folgende E-Mail:',
+            'Action Items': 'Extrahiere die Action Items aus folgender E-Mail:',
+            'Custom': 'Bearbeite folgende E-Mail:'
+        }
+
+        instruction = task_prompts.get(task_type, task_prompts['Custom'])
+
+        return f"{instruction}\n\n{mail_body}"
+
+    def get_test_prompts(self) -> Dict[str, str]:
+        """Vordefinierte Test-Prompts"""
+        return {
+            'Zusammenfassen': self.format_mail_prompt(
+                'Zusammenfassen',
+                """Betreff: Q4 Projektupdate
+
+Hallo Team,
+
+ich wollte euch ein kurzes Update zum aktuellen Projektstand geben.
+
+Wir haben letzte Woche die neue API-Integration abgeschlossen und erfolgreich getestet.
+Die Performance-Tests zeigen eine Verbesserung von 40% gegenüber der alten Implementierung.
+
+Nächste Woche starten wir mit der Frontend-Anpassung. Maria und Tom werden das Design
+überarbeiten, während ich mich um die Backend-Anbindung kümmere.
+
+Der Go-Live ist weiterhin für Ende des Monats geplant.
+
+Beste Grüße
+Alex"""
+            ),
+            'Antwort schreiben': self.format_mail_prompt(
+                'Antwort schreiben',
+                """Betreff: Frage zu Invoice #2847
+
+Hallo,
+
+ich habe eine Frage zur Rechnung #2847 vom 15. März.
+Der Betrag scheint nicht mit unserem Angebot übereinzustimmen.
+
+Könnten Sie das bitte prüfen?
+
+Danke
+Michael"""
+            ),
+            'Action Items': self.format_mail_prompt(
+                'Action Items',
+                """Betreff: Meeting Notes - Produktlaunch
+
+Hi alle,
+
+hier die wichtigsten Punkte vom heutigen Meeting:
+
+- Sarah bereitet die Pressemitteilung vor (Deadline: Freitag)
+- Marketing-Team erstellt Social Media Content (nächste Woche)
+- Ich kümmere mich um die Influencer-Kontakte
+- Wir brauchen noch finale Produktfotos vom Design-Team
+- Launch-Event ist am 1. April - Location muss noch gebucht werden
+
+Bitte gebt bis Mittwoch Bescheid ob ihr eure Aufgaben schaffen könnt.
+
+Lisa"""
+            )
+        }
+
+    def unload_models(self):
+        """Entlädt Modelle aus dem Speicher"""
+        with self.model_lock:
+            self.base_model = None
+            self.finetuned_model = None
+
+    def get_loaded_models(self) -> Dict[str, bool]:
+        """Gibt zurück welche Modelle geladen sind"""
+        return {
+            'base': self.base_model is not None,
+            'finetuned': self.finetuned_model is not None
+        }
@@ -0,0 +1,264 @@
+"""
+Mail Parser für verschiedene Formate
+Bereinigt und normalisiert Mail-Inhalte
+"""
+
+import email
+import mailbox
+import re
+from bs4 import BeautifulSoup
+from typing import List, Dict, Optional
+from pathlib import Path
+import chardet
+
+
+class MailParser:
+    """Parst und bereinigt Mail-Dateien"""
+
+    # Häufige Footer/Disclaimer Pattern
+    FOOTER_PATTERNS = [
+        r'(?i)^--\s*$.*',  # Standard signature delimiter
+        r'(?i)Diese E-Mail.*vertraulich.*',
+        r'(?i)This email.*confidential.*',
+        r'(?i)Disclaimer:.*',
+        r'(?i)Get Outlook for.*',
+        r'(?i)Sent from my iPhone.*',
+        r'(?i)Von meinem.*gesendet.*',
+        r'(?i)Diese Nachricht.*Virenfrei.*',
+    ]
+
+    @staticmethod
+    def detect_encoding(file_path: Path) -> str:
+        """Erkennt das Encoding einer Datei"""
+        with open(file_path, 'rb') as f:
+            raw_data = f.read()
+            result = chardet.detect(raw_data)
+            return result['encoding'] or 'utf-8'
+
+    @staticmethod
+    def html_to_text(html: str) -> str:
+        """Konvertiert HTML zu Plain Text"""
+        soup = BeautifulSoup(html, 'html.parser')
+
+        # Entferne Script und Style Tags
+        for script in soup(['script', 'style']):
+            script.decompose()
+
+        # Extrahiere Text
+        text = soup.get_text()
+
+        # Bereinige Whitespace
+        lines = (line.strip() for line in text.splitlines())
+        chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
+        text = ' '.join(chunk for chunk in chunks if chunk)
+
+        return text
+
+    @staticmethod
+    def remove_multiple_newlines(text: str) -> str:
+        """Entfernt mehrfache Leerzeilen"""
+        return re.sub(r'\n{3,}', '\n\n', text)
+
+    @staticmethod
+    def remove_footers(text: str) -> str:
+        """Entfernt häufige Footer und Disclaimer"""
+        for pattern in MailParser.FOOTER_PATTERNS:
+            # Suche Pattern und entferne alles danach
+            match = re.search(pattern, text, re.MULTILINE | re.DOTALL)
+            if match:
+                text = text[:match.start()].strip()
+
+        return text
+
+    @staticmethod
+    def clean_quoted_text(text: str) -> str:
+        """Entfernt oder markiert quoted Text (> oder |)"""
+        lines = text.split('\n')
+        cleaned_lines = []
+
+        for line in lines:
+            # Überspringe Zeilen die mit > oder | beginnen (quoted text)
+            if not line.strip().startswith('>') and not line.strip().startswith('|'):
+                cleaned_lines.append(line)
+
+        return '\n'.join(cleaned_lines)
+
+    @staticmethod
+    def normalize_whitespace(text: str) -> str:
+        """Normalisiert Whitespace"""
+        # Entferne trailing spaces
+        lines = [line.rstrip() for line in text.split('\n')]
+        text = '\n'.join(lines)
+
+        # Entferne mehrfache Spaces
+        text = re.sub(r' {2,}', ' ', text)
+
+        # Entferne mehrfache Leerzeilen
+        text = MailParser.remove_multiple_newlines(text)
+
+        return text.strip()
+
+    @staticmethod
+    def clean_text(text: str, is_html: bool = False) -> str:
+        """Vollständige Bereinigung eines Texts"""
+        if is_html:
+            text = MailParser.html_to_text(text)
+
+        text = MailParser.remove_footers(text)
+        text = MailParser.clean_quoted_text(text)
+        text = MailParser.normalize_whitespace(text)
+
+        return text
+
+    @staticmethod
+    def parse_eml(file_path: Path) -> Dict:
+        """Parst eine .eml Datei"""
+        encoding = MailParser.detect_encoding(file_path)
+
+        with open(file_path, 'r', encoding=encoding, errors='ignore') as f:
+            msg = email.message_from_file(f)
+
+        subject = msg.get('Subject', 'No Subject')
+        sender = msg.get('From', 'Unknown')
+        recipient = msg.get('To', 'Unknown')
+        date = msg.get('Date', '')
+
+        # Body extrahieren
+        body = ""
+        is_html = False
+
+        if msg.is_multipart():
+            for part in msg.walk():
+                content_type = part.get_content_type()
+                if content_type == 'text/plain':
+                    body = part.get_payload(decode=True).decode(errors='ignore')
+                    break
+                elif content_type == 'text/html' and not body:
+                    body = part.get_payload(decode=True).decode(errors='ignore')
+                    is_html = True
+        else:
+            body = msg.get_payload(decode=True).decode(errors='ignore')
+            if msg.get_content_type() == 'text/html':
+                is_html = True
+
+        # Bereinige Body
+        body = MailParser.clean_text(body, is_html)
+
+        return {
+            'subject': subject,
+            'sender': sender,
+            'recipient': recipient,
+            'date': date,
+            'body': body,
+            'original_format': 'eml'
+        }
+
+    @staticmethod
+    def parse_mbox(file_path: Path) -> List[Dict]:
+        """Parst eine .mbox Datei"""
+        mails = []
+
+        try:
+            mbox = mailbox.mbox(str(file_path))
+
+            for message in mbox:
+                subject = message.get('Subject', 'No Subject')
+                sender = message.get('From', 'Unknown')
+                recipient = message.get('To', 'Unknown')
+                date = message.get('Date', '')
+
+                body = ""
+                is_html = False
+
+                if message.is_multipart():
+                    for part in message.walk():
+                        content_type = part.get_content_type()
+                        if content_type == 'text/plain':
+                            payload = part.get_payload(decode=True)
+                            if payload:
+                                body = payload.decode(errors='ignore')
+                            break
+                        elif content_type == 'text/html' and not body:
+                            payload = part.get_payload(decode=True)
+                            if payload:
+                                body = payload.decode(errors='ignore')
+                                is_html = True
+                else:
+                    payload = message.get_payload(decode=True)
+                    if payload:
+                        body = payload.decode(errors='ignore')
+                        if message.get_content_type() == 'text/html':
+                            is_html = True
+
+                body = MailParser.clean_text(body, is_html)
+
+                mails.append({
+                    'subject': subject,
+                    'sender': sender,
+                    'recipient': recipient,
+                    'date': date,
+                    'body': body,
+                    'original_format': 'mbox'
+                })
+
+        except Exception as e:
+            raise Exception(f"Error parsing mbox: {str(e)}")
+
+        return mails
+
+    @staticmethod
+    def parse_txt(file_path: Path) -> Dict:
+        """Parst eine .txt Datei (simple Mail als Text)"""
+        encoding = MailParser.detect_encoding(file_path)
+
+        with open(file_path, 'r', encoding=encoding, errors='ignore') as f:
+            content = f.read()
+
+        # Einfache Struktur: Versuche Subject/From/To zu erkennen
+        lines = content.split('\n')
+        subject = 'No Subject'
+        sender = 'Unknown'
+        recipient = 'Unknown'
+        date = ''
+        body_start = 0
+
+        for i, line in enumerate(lines[:10]):  # Erste 10 Zeilen prüfen
+            if line.lower().startswith('subject:'):
+                subject = line[8:].strip()
+                body_start = max(body_start, i + 1)
+            elif line.lower().startswith('from:'):
+                sender = line[5:].strip()
+                body_start = max(body_start, i + 1)
+            elif line.lower().startswith('to:'):
+                recipient = line[3:].strip()
+                body_start = max(body_start, i + 1)
+            elif line.lower().startswith('date:'):
+                date = line[5:].strip()
+                body_start = max(body_start, i + 1)
+
+        # Body ist der Rest
+        body = '\n'.join(lines[body_start:])
+        body = MailParser.clean_text(body)
+
+        return {
+            'subject': subject,
+            'sender': sender,
+            'recipient': recipient,
+            'date': date,
+            'body': body,
+            'original_format': 'txt'
+        }
+
+    @staticmethod
+    def parse_file(file_path: Path) -> List[Dict]:
+        """Parst eine Mail-Datei basierend auf Endung"""
+        suffix = file_path.suffix.lower()
+
+        if suffix == '.eml':
+            return [MailParser.parse_eml(file_path)]
+        elif suffix == '.mbox':
+            return MailParser.parse_mbox(file_path)
+        elif suffix == '.txt':
+            return [MailParser.parse_txt(file_path)]
+        else:
+            raise ValueError(f"Unsupported file format: {suffix}")
@@ -0,0 +1,396 @@
+"""
+FastAPI Backend für Mail Fine-Tuning App
+Hauptanwendung mit allen API Endpoints
+"""
+
+from fastapi import FastAPI, File, UploadFile, HTTPException, BackgroundTasks
+from fastapi.responses import StreamingResponse, FileResponse
+from fastapi.staticfiles import StaticFiles
+from fastapi.middleware.cors import CORSMiddleware
+from pydantic import BaseModel
+from typing import Optional, List
+import asyncio
+import json
+from pathlib import Path
+import shutil
+
+from data_manager import DataManager
+from mail_parser import MailParser
+from training import MLXTrainer, TrainingConfig
+from inference import ModelInference
+
+# FastAPI App
+app = FastAPI(title="Mail Fine-Tuning App")
+
+# CORS
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+# Initialisiere Manager
+data_manager = DataManager("data/mails.db")
+trainer = MLXTrainer("models", "output")
+inference = ModelInference("models", "output")
+
+
+# Pydantic Models
+class MailUpdate(BaseModel):
+    task_type: Optional[str] = None
+    expected_output: Optional[str] = None
+    status: Optional[str] = None
+    body: Optional[str] = None
+
+
+class TrainingStartRequest(BaseModel):
+    model_name: str
+    learning_rate: float = 1e-5
+    epochs: int = 3
+    batch_size: int = 4
+    lora_rank: int = 8
+
+
+class InferenceRequest(BaseModel):
+    prompt: str
+    model_type: str = 'base'
+    max_tokens: int = 512
+    temperature: float = 0.7
+
+
+class InferenceComparisonRequest(BaseModel):
+    task_type: str
+    mail_body: str
+    max_tokens: int = 512
+    temperature: float = 0.7
+
+
+# ===== Mail Endpoints =====
+
+@app.post("/api/mails/upload")
+async def upload_mails(files: List[UploadFile] = File(...)):
+    """Upload und Parse von Mail-Dateien"""
+    results = {
+        'success': [],
+        'errors': []
+    }
+
+    for file in files:
+        try:
+            # Temporär speichern
+            temp_path = Path("data/temp") / file.filename
+            temp_path.parent.mkdir(parents=True, exist_ok=True)
+
+            with open(temp_path, 'wb') as f:
+                content = await file.read()
+                f.write(content)
+
+            # Parse Mails
+            parsed_mails = MailParser.parse_file(temp_path)
+
+            # In DB speichern
+            for mail in parsed_mails:
+                mail_id = data_manager.add_mail(
+                    subject=mail['subject'],
+                    sender=mail['sender'],
+                    recipient=mail['recipient'],
+                    date=mail['date'],
+                    body=mail['body'],
+                    original_format=mail['original_format']
+                )
+
+            results['success'].append({
+                'filename': file.filename,
+                'count': len(parsed_mails)
+            })
+
+            # Cleanup
+            temp_path.unlink()
+
+        except Exception as e:
+            results['errors'].append({
+                'filename': file.filename,
+                'error': str(e)
+            })
+
+    return results
+
+
+@app.get("/api/mails")
+async def get_mails(status: Optional[str] = None):
+    """Liste aller Mails"""
+    mails = data_manager.get_all_mails(status_filter=status)
+    return {'mails': mails}
+
+
+@app.get("/api/mails/{mail_id}")
+async def get_mail(mail_id: int):
+    """Einzelne Mail abrufen"""
+    mail = data_manager.get_mail(mail_id)
+    if not mail:
+        raise HTTPException(status_code=404, detail="Mail not found")
+    return mail
+
+
+@app.put("/api/mails/{mail_id}")
+async def update_mail(mail_id: int, update: MailUpdate):
+    """Mail aktualisieren (Labeling)"""
+    success = data_manager.update_mail(
+        mail_id=mail_id,
+        task_type=update.task_type,
+        expected_output=update.expected_output,
+        status=update.status,
+        body=update.body
+    )
+
+    if not success:
+        raise HTTPException(status_code=404, detail="Mail not found")
+
+    return {'success': True}
+
+
+@app.delete("/api/mails/{mail_id}")
+async def delete_mail(mail_id: int):
+    """Mail löschen"""
+    success = data_manager.delete_mail(mail_id)
+
+    if not success:
+        raise HTTPException(status_code=404, detail="Mail not found")
+
+    return {'success': True}
+
+
+# ===== Export Endpoints =====
+
+@app.get("/api/export/stats")
+async def get_stats():
+    """Statistiken abrufen"""
+    stats = data_manager.get_statistics()
+    return stats
+
+
+@app.post("/api/export/jsonl")
+async def export_jsonl(train_split: float = 0.9):
+    """Exportiert Training-Daten als JSONL"""
+    train_data, val_data = data_manager.export_training_data(train_split)
+
+    if not train_data:
+        raise HTTPException(status_code=400, detail="No labeled data available")
+
+    # Speichere Files
+    data_dir = Path("data")
+    train_file = data_dir / "train.jsonl"
+    val_file = data_dir / "val.jsonl"
+
+    train_file_path, val_file_path = trainer.prepare_training_data(
+        train_data, val_data, data_dir
+    )
+
+    return {
+        'success': True,
+        'train_samples': len(train_data),
+        'val_samples': len(val_data),
+        'train_file': str(train_file),
+        'val_file': str(val_file)
+    }
+
+
+@app.get("/api/export/download/{file_type}")
+async def download_file(file_type: str):
+    """Download JSONL Files"""
+    if file_type not in ['train', 'val']:
+        raise HTTPException(status_code=400, detail="Invalid file type")
+
+    file_path = Path("data") / f"{file_type}.jsonl"
+
+    if not file_path.exists():
+        raise HTTPException(status_code=404, detail="File not found")
+
+    return FileResponse(
+        path=file_path,
+        filename=f"{file_type}.jsonl",
+        media_type='application/json'
+    )
+
+
+# ===== Model Endpoints =====
+
+@app.get("/api/models")
+async def list_models():
+    """Liste verfügbarer Modelle"""
+    models = trainer.list_available_models()
+    return {'models': models}
+
+
+@app.post("/api/models/download")
+async def download_model(model_name: str):
+    """
+    Lädt ein Modell herunter
+    Placeholder - würde in echter Implementation huggingface nutzen
+    """
+    success = trainer.download_model(model_name)
+
+    if not success:
+        raise HTTPException(
+            status_code=501,
+            detail="Model download not implemented. Please download manually."
+        )
+
+    return {'success': True}
+
+
+# ===== Training Endpoints =====
+
+@app.post("/api/training/start")
+async def start_training(request: TrainingStartRequest, background_tasks: BackgroundTasks):
+    """Startet Training"""
+
+    # Hole Training-Daten
+    train_data, val_data = data_manager.export_training_data()
+
+    if not train_data:
+        raise HTTPException(status_code=400, detail="No labeled data available")
+
+    if len(train_data) < 10:
+        raise HTTPException(
+            status_code=400,
+            detail=f"Not enough training data. Need at least 10, got {len(train_data)}"
+        )
+
+    # Training Config
+    config = TrainingConfig(
+        model_name=request.model_name,
+        learning_rate=request.learning_rate,
+        epochs=request.epochs,
+        batch_size=request.batch_size,
+        lora_rank=request.lora_rank
+    )
+
+    # Starte Training
+    success = trainer.start_training(config, train_data, val_data)
+
+    if not success:
+        raise HTTPException(status_code=400, detail="Training already running")
+
+    return {'success': True, 'message': 'Training started'}
+
+
+@app.post("/api/training/stop")
+async def stop_training():
+    """Stoppt Training"""
+    success = trainer.stop_training()
+
+    if not success:
+        raise HTTPException(status_code=400, detail="No training running")
+
+    return {'success': True, 'message': 'Training stopped'}
+
+
+@app.get("/api/training/status")
+async def get_training_status():
+    """Gibt aktuellen Training-Status zurück"""
+    status = trainer.get_status()
+    return status
+
+
+@app.get("/api/training/stream")
+async def stream_training_status():
+    """
+    Server-Sent Events für Live-Updates
+    """
+    async def event_generator():
+        while True:
+            status = trainer.get_status()
+
+            # Sende Status als SSE
+            yield f"data: {json.dumps(status)}\n\n"
+
+            # Stop wenn Training fertig
+            if not status['is_training'] and status['current_step'] > 0:
+                break
+
+            await asyncio.sleep(1)
+
+    return StreamingResponse(
+        event_generator(),
+        media_type="text/event-stream"
+    )
+
+
+# ===== Inference Endpoints =====
+
+@app.post("/api/inference/load")
+async def load_model(model_type: str, model_name: str, adapter_path: Optional[str] = None):
+    """Lädt ein Modell für Inference"""
+
+    if model_type == 'base':
+        success = inference.load_base_model(model_name)
+    elif model_type == 'finetuned':
+        if not adapter_path:
+            raise HTTPException(status_code=400, detail="adapter_path required for finetuned model")
+        success = inference.load_finetuned_model(model_name, adapter_path)
+    else:
+        raise HTTPException(status_code=400, detail="Invalid model_type")
+
+    if not success:
+        raise HTTPException(status_code=400, detail="Failed to load model")
+
+    return {'success': True}
+
+
+@app.get("/api/inference/loaded")
+async def get_loaded_models():
+    """Gibt zurück welche Modelle geladen sind"""
+    loaded = inference.get_loaded_models()
+    return loaded
+
+
+@app.post("/api/inference/generate")
+async def generate_text(request: InferenceRequest):
+    """Generiert Text mit geladenem Modell"""
+    result = inference.generate(
+        prompt=request.prompt,
+        model_type=request.model_type,
+        max_tokens=request.max_tokens,
+        temperature=request.temperature
+    )
+
+    return {'result': result}
+
+
+@app.post("/api/inference/compare")
+async def compare_models(request: InferenceComparisonRequest):
+    """Vergleicht Base und Fine-tuned Model"""
+
+    prompt = inference.format_mail_prompt(
+        request.task_type,
+        request.mail_body
+    )
+
+    result = inference.generate_comparison(
+        prompt=prompt,
+        max_tokens=request.max_tokens,
+        temperature=request.temperature
+    )
+
+    return result
+
+
+@app.get("/api/inference/test-prompts")
+async def get_test_prompts():
+    """Gibt vordefinierte Test-Prompts zurück"""
+    prompts = inference.get_test_prompts()
+    return prompts
+
+
+# ===== Static Files =====
+
+# Serve Frontend
+app.mount("/", StaticFiles(directory="frontend", html=True), name="frontend")
+
+
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run(app, host="0.0.0.0", port=8000)
@@ -0,0 +1,321 @@
+"""
+MLX Training Wrapper für Fine-Tuning
+Nutzt mlx-lm für LoRA Fine-Tuning
+"""
+
+import json
+import time
+import psutil
+from pathlib import Path
+from typing import Dict, List, Callable, Optional
+from dataclasses import dataclass
+import threading
+import queue
+
+
+@dataclass
+class TrainingConfig:
+    """Training Konfiguration"""
+    model_name: str
+    learning_rate: float = 1e-5
+    epochs: int = 3
+    batch_size: int = 4
+    lora_rank: int = 8
+    lora_alpha: int = 16
+    max_seq_length: int = 2048
+    val_every: int = 50
+
+
+class TrainingStatus:
+    """Verwaltet den aktuellen Training-Status"""
+
+    def __init__(self):
+        self.is_training = False
+        self.should_stop = False
+        self.current_step = 0
+        self.total_steps = 0
+        self.current_epoch = 0
+        self.train_loss = 0.0
+        self.val_loss = 0.0
+        self.train_loss_history = []
+        self.val_loss_history = []
+        self.start_time = None
+        self.error = None
+
+    def reset(self):
+        """Setzt den Status zurück"""
+        self.is_training = False
+        self.should_stop = False
+        self.current_step = 0
+        self.total_steps = 0
+        self.current_epoch = 0
+        self.train_loss = 0.0
+        self.val_loss = 0.0
+        self.train_loss_history = []
+        self.val_loss_history = []
+        self.start_time = None
+        self.error = None
+
+    def to_dict(self) -> Dict:
+        """Konvertiert zu Dictionary für API"""
+        eta = None
+        if self.is_training and self.current_step > 0 and self.start_time:
+            elapsed = time.time() - self.start_time
+            steps_remaining = self.total_steps - self.current_step
+            eta = int((elapsed / self.current_step) * steps_remaining)
+
+        memory_usage = psutil.virtual_memory().percent
+
+        return {
+            'is_training': self.is_training,
+            'current_step': self.current_step,
+            'total_steps': self.total_steps,
+            'current_epoch': self.current_epoch,
+            'train_loss': round(self.train_loss, 4) if self.train_loss else None,
+            'val_loss': round(self.val_loss, 4) if self.val_loss else None,
+            'train_loss_history': [round(l, 4) for l in self.train_loss_history],
+            'val_loss_history': [round(l, 4) for l in self.val_loss_history],
+            'eta_seconds': eta,
+            'memory_usage_percent': memory_usage,
+            'error': self.error
+        }
+
+
+class MLXTrainer:
+    """Wrapper für MLX Training"""
+
+    def __init__(self, models_dir: str = "models", output_dir: str = "output"):
+        self.models_dir = Path(models_dir)
+        self.output_dir = Path(output_dir)
+        self.models_dir.mkdir(exist_ok=True)
+        self.output_dir.mkdir(exist_ok=True)
+
+        self.status = TrainingStatus()
+        self.training_thread = None
+
+    def prepare_training_data(self, train_data: List[Dict],
+                            val_data: List[Dict],
+                            data_dir: Path) -> tuple[Path, Path]:
+        """Konvertiert Daten ins MLX Format (JSONL)"""
+
+        def format_example(item: Dict) -> Dict:
+            """Formatiert ein Beispiel im Chat-Format"""
+            task_type = item['task_type']
+            body = item['body']
+            output = item['expected_output']
+
+            # Task-spezifische Prompts
+            task_prompts = {
+                'Zusammenfassen': 'Fasse folgende E-Mail zusammen:',
+                'Antwort schreiben': 'Schreibe eine Antwort auf folgende E-Mail:',
+                'Kategorisieren': 'Kategorisiere folgende E-Mail:',
+                'Action Items': 'Extrahiere die Action Items aus folgender E-Mail:',
+                'Custom': 'Bearbeite folgende E-Mail:'
+            }
+
+            instruction = task_prompts.get(task_type, task_prompts['Custom'])
+
+            return {
+                'messages': [
+                    {
+                        'role': 'user',
+                        'content': f"{instruction}\n\n{body}"
+                    },
+                    {
+                        'role': 'assistant',
+                        'content': output
+                    }
+                ]
+            }
+
+        train_file = data_dir / 'train.jsonl'
+        val_file = data_dir / 'val.jsonl'
+
+        # Schreibe Training Data
+        with open(train_file, 'w', encoding='utf-8') as f:
+            for item in train_data:
+                f.write(json.dumps(format_example(item), ensure_ascii=False) + '\n')
+
+        # Schreibe Validation Data
+        with open(val_file, 'w', encoding='utf-8') as f:
+            for item in val_data:
+                f.write(json.dumps(format_example(item), ensure_ascii=False) + '\n')
+
+        return train_file, val_file
+
+    def _run_training(self, config: TrainingConfig,
+                     train_file: Path, val_file: Path,
+                     output_path: Path):
+        """Führt das Training aus (läuft in eigenem Thread)"""
+        try:
+            # Import hier um MLX nur bei Bedarf zu laden
+            from mlx_lm import load, LoRALinear
+            from mlx_lm.tuner import train as mlx_train
+            import mlx.core as mx
+            import mlx.nn as nn
+            import mlx.optimizers as optim
+
+            self.status.is_training = True
+            self.status.start_time = time.time()
+            self.status.error = None
+
+            # Lade Modell
+            model_path = self.models_dir / config.model_name
+            if not model_path.exists():
+                raise FileNotFoundError(f"Model not found: {model_path}")
+
+            # Training durchführen mit mlx-lm
+            # Dies ist ein vereinfachtes Beispiel - mlx-lm hat eigene Trainer
+            # In der Praxis würde man mlx_lm.tuner verwenden
+
+            # Lade Training Config
+            train_config = {
+                'model': str(model_path),
+                'data': str(train_file),
+                'val_data': str(val_file),
+                'train': True,
+                'iters': config.epochs * 100,  # Approximation
+                'val_batches': 10,
+                'learning_rate': config.learning_rate,
+                'batch_size': config.batch_size,
+                'lora_layers': config.lora_rank,
+                'adapter_file': str(output_path / 'adapters.npz'),
+                'save_every': 50,
+                'val_every': config.val_every,
+            }
+
+            # Callback für Progress-Updates
+            def training_callback(step: int, loss: float, val_loss: Optional[float] = None):
+                if self.status.should_stop:
+                    return False  # Stop training
+
+                self.status.current_step = step
+                self.status.train_loss = loss
+                self.status.train_loss_history.append(loss)
+
+                if val_loss is not None:
+                    self.status.val_loss = val_loss
+                    self.status.val_loss_history.append(val_loss)
+
+                return True
+
+            # Hinweis: Dies ist ein Platzhalter für echtes MLX Training
+            # In der Praxis würde man mlx_lm.tuner.train() oder eine
+            # eigene Training Loop mit mlx nutzen
+
+            # Simuliere Training für Demo (MUSS durch echtes MLX Training ersetzt werden)
+            total_steps = config.epochs * (len(list(open(train_file))) // config.batch_size)
+            self.status.total_steps = total_steps
+
+            for epoch in range(config.epochs):
+                self.status.current_epoch = epoch + 1
+
+                for step in range(total_steps // config.epochs):
+                    if self.status.should_stop:
+                        break
+
+                    # Simuliere Training Step
+                    self.status.current_step = epoch * (total_steps // config.epochs) + step
+                    fake_loss = 2.0 - (self.status.current_step / total_steps) * 1.5
+                    self.status.train_loss = fake_loss
+                    self.status.train_loss_history.append(fake_loss)
+
+                    # Validation alle N Steps
+                    if step % config.val_every == 0:
+                        fake_val_loss = 2.2 - (self.status.current_step / total_steps) * 1.4
+                        self.status.val_loss = fake_val_loss
+                        self.status.val_loss_history.append(fake_val_loss)
+
+                    time.sleep(0.1)  # Simuliere Rechenzeit
+
+                if self.status.should_stop:
+                    break
+
+            # Speichere finale Adapter
+            # output_path / 'adapters.npz' würde die LoRA Weights enthalten
+
+            self.status.is_training = False
+
+        except Exception as e:
+            self.status.error = str(e)
+            self.status.is_training = False
+
+    def start_training(self, config: TrainingConfig,
+                      train_data: List[Dict],
+                      val_data: List[Dict]) -> bool:
+        """Startet das Training"""
+
+        if self.status.is_training:
+            return False
+
+        # Bereite Daten vor
+        data_dir = self.output_dir / f"training_{int(time.time())}"
+        data_dir.mkdir(exist_ok=True)
+
+        train_file, val_file = self.prepare_training_data(
+            train_data, val_data, data_dir
+        )
+
+        # Output-Pfad
+        output_path = self.output_dir / f"run_{int(time.time())}"
+        output_path.mkdir(exist_ok=True)
+
+        # Reset Status
+        self.status.reset()
+
+        # Starte Training in eigenem Thread
+        self.training_thread = threading.Thread(
+            target=self._run_training,
+            args=(config, train_file, val_file, output_path),
+            daemon=True
+        )
+        self.training_thread.start()
+
+        return True
+
+    def stop_training(self) -> bool:
+        """Stoppt das laufende Training"""
+        if not self.status.is_training:
+            return False
+
+        self.status.should_stop = True
+
+        # Warte max 5 Sekunden auf Thread
+        if self.training_thread:
+            self.training_thread.join(timeout=5)
+
+        return True
+
+    def get_status(self) -> Dict:
+        """Gibt aktuellen Status zurück"""
+        return self.status.to_dict()
+
+    def list_available_models(self) -> List[str]:
+        """Listet verfügbare Modelle auf"""
+        if not self.models_dir.exists():
+            return []
+
+        models = []
+        for path in self.models_dir.iterdir():
+            if path.is_dir():
+                models.append(path.name)
+
+        return models
+
+    def download_model(self, model_name: str) -> bool:
+        """
+        Lädt ein Modell herunter
+        In der Praxis würde man hier huggingface_hub nutzen
+        """
+        # Placeholder - würde huggingface_hub.snapshot_download nutzen
+        # und dann mit mlx_lm.convert konvertieren
+
+        # Beispiel:
+        # from huggingface_hub import snapshot_download
+        # from mlx_lm.convert import convert
+        #
+        # hf_path = snapshot_download(model_name)
+        # mlx_path = self.models_dir / model_name
+        # convert(hf_path, mlx_path)
+
+        return False  # Nicht implementiert in diesem Beispiel
@@ -0,0 +1,87 @@
+# Beispiel-Mails für Training
+
+Diese Beispiel-Mails können zum Testen des Mail-Imports verwendet werden.
+
+## Enthaltene Beispiele
+
+1. **test1.txt** - Projekt-Update
+   - Typ: Status-Update
+   - Empfohlen für: "Zusammenfassen"
+
+2. **test2.txt** - Kundenanfrage
+   - Typ: Support-Anfrage
+   - Empfohlen für: "Antwort schreiben"
+
+3. **test3.txt** - Meeting Notes
+   - Typ: Meeting-Protokoll
+   - Empfohlen für: "Action Items"
+
+4. **test4.txt** - Out of Office
+   - Typ: Automatische Antwort
+   - Empfohlen für: "Kategorisieren" (als "Automatisch" oder "Skip")
+
+## Verwendung
+
+1. Wähle eine oder mehrere Dateien aus
+2. Ziehe sie per Drag & Drop in die App
+3. Die Mails werden automatisch geparst und bereinigt
+4. Gehe zum Labeling und füge die erwarteten Outputs hinzu
+
+## Beispiel-Labels
+
+### test1.txt (Zusammenfassen)
+```
+Alex berichtet über erfolgreichen Abschluss der API-Integration mit 40% Performance-Verbesserung.
+Nächste Woche starten Frontend-Anpassungen durch Maria und Tom.
+Go-Live bleibt für Ende März geplant.
+```
+
+### test2.txt (Antwort schreiben)
+```
+Sehr geehrter Herr Schmidt,
+
+vielen Dank für Ihre Anfrage zu Rechnung #2847.
+
+Sie haben recht - hier ist uns ein Fehler unterlaufen. Der korrekte Betrag
+laut Angebot beträgt 1.250€. Wir werden die Rechnung korrigieren und Ihnen
+die berichtigte Version bis morgen zusenden.
+
+Wir entschuldigen uns für die Unannehmlichkeiten.
+
+Mit freundlichen Grüßen
+Support-Team
+```
+
+### test3.txt (Action Items)
+```
+- Sarah: Pressemitteilung vorbereiten (Deadline: Freitag)
+- Marketing-Team: Social Media Content erstellen (nächste Woche)
+- Lisa: Influencer-Kontakte aufnehmen
+- Design-Team: Finale Produktfotos liefern
+- Location für Launch-Event buchen (1. April)
+- Website-Landing-Page live schalten (bis Mittwoch)
+- Feedback an Lisa bis Mittwoch
+```
+
+### test4.txt (Kategorisieren)
+```
+Kategorie: Automatische Antwort / Out of Office
+Status: Abwesenheit vom 18.03.-25.03.2024
+Vertretung: sarah.koch@company.com (Vertrieb), support@company.com (Support)
+```
+
+## Eigene Mails hinzufügen
+
+Du kannst auch eigene .txt Dateien erstellen. Format:
+
+```
+Subject: Dein Betreff
+From: absender@example.com
+To: empfaenger@example.com
+Date: 2024-03-15
+
+Hier kommt der Mail-Text...
+```
+
+Die ersten Zeilen mit Subject:/From:/To:/Date: sind optional.
+Wenn sie fehlen, wird der gesamte Text als Mail-Body interpretiert.
@@ -0,0 +1,19 @@
+Subject: Q4 Projektupdate
+From: alex@example.com
+To: team@example.com
+Date: 2024-03-15
+
+Hallo Team,
+
+ich wollte euch ein kurzes Update zum aktuellen Projektstand geben.
+
+Wir haben letzte Woche die neue API-Integration abgeschlossen und erfolgreich getestet.
+Die Performance-Tests zeigen eine Verbesserung von 40% gegenüber der alten Implementierung.
+
+Nächste Woche starten wir mit der Frontend-Anpassung. Maria und Tom werden das Design
+überarbeiten, während ich mich um die Backend-Anbindung kümmere.
+
+Der Go-Live ist weiterhin für Ende des Monats geplant.
+
+Beste Grüße
+Alex
@@ -0,0 +1,16 @@
+Subject: Frage zu Invoice #2847
+From: michael.schmidt@example.com
+To: support@company.de
+Date: 2024-03-16
+
+Hallo,
+
+ich habe eine Frage zur Rechnung #2847 vom 15. März.
+Der Betrag scheint nicht mit unserem ursprünglichen Angebot übereinzustimmen.
+
+Laut Angebot sollten es 1.250€ sein, auf der Rechnung stehen aber 1.450€.
+
+Könnten Sie das bitte prüfen und mir Bescheid geben?
+
+Vielen Dank
+Michael Schmidt
@@ -0,0 +1,22 @@
+Subject: Meeting Notes - Produktlaunch Vorbereitung
+From: lisa.mueller@startup.io
+To: team@startup.io
+Date: 2024-03-17
+
+Hi alle,
+
+hier die wichtigsten Punkte vom heutigen Meeting zum Produktlaunch:
+
+1. Sarah bereitet die Pressemitteilung vor (Deadline: Freitag)
+2. Marketing-Team erstellt Social Media Content für nächste Woche
+3. Ich kümmere mich um die Influencer-Kontakte
+4. Wir brauchen noch finale Produktfotos vom Design-Team
+5. Launch-Event ist am 1. April - Location muss noch gebucht werden
+6. Website-Landing-Page muss bis Mittwoch live gehen
+
+Bitte gebt bis Mittwoch Bescheid ob ihr eure Aufgaben schaffen könnt.
+Bei Problemen sofort melden!
+
+Danke an alle für die tolle Zusammenarbeit!
+
+Lisa
@@ -0,0 +1,24 @@
+Subject: Automatische Antwort: Out of Office
+From: thomas.weber@company.com
+To: request@company.com
+Date: 2024-03-18
+
+Guten Tag,
+
+vielen Dank für Ihre E-Mail.
+
+Ich bin vom 18.03. bis 25.03.2024 nicht im Büro und habe keinen Zugriff auf meine E-Mails.
+
+In dringenden Fällen wenden Sie sich bitte an:
+- Vertrieb: sarah.koch@company.com
+- Support: support@company.com
+- Allgemeine Anfragen: info@company.com
+
+Ich werde Ihre E-Mail nach meiner Rückkehr bearbeiten.
+
+Mit freundlichen Grüßen
+Thomas Weber
+
+--
+Diese E-Mail wurde automatisch generiert.
+Bitte antworten Sie nicht direkt auf diese Nachricht.
@@ -0,0 +1,756 @@
+// Mail Fine-Tuning App - Frontend Logic
+
+const API_BASE = '';
+
+// State
+let currentMails = [];
+let currentLabelingIndex = 0;
+let stats = {};
+let trainingEventSource = null;
+
+// ======================
+// Utility Functions
+// ======================
+
+function showToast(message, type = 'info') {
+    const container = document.getElementById('toast-container');
+    const toast = document.createElement('div');
+    toast.className = `toast ${type}`;
+    toast.textContent = message;
+    container.appendChild(toast);
+
+    setTimeout(() => {
+        toast.remove();
+    }, 4000);
+}
+
+async function apiCall(endpoint, options = {}) {
+    try {
+        const response = await fetch(API_BASE + endpoint, {
+            ...options,
+            headers: {
+                'Content-Type': 'application/json',
+                ...options.headers
+            }
+        });
+
+        if (!response.ok) {
+            const error = await response.json();
+            throw new Error(error.detail || 'API Error');
+        }
+
+        return await response.json();
+    } catch (error) {
+        showToast(error.message, 'error');
+        throw error;
+    }
+}
+
+// ======================
+// Navigation
+// ======================
+
+function initNavigation() {
+    const navLinks = document.querySelectorAll('.nav-link');
+    const views = document.querySelectorAll('.view');
+
+    navLinks.forEach(link => {
+        link.addEventListener('click', (e) => {
+            e.preventDefault();
+
+            const targetView = link.dataset.view;
+
+            // Update active states
+            navLinks.forEach(l => l.classList.remove('active'));
+            link.classList.add('active');
+
+            views.forEach(v => v.classList.remove('active'));
+            document.getElementById(`${targetView}-view`).classList.add('active');
+
+            // Load data for view
+            if (targetView === 'labeling') {
+                loadLabelingView();
+            } else if (targetView === 'export') {
+                loadStats();
+            } else if (targetView === 'models') {
+                loadModels();
+            } else if (targetView === 'training') {
+                loadTrainingView();
+            }
+        });
+    });
+}
+
+// ======================
+// Mail Import
+// ======================
+
+function initImport() {
+    const dropzone = document.getElementById('dropzone');
+    const fileInput = document.getElementById('file-input');
+
+    dropzone.addEventListener('click', () => fileInput.click());
+
+    dropzone.addEventListener('dragover', (e) => {
+        e.preventDefault();
+        dropzone.classList.add('dragover');
+    });
+
+    dropzone.addEventListener('dragleave', () => {
+        dropzone.classList.remove('dragover');
+    });
+
+    dropzone.addEventListener('drop', (e) => {
+        e.preventDefault();
+        dropzone.classList.remove('dragover');
+        handleFiles(e.dataTransfer.files);
+    });
+
+    fileInput.addEventListener('change', (e) => {
+        handleFiles(e.target.files);
+    });
+
+    document.getElementById('refresh-mails').addEventListener('click', loadMails);
+
+    // Initial load
+    loadMails();
+}
+
+async function handleFiles(files) {
+    const formData = new FormData();
+
+    for (let file of files) {
+        formData.append('files', file);
+    }
+
+    try {
+        const response = await fetch(API_BASE + '/api/mails/upload', {
+            method: 'POST',
+            body: formData
+        });
+
+        const result = await response.json();
+
+        const successCount = result.success.reduce((sum, r) => sum + r.count, 0);
+        showToast(`${successCount} Mails erfolgreich importiert`, 'success');
+
+        if (result.errors.length > 0) {
+            showToast(`${result.errors.length} Fehler beim Import`, 'error');
+        }
+
+        loadMails();
+
+    } catch (error) {
+        showToast('Fehler beim Upload', 'error');
+    }
+}
+
+async function loadMails() {
+    try {
+        const data = await apiCall('/api/mails');
+        currentMails = data.mails;
+
+        document.getElementById('mail-count').textContent = currentMails.length;
+
+        renderMailList(currentMails);
+    } catch (error) {
+        console.error('Error loading mails:', error);
+    }
+}
+
+function renderMailList(mails) {
+    const container = document.getElementById('mail-list');
+
+    if (mails.length === 0) {
+        container.innerHTML = '<p style="text-align:center; padding: 2rem;">Keine Mails vorhanden</p>';
+        return;
+    }
+
+    container.innerHTML = mails.map(mail => `
+        <div class="mail-item ${mail.status}">
+            <div class="mail-header">
+                <div class="mail-subject">${escapeHtml(mail.subject)}</div>
+                <div class="mail-meta">${mail.status}</div>
+            </div>
+            <div class="mail-meta">Von: ${escapeHtml(mail.sender)}</div>
+            <div class="mail-body">${escapeHtml(mail.body)}</div>
+            <div class="mail-actions">
+                <button class="btn btn-secondary" onclick="viewMail(${mail.id})">👁️ Ansehen</button>
+                <button class="btn btn-danger" onclick="deleteMail(${mail.id})">🗑️ Löschen</button>
+            </div>
+        </div>
+    `).join('');
+}
+
+function escapeHtml(text) {
+    const div = document.createElement('div');
+    div.textContent = text;
+    return div.innerHTML;
+}
+
+async function deleteMail(id) {
+    if (!confirm('Mail wirklich löschen?')) return;
+
+    try {
+        await apiCall(`/api/mails/${id}`, { method: 'DELETE' });
+        showToast('Mail gelöscht', 'success');
+        loadMails();
+    } catch (error) {
+        console.error('Error deleting mail:', error);
+    }
+}
+
+function viewMail(id) {
+    const mail = currentMails.find(m => m.id === id);
+    if (!mail) return;
+
+    alert(`Betreff: ${mail.subject}\n\nVon: ${mail.sender}\n\n${mail.body}`);
+}
+
+// ======================
+// Labeling
+// ======================
+
+function initLabeling() {
+    const statusFilter = document.getElementById('status-filter');
+    statusFilter.addEventListener('change', loadLabelingView);
+
+    // Keyboard shortcuts
+    document.addEventListener('keydown', (e) => {
+        const activeView = document.querySelector('.view.active');
+        if (activeView.id !== 'labeling-view') return;
+
+        if (e.key.toLowerCase() === 'n') {
+            nextMail();
+        } else if (e.key.toLowerCase() === 's') {
+            saveLabelingMail();
+        } else if (e.key.toLowerCase() === 'k') {
+            skipMail();
+        }
+    });
+}
+
+async function loadLabelingView() {
+    const statusFilter = document.getElementById('status-filter').value;
+
+    try {
+        const data = await apiCall(`/api/mails?status=${statusFilter || ''}`);
+        currentMails = data.mails;
+        currentLabelingIndex = 0;
+
+        updateLabelingProgress();
+        renderCurrentMail();
+    } catch (error) {
+        console.error('Error loading labeling view:', error);
+    }
+}
+
+function updateLabelingProgress() {
+    const labeled = currentMails.filter(m => m.status === 'labeled').length;
+    const total = currentMails.length;
+
+    const percent = total > 0 ? (labeled / total) * 100 : 0;
+
+    document.getElementById('labeling-progress').style.width = `${percent}%`;
+    document.getElementById('progress-text').textContent = `${labeled} / ${total} gelabelt`;
+}
+
+function renderCurrentMail() {
+    const container = document.getElementById('labeling-container');
+
+    if (currentMails.length === 0) {
+        container.innerHTML = '<p>Keine Mails zum Labeln vorhanden</p>';
+        return;
+    }
+
+    const mail = currentMails[currentLabelingIndex];
+
+    container.innerHTML = `
+        <div class="current-mail">
+            <h4>${escapeHtml(mail.subject)}</h4>
+            <p><strong>Von:</strong> ${escapeHtml(mail.sender)}</p>
+            <p><strong>An:</strong> ${escapeHtml(mail.recipient)}</p>
+            <hr style="margin: 1rem 0; border-color: var(--border-color)">
+            <div style="white-space: pre-wrap;">${escapeHtml(mail.body)}</div>
+        </div>
+
+        <form id="labeling-form">
+            <div class="form-group">
+                <label>Aufgabentyp:</label>
+                <select id="task-type" required>
+                    <option value="">-- Wählen --</option>
+                    <option value="Zusammenfassen" ${mail.task_type === 'Zusammenfassen' ? 'selected' : ''}>Zusammenfassen</option>
+                    <option value="Antwort schreiben" ${mail.task_type === 'Antwort schreiben' ? 'selected' : ''}>Antwort schreiben</option>
+                    <option value="Kategorisieren" ${mail.task_type === 'Kategorisieren' ? 'selected' : ''}>Kategorisieren</option>
+                    <option value="Action Items" ${mail.task_type === 'Action Items' ? 'selected' : ''}>Action Items</option>
+                    <option value="Custom" ${mail.task_type === 'Custom' ? 'selected' : ''}>Custom</option>
+                </select>
+            </div>
+
+            <div class="form-group">
+                <label>Erwarteter Output:</label>
+                <textarea id="expected-output" rows="6" required>${mail.expected_output || ''}</textarea>
+            </div>
+
+            <div class="form-actions">
+                <button type="button" class="btn btn-primary" onclick="saveLabelingMail()">💾 Speichern (S)</button>
+                <button type="button" class="btn btn-secondary" onclick="skipMail()">⏭️ Überspringen (K)</button>
+                <button type="button" class="btn btn-secondary" onclick="nextMail()">➡️ Nächste (N)</button>
+                <span style="margin-left: auto; color: var(--text-secondary);">
+                    ${currentLabelingIndex + 1} / ${currentMails.length}
+                </span>
+            </div>
+        </form>
+    `;
+}
+
+async function saveLabelingMail() {
+    const mail = currentMails[currentLabelingIndex];
+    const taskType = document.getElementById('task-type').value;
+    const expectedOutput = document.getElementById('expected-output').value;
+
+    if (!taskType || !expectedOutput) {
+        showToast('Bitte alle Felder ausfüllen', 'warning');
+        return;
+    }
+
+    try {
+        await apiCall(`/api/mails/${mail.id}`, {
+            method: 'PUT',
+            body: JSON.stringify({
+                task_type: taskType,
+                expected_output: expectedOutput,
+                status: 'labeled'
+            })
+        });
+
+        showToast('Gespeichert', 'success');
+        mail.status = 'labeled';
+        updateLabelingProgress();
+        nextMail();
+    } catch (error) {
+        console.error('Error saving mail:', error);
+    }
+}
+
+async function skipMail() {
+    const mail = currentMails[currentLabelingIndex];
+
+    try {
+        await apiCall(`/api/mails/${mail.id}`, {
+            method: 'PUT',
+            body: JSON.stringify({
+                status: 'skip'
+            })
+        });
+
+        mail.status = 'skip';
+        updateLabelingProgress();
+        nextMail();
+    } catch (error) {
+        console.error('Error skipping mail:', error);
+    }
+}
+
+function nextMail() {
+    if (currentLabelingIndex < currentMails.length - 1) {
+        currentLabelingIndex++;
+    } else {
+        currentLabelingIndex = 0;
+    }
+    renderCurrentMail();
+}
+
+// ======================
+// Export & Stats
+// ======================
+
+function initExport() {
+    document.getElementById('export-jsonl').addEventListener('click', exportJSONL);
+}
+
+async function loadStats() {
+    try {
+        stats = await apiCall('/api/export/stats');
+        renderStats();
+    } catch (error) {
+        console.error('Error loading stats:', error);
+    }
+}
+
+function renderStats() {
+    const container = document.getElementById('stats-grid');
+
+    container.innerHTML = `
+        <div class="stat-card">
+            <div class="stat-value">${stats.total || 0}</div>
+            <div class="stat-label">Gesamt Mails</div>
+        </div>
+        <div class="stat-card">
+            <div class="stat-value">${stats.labeled || 0}</div>
+            <div class="stat-label">Gelabelt</div>
+        </div>
+        <div class="stat-card">
+            <div class="stat-value">${stats.unlabeled || 0}</div>
+            <div class="stat-label">Unlabeled</div>
+        </div>
+        <div class="stat-card">
+            <div class="stat-value">${stats.avg_input_length || 0}</div>
+            <div class="stat-label">Avg Input Length</div>
+        </div>
+        <div class="stat-card">
+            <div class="stat-value">${stats.avg_output_length || 0}</div>
+            <div class="stat-label">Avg Output Length</div>
+        </div>
+        <div class="stat-card">
+            <div class="stat-value">${stats.sufficient_data ? '✅' : '❌'}</div>
+            <div class="stat-label">Genug Daten (&gt;50)</div>
+        </div>
+    `;
+}
+
+async function exportJSONL() {
+    const trainSplit = document.getElementById('train-split').value / 100;
+
+    try {
+        const result = await apiCall('/api/export/jsonl', {
+            method: 'POST',
+            body: JSON.stringify({ train_split: trainSplit })
+        });
+
+        const resultDiv = document.getElementById('export-result');
+        resultDiv.innerHTML = `
+            <p>✅ Export erfolgreich!</p>
+            <p>Training Samples: ${result.train_samples}</p>
+            <p>Validation Samples: ${result.val_samples}</p>
+            <p>
+                <a href="/api/export/download/train" class="btn btn-primary" download>📥 train.jsonl</a>
+                <a href="/api/export/download/val" class="btn btn-primary" download>📥 val.jsonl</a>
+            </p>
+        `;
+        resultDiv.classList.add('show');
+
+        showToast('JSONL Dateien generiert', 'success');
+    } catch (error) {
+        console.error('Error exporting JSONL:', error);
+    }
+}
+
+// ======================
+// Models
+// ======================
+
+async function loadModels() {
+    try {
+        const data = await apiCall('/api/models');
+        renderModels(data.models);
+    } catch (error) {
+        console.error('Error loading models:', error);
+    }
+}
+
+function renderModels(models) {
+    const container = document.getElementById('models-list');
+
+    if (models.length === 0) {
+        container.innerHTML = '<p>Keine Modelle vorhanden</p>';
+        return;
+    }
+
+    container.innerHTML = models.map(model => `
+        <div class="model-item">
+            <span>📦 ${model}</span>
+            <span style="color: var(--accent-success);">✓ Verfügbar</span>
+        </div>
+    `).join('');
+}
+
+// ======================
+// Training
+// ======================
+
+function initTraining() {
+    const lrSlider = document.getElementById('learning-rate');
+    const epochsSlider = document.getElementById('epochs');
+
+    lrSlider.addEventListener('input', (e) => {
+        const value = Math.pow(10, parseFloat(e.target.value));
+        document.getElementById('lr-value').textContent = value.toExponential(0);
+    });
+
+    epochsSlider.addEventListener('input', (e) => {
+        document.getElementById('epochs-value').textContent = e.target.value;
+    });
+
+    document.getElementById('training-form').addEventListener('submit', startTraining);
+    document.getElementById('stop-training').addEventListener('click', stopTraining);
+}
+
+async function loadTrainingView() {
+    // Load available models
+    try {
+        const data = await apiCall('/api/models');
+        const select = document.getElementById('training-model');
+
+        select.innerHTML = '<option value="">-- Modell wählen --</option>' +
+            data.models.map(m => `<option value="${m}">${m}</option>`).join('');
+    } catch (error) {
+        console.error('Error loading models:', error);
+    }
+
+    // Get current status
+    updateTrainingStatus();
+}
+
+async function startTraining(e) {
+    e.preventDefault();
+
+    const modelName = document.getElementById('training-model').value;
+    const learningRate = Math.pow(10, parseFloat(document.getElementById('learning-rate').value));
+    const epochs = parseInt(document.getElementById('epochs').value);
+    const batchSize = parseInt(document.getElementById('batch-size').value);
+    const loraRank = parseInt(document.getElementById('lora-rank').value);
+
+    if (!modelName) {
+        showToast('Bitte Modell wählen', 'warning');
+        return;
+    }
+
+    try {
+        await apiCall('/api/training/start', {
+            method: 'POST',
+            body: JSON.stringify({
+                model_name: modelName,
+                learning_rate: learningRate,
+                epochs: epochs,
+                batch_size: batchSize,
+                lora_rank: loraRank
+            })
+        });
+
+        showToast('Training gestartet', 'success');
+
+        document.getElementById('start-training').disabled = true;
+        document.getElementById('stop-training').disabled = false;
+
+        // Start SSE stream
+        startTrainingStream();
+
+    } catch (error) {
+        console.error('Error starting training:', error);
+    }
+}
+
+async function stopTraining() {
+    try {
+        await apiCall('/api/training/stop', { method: 'POST' });
+        showToast('Training gestoppt', 'warning');
+
+        document.getElementById('start-training').disabled = false;
+        document.getElementById('stop-training').disabled = true;
+
+        if (trainingEventSource) {
+            trainingEventSource.close();
+        }
+    } catch (error) {
+        console.error('Error stopping training:', error);
+    }
+}
+
+function startTrainingStream() {
+    if (trainingEventSource) {
+        trainingEventSource.close();
+    }
+
+    trainingEventSource = new EventSource('/api/training/stream');
+
+    trainingEventSource.onmessage = (event) => {
+        const status = JSON.parse(event.data);
+        updateTrainingStatusUI(status);
+
+        if (!status.is_training && status.current_step > 0) {
+            trainingEventSource.close();
+            document.getElementById('start-training').disabled = false;
+            document.getElementById('stop-training').disabled = true;
+            showToast('Training abgeschlossen', 'success');
+        }
+    };
+
+    trainingEventSource.onerror = () => {
+        trainingEventSource.close();
+    };
+}
+
+async function updateTrainingStatus() {
+    try {
+        const status = await apiCall('/api/training/status');
+        updateTrainingStatusUI(status);
+
+        if (status.is_training) {
+            document.getElementById('start-training').disabled = true;
+            document.getElementById('stop-training').disabled = false;
+            startTrainingStream();
+        }
+    } catch (error) {
+        console.error('Error updating status:', error);
+    }
+}
+
+function updateTrainingStatusUI(status) {
+    const container = document.getElementById('training-status');
+
+    if (!status.is_training && status.current_step === 0) {
+        container.innerHTML = '<p>Kein Training aktiv</p>';
+        return;
+    }
+
+    const eta = status.eta_seconds ? `${Math.floor(status.eta_seconds / 60)}m ${status.eta_seconds % 60}s` : 'N/A';
+
+    container.innerHTML = `
+        <div class="status-grid">
+            <div class="status-item">
+                <label>Status</label>
+                <div class="value">${status.is_training ? '🟢 Running' : '⏸️ Stopped'}</div>
+            </div>
+            <div class="status-item">
+                <label>Step</label>
+                <div class="value">${status.current_step} / ${status.total_steps}</div>
+            </div>
+            <div class="status-item">
+                <label>Epoch</label>
+                <div class="value">${status.current_epoch}</div>
+            </div>
+            <div class="status-item">
+                <label>Train Loss</label>
+                <div class="value">${status.train_loss || 'N/A'}</div>
+            </div>
+            <div class="status-item">
+                <label>Val Loss</label>
+                <div class="value">${status.val_loss || 'N/A'}</div>
+            </div>
+            <div class="status-item">
+                <label>ETA</label>
+                <div class="value">${eta}</div>
+            </div>
+            <div class="status-item">
+                <label>Memory</label>
+                <div class="value">${status.memory_usage_percent}%</div>
+            </div>
+        </div>
+    `;
+
+    // Update charts (simple implementation without chart library)
+    updateChart('train-loss-chart', status.train_loss_history);
+    updateChart('val-loss-chart', status.val_loss_history);
+}
+
+function updateChart(canvasId, data) {
+    // Simplified chart rendering (without external library)
+    const canvas = document.getElementById(canvasId);
+    if (!canvas) return;
+
+    const ctx = canvas.getContext('2d');
+    canvas.width = canvas.offsetWidth;
+    canvas.height = 200;
+
+    ctx.clearRect(0, 0, canvas.width, canvas.height);
+
+    if (!data || data.length === 0) return;
+
+    const padding = 20;
+    const width = canvas.width - 2 * padding;
+    const height = canvas.height - 2 * padding;
+
+    const maxVal = Math.max(...data);
+    const minVal = Math.min(...data);
+    const range = maxVal - minVal || 1;
+
+    ctx.strokeStyle = '#4a9eff';
+    ctx.lineWidth = 2;
+    ctx.beginPath();
+
+    data.forEach((val, i) => {
+        const x = padding + (i / (data.length - 1)) * width;
+        const y = padding + height - ((val - minVal) / range) * height;
+
+        if (i === 0) {
+            ctx.moveTo(x, y);
+        } else {
+            ctx.lineTo(x, y);
+        }
+    });
+
+    ctx.stroke();
+}
+
+// ======================
+// Evaluation
+// ======================
+
+function initEvaluation() {
+    document.getElementById('load-test-prompt').addEventListener('click', loadTestPrompt);
+    document.getElementById('run-comparison').addEventListener('click', runComparison);
+}
+
+async function loadTestPrompt() {
+    const taskType = document.getElementById('eval-task-type').value;
+
+    try {
+        const prompts = await apiCall('/api/inference/test-prompts');
+        const prompt = prompts[taskType];
+
+        if (prompt) {
+            // Extract mail body from prompt
+            const parts = prompt.split('\n\n');
+            document.getElementById('eval-mail-text').value = parts.slice(1).join('\n\n');
+            showToast('Test-Beispiel geladen', 'success');
+        }
+    } catch (error) {
+        console.error('Error loading test prompt:', error);
+    }
+}
+
+async function runComparison() {
+    const taskType = document.getElementById('eval-task-type').value;
+    const mailBody = document.getElementById('eval-mail-text').value;
+
+    if (!mailBody) {
+        showToast('Bitte Mail-Text eingeben', 'warning');
+        return;
+    }
+
+    document.getElementById('base-result').textContent = 'Generiere...';
+    document.getElementById('finetuned-result').textContent = 'Generiere...';
+
+    try {
+        const result = await apiCall('/api/inference/compare', {
+            method: 'POST',
+            body: JSON.stringify({
+                task_type: taskType,
+                mail_body: mailBody
+            })
+        });
+
+        document.getElementById('base-result').textContent = result.base || 'Modell nicht geladen';
+        document.getElementById('finetuned-result').textContent = result.finetuned || 'Modell nicht geladen';
+
+        showToast('Vergleich abgeschlossen', 'success');
+    } catch (error) {
+        console.error('Error running comparison:', error);
+        document.getElementById('base-result').textContent = 'Fehler';
+        document.getElementById('finetuned-result').textContent = 'Fehler';
+    }
+}
+
+// ======================
+// Init
+// ======================
+
+document.addEventListener('DOMContentLoaded', () => {
+    initNavigation();
+    initImport();
+    initLabeling();
+    initExport();
+    initTraining();
+    initEvaluation();
+});
@@ -0,0 +1,254 @@
+<!DOCTYPE html>
+<html lang="de">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Mail Fine-Tuning App</title>
+    <link rel="stylesheet" href="style.css">
+</head>
+<body>
+    <div class="app-container">
+        <!-- Sidebar Navigation -->
+        <nav class="sidebar">
+            <h1>Mail Fine-Tuning</h1>
+            <ul class="nav-menu">
+                <li><a href="#" data-view="import" class="nav-link active">📥 Mail Import</a></li>
+                <li><a href="#" data-view="labeling" class="nav-link">🏷️ Labeling</a></li>
+                <li><a href="#" data-view="export" class="nav-link">📊 Export & Stats</a></li>
+                <li><a href="#" data-view="models" class="nav-link">🤖 Modelle</a></li>
+                <li><a href="#" data-view="training" class="nav-link">🎯 Training</a></li>
+                <li><a href="#" data-view="evaluation" class="nav-link">🧪 Evaluation</a></li>
+            </ul>
+        </nav>
+
+        <!-- Main Content -->
+        <main class="main-content">
+
+            <!-- Import View -->
+            <div id="import-view" class="view active">
+                <h2>Mail Import</h2>
+
+                <div class="upload-section">
+                    <div class="dropzone" id="dropzone">
+                        <p>📂 Dateien hier ablegen oder klicken</p>
+                        <p class="hint">Unterstützt: .eml, .mbox, .txt</p>
+                        <input type="file" id="file-input" multiple accept=".eml,.mbox,.txt" hidden>
+                    </div>
+                </div>
+
+                <div class="mail-list-section">
+                    <div class="section-header">
+                        <h3>Importierte Mails (<span id="mail-count">0</span>)</h3>
+                        <button id="refresh-mails" class="btn btn-secondary">🔄 Aktualisieren</button>
+                    </div>
+                    <div id="mail-list" class="mail-list">
+                        <!-- Mails werden hier eingefügt -->
+                    </div>
+                </div>
+            </div>
+
+            <!-- Labeling View -->
+            <div id="labeling-view" class="view">
+                <div class="section-header">
+                    <h2>Mail Labeling</h2>
+                    <div class="filter-controls">
+                        <select id="status-filter">
+                            <option value="">Alle anzeigen</option>
+                            <option value="unlabeled" selected>Nur Unlabeled</option>
+                            <option value="labeled">Nur Labeled</option>
+                            <option value="skip">Übersprungen</option>
+                        </select>
+                    </div>
+                </div>
+
+                <div class="progress-bar">
+                    <div class="progress-fill" id="labeling-progress"></div>
+                    <span class="progress-text" id="progress-text">0 / 0 gelabelt</span>
+                </div>
+
+                <div class="keyboard-hints">
+                    Shortcuts: <kbd>N</kbd> Nächste | <kbd>S</kbd> Speichern | <kbd>K</kbd> Skip
+                </div>
+
+                <div id="labeling-container">
+                    <!-- Labeling Interface wird hier geladen -->
+                </div>
+            </div>
+
+            <!-- Export View -->
+            <div id="export-view" class="view">
+                <h2>Daten Export & Statistiken</h2>
+
+                <div class="stats-grid" id="stats-grid">
+                    <!-- Stats werden hier eingefügt -->
+                </div>
+
+                <div class="export-section">
+                    <h3>Training-Daten exportieren</h3>
+                    <div class="export-controls">
+                        <label>
+                            Train/Val Split:
+                            <input type="number" id="train-split" value="90" min="50" max="95" step="5">%
+                        </label>
+                        <button id="export-jsonl" class="btn btn-primary">📦 JSONL generieren</button>
+                    </div>
+                    <div id="export-result"></div>
+                </div>
+            </div>
+
+            <!-- Models View -->
+            <div id="models-view" class="view">
+                <h2>Modell-Verwaltung</h2>
+
+                <div class="model-section">
+                    <h3>Verfügbare Modelle</h3>
+                    <div id="models-list" class="models-list">
+                        <!-- Modelle werden hier geladen -->
+                    </div>
+
+                    <div class="model-download">
+                        <h3>Modell herunterladen</h3>
+                        <p class="info-text">
+                            Modelle müssen manuell heruntergeladen werden. Empfohlen:
+                        </p>
+                        <ul>
+                            <li>mlx-community/Mistral-7B-Instruct-v0.3-4bit</li>
+                            <li>mlx-community/Meta-Llama-3-8B-Instruct-4bit</li>
+                        </ul>
+                        <p class="code-example">
+                            huggingface-cli download [model-name] --local-dir models/[model-name]
+                        </p>
+                    </div>
+                </div>
+            </div>
+
+            <!-- Training View -->
+            <div id="training-view" class="view">
+                <h2>Training</h2>
+
+                <div class="training-config">
+                    <h3>Konfiguration</h3>
+                    <form id="training-form">
+                        <div class="form-group">
+                            <label>Modell:</label>
+                            <select id="training-model" required>
+                                <option value="">-- Modell wählen --</option>
+                            </select>
+                        </div>
+
+                        <div class="form-group">
+                            <label>
+                                Learning Rate: <span id="lr-value">1e-5</span>
+                            </label>
+                            <input type="range" id="learning-rate"
+                                   min="-6" max="-4" step="0.1" value="-5">
+                        </div>
+
+                        <div class="form-group">
+                            <label>
+                                Epochs: <span id="epochs-value">3</span>
+                            </label>
+                            <input type="range" id="epochs"
+                                   min="1" max="10" value="3">
+                        </div>
+
+                        <div class="form-group">
+                            <label>Batch Size:</label>
+                            <select id="batch-size">
+                                <option value="1">1</option>
+                                <option value="2">2</option>
+                                <option value="4" selected>4</option>
+                                <option value="8">8</option>
+                            </select>
+                        </div>
+
+                        <div class="form-group">
+                            <label>LoRA Rank:</label>
+                            <select id="lora-rank">
+                                <option value="4">4</option>
+                                <option value="8" selected>8</option>
+                                <option value="16">16</option>
+                                <option value="32">32</option>
+                            </select>
+                        </div>
+
+                        <div class="form-actions">
+                            <button type="submit" class="btn btn-primary" id="start-training">
+                                ▶️ Training starten
+                            </button>
+                            <button type="button" class="btn btn-danger" id="stop-training" disabled>
+                                ⏹️ Training stoppen
+                            </button>
+                        </div>
+                    </form>
+                </div>
+
+                <div class="training-status" id="training-status">
+                    <!-- Training Status wird hier angezeigt -->
+                </div>
+
+                <div class="training-charts">
+                    <div class="chart-container">
+                        <h4>Training Loss</h4>
+                        <canvas id="train-loss-chart"></canvas>
+                    </div>
+                    <div class="chart-container">
+                        <h4>Validation Loss</h4>
+                        <canvas id="val-loss-chart"></canvas>
+                    </div>
+                </div>
+            </div>
+
+            <!-- Evaluation View -->
+            <div id="evaluation-view" class="view">
+                <h2>Modell Evaluation</h2>
+
+                <div class="eval-controls">
+                    <h3>Chat Interface</h3>
+                    <div class="form-group">
+                        <label>Task Type:</label>
+                        <select id="eval-task-type">
+                            <option value="Zusammenfassen">Zusammenfassen</option>
+                            <option value="Antwort schreiben">Antwort schreiben</option>
+                            <option value="Kategorisieren">Kategorisieren</option>
+                            <option value="Action Items">Action Items</option>
+                            <option value="Custom">Custom</option>
+                        </select>
+                    </div>
+
+                    <div class="form-group">
+                        <label>Mail-Text:</label>
+                        <textarea id="eval-mail-text" rows="6" placeholder="Mail-Text hier eingeben..."></textarea>
+                    </div>
+
+                    <div class="form-group">
+                        <button id="load-test-prompt" class="btn btn-secondary">📝 Test-Beispiel laden</button>
+                        <button id="run-comparison" class="btn btn-primary">🔍 Vergleich starten</button>
+                    </div>
+                </div>
+
+                <div class="comparison-results">
+                    <div class="result-box">
+                        <h4>Base Model</h4>
+                        <div id="base-result" class="result-content">
+                            Noch kein Ergebnis
+                        </div>
+                    </div>
+                    <div class="result-box">
+                        <h4>Fine-tuned Model</h4>
+                        <div id="finetuned-result" class="result-content">
+                            Noch kein Ergebnis
+                        </div>
+                    </div>
+                </div>
+            </div>
+
+        </main>
+    </div>
+
+    <!-- Toast Notifications -->
+    <div id="toast-container"></div>
+
+    <script src="app.js"></script>
+</body>
+</html>
@@ -0,0 +1,600 @@
+/* Mail Fine-Tuning App Styles */
+
+:root {
+    --bg-primary: #1a1a1a;
+    --bg-secondary: #2d2d2d;
+    --bg-tertiary: #3a3a3a;
+    --text-primary: #e0e0e0;
+    --text-secondary: #b0b0b0;
+    --accent-primary: #4a9eff;
+    --accent-success: #4caf50;
+    --accent-warning: #ff9800;
+    --accent-danger: #f44336;
+    --border-color: #444;
+}
+
+* {
+    margin: 0;
+    padding: 0;
+    box-sizing: border-box;
+}
+
+body {
+    font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
+    background: var(--bg-primary);
+    color: var(--text-primary);
+    line-height: 1.6;
+}
+
+.app-container {
+    display: flex;
+    height: 100vh;
+    overflow: hidden;
+}
+
+/* Sidebar */
+.sidebar {
+    width: 250px;
+    background: var(--bg-secondary);
+    padding: 2rem 1rem;
+    border-right: 1px solid var(--border-color);
+}
+
+.sidebar h1 {
+    font-size: 1.5rem;
+    margin-bottom: 2rem;
+    color: var(--accent-primary);
+}
+
+.nav-menu {
+    list-style: none;
+}
+
+.nav-link {
+    display: block;
+    padding: 0.75rem 1rem;
+    color: var(--text-secondary);
+    text-decoration: none;
+    border-radius: 4px;
+    margin-bottom: 0.5rem;
+    transition: all 0.2s;
+}
+
+.nav-link:hover {
+    background: var(--bg-tertiary);
+    color: var(--text-primary);
+}
+
+.nav-link.active {
+    background: var(--accent-primary);
+    color: white;
+}
+
+/* Main Content */
+.main-content {
+    flex: 1;
+    overflow-y: auto;
+    padding: 2rem;
+}
+
+.view {
+    display: none;
+}
+
+.view.active {
+    display: block;
+}
+
+h2 {
+    margin-bottom: 1.5rem;
+    color: var(--text-primary);
+}
+
+h3 {
+    margin-bottom: 1rem;
+    color: var(--text-primary);
+}
+
+/* Buttons */
+.btn {
+    padding: 0.6rem 1.2rem;
+    border: none;
+    border-radius: 4px;
+    cursor: pointer;
+    font-size: 0.9rem;
+    transition: all 0.2s;
+}
+
+.btn-primary {
+    background: var(--accent-primary);
+    color: white;
+}
+
+.btn-primary:hover {
+    background: #3a8eef;
+}
+
+.btn-secondary {
+    background: var(--bg-tertiary);
+    color: var(--text-primary);
+}
+
+.btn-secondary:hover {
+    background: #4a4a4a;
+}
+
+.btn-success {
+    background: var(--accent-success);
+    color: white;
+}
+
+.btn-danger {
+    background: var(--accent-danger);
+    color: white;
+}
+
+.btn:disabled {
+    opacity: 0.5;
+    cursor: not-allowed;
+}
+
+/* Upload Section */
+.dropzone {
+    border: 2px dashed var(--border-color);
+    border-radius: 8px;
+    padding: 3rem;
+    text-align: center;
+    cursor: pointer;
+    transition: all 0.2s;
+    margin-bottom: 2rem;
+}
+
+.dropzone:hover {
+    border-color: var(--accent-primary);
+    background: var(--bg-secondary);
+}
+
+.dropzone.dragover {
+    border-color: var(--accent-primary);
+    background: var(--bg-tertiary);
+}
+
+.hint {
+    font-size: 0.85rem;
+    color: var(--text-secondary);
+    margin-top: 0.5rem;
+}
+
+/* Section Header */
+.section-header {
+    display: flex;
+    justify-content: space-between;
+    align-items: center;
+    margin-bottom: 1rem;
+}
+
+/* Mail List */
+.mail-list {
+    background: var(--bg-secondary);
+    border-radius: 8px;
+    padding: 1rem;
+    max-height: 500px;
+    overflow-y: auto;
+}
+
+.mail-item {
+    background: var(--bg-tertiary);
+    padding: 1rem;
+    margin-bottom: 0.5rem;
+    border-radius: 4px;
+    border-left: 3px solid transparent;
+}
+
+.mail-item.labeled {
+    border-left-color: var(--accent-success);
+}
+
+.mail-item.unlabeled {
+    border-left-color: var(--accent-warning);
+}
+
+.mail-item.skip {
+    border-left-color: var(--text-secondary);
+}
+
+.mail-header {
+    display: flex;
+    justify-content: space-between;
+    margin-bottom: 0.5rem;
+}
+
+.mail-subject {
+    font-weight: bold;
+    color: var(--text-primary);
+}
+
+.mail-meta {
+    font-size: 0.85rem;
+    color: var(--text-secondary);
+}
+
+.mail-body {
+    font-size: 0.9rem;
+    color: var(--text-secondary);
+    overflow: hidden;
+    text-overflow: ellipsis;
+    display: -webkit-box;
+    -webkit-line-clamp: 2;
+    -webkit-box-orient: vertical;
+}
+
+.mail-actions {
+    margin-top: 0.5rem;
+    display: flex;
+    gap: 0.5rem;
+}
+
+.mail-actions button {
+    padding: 0.4rem 0.8rem;
+    font-size: 0.8rem;
+}
+
+/* Labeling Interface */
+#labeling-container {
+    background: var(--bg-secondary);
+    border-radius: 8px;
+    padding: 2rem;
+    margin-top: 1rem;
+}
+
+.current-mail {
+    background: var(--bg-tertiary);
+    padding: 1.5rem;
+    border-radius: 4px;
+    margin-bottom: 1.5rem;
+}
+
+.form-group {
+    margin-bottom: 1.5rem;
+}
+
+.form-group label {
+    display: block;
+    margin-bottom: 0.5rem;
+    color: var(--text-primary);
+    font-weight: 500;
+}
+
+.form-group input,
+.form-group select,
+.form-group textarea {
+    width: 100%;
+    padding: 0.6rem;
+    background: var(--bg-primary);
+    border: 1px solid var(--border-color);
+    border-radius: 4px;
+    color: var(--text-primary);
+    font-family: inherit;
+}
+
+.form-group textarea {
+    resize: vertical;
+    min-height: 100px;
+}
+
+.form-actions {
+    display: flex;
+    gap: 1rem;
+    margin-top: 1rem;
+}
+
+/* Progress Bar */
+.progress-bar {
+    background: var(--bg-secondary);
+    border-radius: 4px;
+    height: 30px;
+    position: relative;
+    margin-bottom: 1rem;
+    overflow: hidden;
+}
+
+.progress-fill {
+    background: var(--accent-primary);
+    height: 100%;
+    transition: width 0.3s;
+}
+
+.progress-text {
+    position: absolute;
+    top: 50%;
+    left: 50%;
+    transform: translate(-50%, -50%);
+    font-weight: bold;
+    color: var(--text-primary);
+}
+
+/* Keyboard Hints */
+.keyboard-hints {
+    font-size: 0.85rem;
+    color: var(--text-secondary);
+    margin-bottom: 1rem;
+}
+
+kbd {
+    background: var(--bg-tertiary);
+    padding: 0.2rem 0.5rem;
+    border-radius: 3px;
+    border: 1px solid var(--border-color);
+    font-family: monospace;
+}
+
+/* Stats Grid */
+.stats-grid {
+    display: grid;
+    grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
+    gap: 1rem;
+    margin-bottom: 2rem;
+}
+
+.stat-card {
+    background: var(--bg-secondary);
+    padding: 1.5rem;
+    border-radius: 8px;
+    text-align: center;
+}
+
+.stat-value {
+    font-size: 2rem;
+    font-weight: bold;
+    color: var(--accent-primary);
+}
+
+.stat-label {
+    color: var(--text-secondary);
+    font-size: 0.9rem;
+}
+
+/* Export Section */
+.export-section {
+    background: var(--bg-secondary);
+    padding: 1.5rem;
+    border-radius: 8px;
+}
+
+.export-controls {
+    display: flex;
+    gap: 1rem;
+    align-items: center;
+    margin-bottom: 1rem;
+}
+
+.export-controls input {
+    width: 80px;
+    padding: 0.4rem;
+    background: var(--bg-primary);
+    border: 1px solid var(--border-color);
+    color: var(--text-primary);
+    border-radius: 4px;
+}
+
+#export-result {
+    margin-top: 1rem;
+    padding: 1rem;
+    background: var(--bg-tertiary);
+    border-radius: 4px;
+    display: none;
+}
+
+#export-result.show {
+    display: block;
+}
+
+/* Models List */
+.models-list {
+    background: var(--bg-secondary);
+    padding: 1rem;
+    border-radius: 8px;
+    margin-bottom: 2rem;
+}
+
+.model-item {
+    background: var(--bg-tertiary);
+    padding: 1rem;
+    margin-bottom: 0.5rem;
+    border-radius: 4px;
+    display: flex;
+    justify-content: space-between;
+    align-items: center;
+}
+
+.model-download {
+    background: var(--bg-secondary);
+    padding: 1.5rem;
+    border-radius: 8px;
+}
+
+.info-text {
+    color: var(--text-secondary);
+    margin-bottom: 1rem;
+}
+
+.code-example {
+    background: var(--bg-primary);
+    padding: 1rem;
+    border-radius: 4px;
+    font-family: monospace;
+    color: var(--accent-primary);
+    margin-top: 1rem;
+}
+
+/* Training Status */
+.training-status {
+    background: var(--bg-secondary);
+    padding: 1.5rem;
+    border-radius: 8px;
+    margin: 1.5rem 0;
+}
+
+.status-grid {
+    display: grid;
+    grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
+    gap: 1rem;
+}
+
+.status-item {
+    background: var(--bg-tertiary);
+    padding: 1rem;
+    border-radius: 4px;
+}
+
+.status-item label {
+    display: block;
+    color: var(--text-secondary);
+    font-size: 0.85rem;
+    margin-bottom: 0.3rem;
+}
+
+.status-item .value {
+    font-size: 1.2rem;
+    font-weight: bold;
+    color: var(--accent-primary);
+}
+
+/* Training Charts */
+.training-charts {
+    display: grid;
+    grid-template-columns: 1fr 1fr;
+    gap: 1.5rem;
+    margin-top: 1.5rem;
+}
+
+.chart-container {
+    background: var(--bg-secondary);
+    padding: 1.5rem;
+    border-radius: 8px;
+}
+
+.chart-container h4 {
+    margin-bottom: 1rem;
+    color: var(--text-primary);
+}
+
+canvas {
+    width: 100% !important;
+    height: 200px !important;
+}
+
+/* Evaluation */
+.comparison-results {
+    display: grid;
+    grid-template-columns: 1fr 1fr;
+    gap: 1.5rem;
+    margin-top: 1.5rem;
+}
+
+.result-box {
+    background: var(--bg-secondary);
+    padding: 1.5rem;
+    border-radius: 8px;
+}
+
+.result-content {
+    background: var(--bg-primary);
+    padding: 1rem;
+    border-radius: 4px;
+    min-height: 150px;
+    white-space: pre-wrap;
+    font-family: monospace;
+    font-size: 0.9rem;
+}
+
+/* Filter Controls */
+.filter-controls {
+    display: flex;
+    gap: 1rem;
+}
+
+.filter-controls select {
+    padding: 0.5rem;
+    background: var(--bg-tertiary);
+    border: 1px solid var(--border-color);
+    color: var(--text-primary);
+    border-radius: 4px;
+}
+
+/* Toast Notifications */
+#toast-container {
+    position: fixed;
+    top: 1rem;
+    right: 1rem;
+    z-index: 1000;
+}
+
+.toast {
+    background: var(--bg-secondary);
+    border: 1px solid var(--border-color);
+    border-left: 4px solid var(--accent-primary);
+    padding: 1rem 1.5rem;
+    border-radius: 4px;
+    margin-bottom: 0.5rem;
+    min-width: 300px;
+    animation: slideIn 0.3s ease;
+}
+
+.toast.success {
+    border-left-color: var(--accent-success);
+}
+
+.toast.error {
+    border-left-color: var(--accent-danger);
+}
+
+.toast.warning {
+    border-left-color: var(--accent-warning);
+}
+
+@keyframes slideIn {
+    from {
+        transform: translateX(400px);
+        opacity: 0;
+    }
+    to {
+        transform: translateX(0);
+        opacity: 1;
+    }
+}
+
+/* Scrollbar */
+::-webkit-scrollbar {
+    width: 8px;
+    height: 8px;
+}
+
+::-webkit-scrollbar-track {
+    background: var(--bg-secondary);
+}
+
+::-webkit-scrollbar-thumb {
+    background: var(--bg-tertiary);
+    border-radius: 4px;
+}
+
+::-webkit-scrollbar-thumb:hover {
+    background: #4a4a4a;
+}
+
+/* Responsive */
+@media (max-width: 768px) {
+    .sidebar {
+        width: 200px;
+    }
+
+    .comparison-results,
+    .training-charts {
+        grid-template-columns: 1fr;
+    }
+
+    .stats-grid {
+        grid-template-columns: 1fr;
+    }
+}
@@ -0,0 +1,24 @@
+# Mail Fine-Tuning App Dependencies
+
+# Web Framework
+fastapi==0.109.0
+uvicorn[standard]==0.27.0
+python-multipart==0.0.6
+
+# ML Framework (Apple Silicon)
+mlx==0.6.0
+mlx-lm==0.8.0
+
+# Mail Parsing
+beautifulsoup4==4.12.3
+chardet==5.2.0
+
+# Database
+aiosqlite==0.19.0
+
+# Utilities
+aiofiles==23.2.1
+psutil==5.9.8
+
+# Optional but recommended
+huggingface-hub==0.20.3
@@ -0,0 +1,35 @@
+#!/bin/bash
+
+# Mail Fine-Tuning App Startup Script
+
+echo "🚀 Starting Mail Fine-Tuning App..."
+echo ""
+
+# Check if venv exists
+if [ ! -d "venv" ]; then
+    echo "❌ Virtual environment not found!"
+    echo "Please run: python3 -m venv venv && source venv/bin/activate && pip install -r requirements.txt"
+    exit 1
+fi
+
+# Activate venv
+source venv/bin/activate
+
+# Check if dependencies are installed
+if ! python -c "import fastapi" 2>/dev/null; then
+    echo "❌ Dependencies not installed!"
+    echo "Please run: pip install -r requirements.txt"
+    exit 1
+fi
+
+# Create necessary directories
+mkdir -p data models output
+
+# Start server
+echo "✅ Starting server on http://localhost:8000"
+echo ""
+echo "Press Ctrl+C to stop"
+echo ""
+
+cd backend
+python main.py