1456995462
Implemented a full-stack web application for fine-tuning LLMs on email data, optimized for Apple Silicon (M4 Pro with 24GB RAM). Features: - Mail import with drag & drop support (.mbox, .eml, .txt) - Automated mail cleaning and preprocessing - Interactive labeling interface with keyboard shortcuts - Training data export to JSONL format - MLX-based LoRA fine-tuning with live updates - Model evaluation and comparison interface - Server-Sent Events for real-time training progress - Dark theme UI optimized for extended use Technical Stack: - Backend: FastAPI with SQLite database - Frontend: Vanilla HTML/CSS/JavaScript (no external dependencies) - ML Framework: MLX for Apple Silicon optimization - Models: Support for Mistral 7B and Llama 3 8B via MLX Components: - data_manager.py: SQLite operations for mail storage and labeling - mail_parser.py: Parser for multiple mail formats with cleaning - training.py: MLX training wrapper with LoRA support - inference.py: Model loading and inference for evaluation - main.py: FastAPI backend with REST API and SSE - Frontend: Complete UI with all features Documentation: - Comprehensive README with installation and usage guide - Quick-start guide for rapid setup - Example mails for testing - Troubleshooting and best practices Ready for local deployment and fine-tuning workflows.
37 lines
274 B
Plaintext
37 lines
274 B
Plaintext
# Python
|
|
__pycache__/
|
|
*.py[cod]
|
|
*$py.class
|
|
*.so
|
|
.Python
|
|
venv/
|
|
env/
|
|
ENV/
|
|
|
|
# Data
|
|
data/*.db
|
|
data/*.jsonl
|
|
data/temp/
|
|
|
|
# Models
|
|
models/*
|
|
!models/.gitkeep
|
|
|
|
# Training outputs
|
|
output/*
|
|
!output/.gitkeep
|
|
|
|
# IDE
|
|
.vscode/
|
|
.idea/
|
|
*.swp
|
|
*.swo
|
|
*~
|
|
|
|
# OS
|
|
.DS_Store
|
|
Thumbs.db
|
|
|
|
# Logs
|
|
*.log
|