Launch distributed cluster training jobs via the Acadify API. Full support for LoRA, QLoRA, and Full Parameter SFT.
LoRA / QLoRA: Parameter-efficient fine-tuning for large base models (Llama 3, Mistral).
Full Parameter SFT: Full-parameter training distributed across 8x H100 clusters.
After your datasets have been verified by our SME network, you can kick off a training run using the `/v1/fine_tune/jobs` endpoint.
curl -X POST https://api.acadifysolution.com/v1/fine_tune/jobs \
  -H "Authorization: Bearer aca_live_abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "base_model": "mistralai/Mistral-7B-v0.1",
    "dataset_id": "dataset_998x2",
    "method": "qlora",
    "hyperparameters": {
      "learning_rate": 2e-4,
      "batch_size": 16,
      "epochs": 3,
      "lora_r": 64,
      "lora_alpha": 16
    }
  }'
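The same request can be built from application code. The following is a minimal sketch using only the Python standard library; the endpoint, headers, and payload fields mirror the curl example above, while the helper names `build_job_request` and `submit_job` are hypothetical, and the assumption that the API responds with a JSON body is not confirmed by this page.

```python
import json
import urllib.request

API_URL = "https://api.acadifysolution.com/v1/fine_tune/jobs"


def build_job_request(api_key: str, base_model: str, dataset_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a QLoRA fine-tuning job."""
    payload = {
        "base_model": base_model,
        "dataset_id": dataset_id,
        "method": "qlora",
        # Hyperparameters copied from the curl example.
        "hyperparameters": {
            "learning_rate": 2e-4,
            "batch_size": 16,
            "epochs": 3,
            "lora_r": 64,
            "lora_alpha": 16,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_job(request: urllib.request.Request) -> dict:
    """Send the request and parse the reply.

    Assumption: the API returns a JSON object; its fields are not documented here.
    """
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `submit_job(build_job_request(api_key, "mistralai/Mistral-7B-v0.1", "dataset_998x2"))` would send the job. Keeping construction separate from sending makes the payload easy to inspect or log before a multi-day run is started.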
Training jobs can take days. Acadify emits webhooks when a job starts, at the end of every epoch, and when the job succeeds. The per-epoch webhook includes the evaluation loss, so you can build real-time dashboards in your own application.
ft.job.started: Cluster allocated and weights downloading.
ft.epoch.completed: Contains the latest eval_loss.
ft.job.succeeded: Training complete, adapter merged and exported to S3.
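A dashboard backend could dispatch on these event names roughly as follows. This is a sketch only: the event names and the `eval_loss` metric come from the list above, but the payload shape (a top-level `type` field and a `data` object holding `eval_loss`) is an assumption, as is the helper name `handle_webhook`. Check the actual webhook payload before relying on it.

```python
import json
from typing import List, Optional


def handle_webhook(raw_body: bytes, loss_history: List[float]) -> Optional[str]:
    """Process one webhook delivery and return a short status label.

    Assumed (hypothetical) payload shape:
        {"type": "<event name>", "data": {"eval_loss": <float>}}
    """
    event = json.loads(raw_body)
    event_type = event.get("type")  # hypothetical field name

    if event_type == "ft.job.started":
        return "started"
    if event_type == "ft.epoch.completed":
        # Record the latest evaluation loss so a dashboard can plot it per epoch.
        loss_history.append(float(event["data"]["eval_loss"]))
        return "epoch"
    if event_type == "ft.job.succeeded":
        return "succeeded"
    # Unknown event types are ignored rather than treated as errors.
    return None
```

The function takes the raw request body, so it can be called from whichever web framework receives the webhook. Ignoring unknown event types keeps the handler working if further events are introduced.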