Fine-Tuning APIs

Launch distributed training jobs through the Acadify API. Supported methods are LoRA, QLoRA, and full-parameter supervised fine-tuning (SFT).

LoRA & QLoRA

Parameter-efficient fine-tuning for large base models such as Llama 3 and Mistral.

Distributed FSDP

Full-parameter training distributed across 8x H100 clusters.

1. Launching a Training Job

Once your dataset has been verified by our SME network, start a training run by sending a POST request to the `/v1/fine_tune/jobs` endpoint.

curl -X POST https://api.acadifysolution.com/v1/fine_tune/jobs \
  -H "Authorization: Bearer aca_live_abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "base_model": "mistralai/Mistral-7B-v0.1",
    "dataset_id": "dataset_998x2",
    "method": "qlora",
    "hyperparameters": {
      "learning_rate": 2e-4,
      "batch_size": 16,
      "epochs": 3,
      "lora_r": 64,
      "lora_alpha": 16
    }
  }'
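
The same request can be assembled from application code. The following is a minimal Python sketch using only the standard library, not an official client. The helper name `build_job_request` and its client-side checks are illustrative assumptions; the URL, headers, and body fields are copied from the curl example above. The request is built but not sent, because this section does not describe the response format.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.acadifysolution.com/v1/fine_tune/jobs"


def build_job_request(api_key, base_model, dataset_id, method, hyperparameters):
    """Build (but do not send) the POST request for a fine-tuning job.

    Hypothetical helper: the checks below are client-side sanity checks,
    not a statement of what the server validates.
    """
    if hyperparameters["learning_rate"] <= 0:
        raise ValueError("learning_rate must be positive")
    for key in ("batch_size", "epochs"):
        value = hyperparameters[key]
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{key} must be a positive integer")

    body = {
        "base_model": base_model,
        "dataset_id": dataset_id,
        "method": method,
        "hyperparameters": hyperparameters,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same values as the curl example; the key is the placeholder shown there.
request = build_job_request(
    api_key="aca_live_abc123",
    base_model="mistralai/Mistral-7B-v0.1",
    dataset_id="dataset_998x2",
    method="qlora",
    hyperparameters={
        "learning_rate": 2e-4,
        "batch_size": 16,
        "epochs": 3,
        "lora_r": 64,
        "lora_alpha": 16,
    },
)
```

To submit the job, pass `request` to `urllib.request.urlopen`. Keeping construction separate from sending makes the payload easy to inspect or unit-test before any cluster time is spent.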

2. Webhooks & Checkpoints

Training jobs can run for days. Acadify sends a webhook when a job starts, at the end of every epoch (with the evaluation loss), and when the job succeeds, so you can build progress dashboards in your own application.

`ft.job.started`: Cluster allocated; base model weights are downloading.
`ft.epoch.completed`: Contains the latest `eval_loss`.
`ft.job.succeeded`: Training complete; adapter merged and exported to S3.
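
A receiver for these events could look like the sketch below. The payload layout is not specified in this section, so the code assumes a JSON body with the event name in a top-level `type` field and the loss in a top-level `eval_loss` field; both field names, and the helper `handle_event`, are hypothetical. Only the three event names come from the list above.

```python
import json


def handle_event(raw_body, loss_history):
    """Process one webhook body and return a short status string.

    Assumed payload shape (not confirmed by the docs):
    {"type": "<event name>", "eval_loss": <float, epoch events only>}
    """
    event = json.loads(raw_body)
    event_type = event.get("type")  # assumed field name

    if event_type == "ft.job.started":
        return "started"
    if event_type == "ft.epoch.completed":
        # Record the loss so a dashboard can plot it per epoch.
        loss_history.append(float(event["eval_loss"]))  # assumed field name
        return f"epoch {len(loss_history)}: eval_loss={loss_history[-1]:.4f}"
    if event_type == "ft.job.succeeded":
        return "succeeded"
    # Unknown events are ignored rather than rejected, so new event
    # types do not break the receiver.
    return "ignored"


# Illustrative input only; the loss values are made up for the demo.
sample_bodies = [
    '{"type": "ft.job.started"}',
    '{"type": "ft.epoch.completed", "eval_loss": 1.25}',
    '{"type": "ft.epoch.completed", "eval_loss": 0.5}',
    '{"type": "ft.job.succeeded"}',
]
history = []
statuses = [handle_event(body, history) for body in sample_bodies]
```

In a real deployment this function would sit behind an HTTP endpoint registered as the webhook URL, with `history` replaced by whatever store backs the dashboard.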