The Acadify AI Lab ecosystem is designed as a highly distributed, zero-trust evaluation network tailored for frontier intelligence and foundation models.
At the core of Acadify is the Deterministic Execution Engine (DEE). Unlike traditional benchmarking platforms that rely on static regex matching or simple cross-entropy evaluations, the DEE spins up isolated, ephemeral Kubernetes pods for every single model generation. This allows us to safely execute potentially malicious or hallucinated code generated by an LLM during SWE-bench evaluations without compromising the host infrastructure.
sandbox.create() initialization phase.
Standard crowd-sourcing introduces noise, data poisoning, and hallucinated consensus. Acadify mitigates this by maintaining a proprietary network of top 4% STEM PhDs, FAANG Principal Engineers, and Offensive Security Researchers. When you submit RLHF or SFT datasets via the API, the payloads are routed to specific Domain Nodes.
The Acadify API strictly enforces Bearer Token Authentication across all endpoints. To access any API, you must provision an Enterprise Secret Key via your Acadify Dashboard.
We support multiple environments to ensure development keys do not pollute production evaluation metrics.
| Prefix | Environment | Billing Constraint |
|---|---|---|
| aca_test_ | Development / Sandbox | Free tier, mock SME evaluations. |
| aca_live_ | Production / Enterprise | Billed per evaluation token. Live SME routing. |
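A client can use the prefix to avoid sending a production key from a development script. The sketch below is illustrative only: the function name `key_environment` and the returned labels are hypothetical, and only the two prefixes come from the table above.

```python
def key_environment(secret_key: str) -> str:
    """Classify a secret key by its documented prefix."""
    if secret_key.startswith("aca_test_"):
        return "sandbox"
    if secret_key.startswith("aca_live_"):
        return "production"
    # Anything else does not match a prefix listed in the table.
    raise ValueError("Unrecognized key prefix")
```

A development script could call `key_environment(key)` at startup and refuse to continue if the result is `"production"`.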
Include your secret key in the Authorization header of every HTTP request.
```shell
curl https://api.acadifysolution.com/v1/eval/status \
  -H "Authorization: Bearer aca_live_abc123def456" \
  -H "Content-Type: application/json"
```
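The same request can be built in Python with the standard library. This is a minimal sketch: `build_request` and the `ACADIFY_API_KEY` environment variable are hypothetical names, and the fallback key is the placeholder from the curl example above. Reading the key from the environment keeps it out of source control.

```python
import os
import urllib.request

API_BASE = "https://api.acadifysolution.com/v1"


def build_request(path: str, secret_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the Bearer token."""
    return urllib.request.Request(
        API_BASE + path,
        headers={
            "Authorization": f"Bearer {secret_key}",
            "Content-Type": "application/json",
        },
        method="GET",
    )


# ACADIFY_API_KEY is an assumed variable name, not part of the documented API.
key = os.environ.get("ACADIFY_API_KEY", "aca_live_abc123def456")
request = build_request("/eval/status", key)
# Passing `request` to urllib.request.urlopen() would send it.
```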
If you suspect an aca_live_ key has been leaked (e.g., committed to a public GitHub repository), you must immediately trigger the Key Revocation endpoint. Our system scans public GitHub repositories and will automatically revoke your key if a leak is detected, emitting a key_compromised Webhook event.
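Because a leaked key is revoked automatically, it is cheaper to catch one before it is pushed. The following is a minimal sketch of a local check that could run in a pre-commit hook. It assumes a live key is the documented prefix followed by letters, digits, or underscores; the real key alphabet and length are not specified here, and `find_live_keys` is a hypothetical helper.

```python
import re

# Assumption: documented prefix followed by word characters.
LIVE_KEY_PATTERN = re.compile(r"aca_live_[A-Za-z0-9_]+")


def find_live_keys(text: str) -> list[str]:
    """Return every substring of `text` that looks like a production key."""
    return LIVE_KEY_PATTERN.findall(text)
```

Running `find_live_keys` over each staged file and aborting the commit on a non-empty result keeps production keys out of the repository history.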
Acadify enforces strict rate limits to ensure absolute stability of the Deterministic Execution Engine. Enterprise SLAs override default limits.
Every API response includes specific headers detailing your current quota consumption:
- X-RateLimit-Limit: Your absolute maximum RPM (Requests Per Minute).
- X-RateLimit-Remaining: The number of requests remaining in the current minute window.
- X-RateLimit-Reset: The Unix timestamp indicating when the quota resets.

If you breach your quota, the API will return a 429 Too Many Requests HTTP status code. You must implement exponential backoff logic in your client SDKs. Do not implement aggressive polling, as sustained 429 breaches may result in an automated 24-hour IP ban.
```python
import time

import requests


def make_request_with_backoff(url, headers, max_retries=5):
    """GET `url`, retrying with exponential backoff on HTTP 429."""
    retries = 0
    backoff_factor = 2
    while retries < max_retries:
        # A timeout keeps a stalled connection from blocking the retry loop.
        response = requests.get(url, headers=headers, timeout=10)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            # Wait 1, 2, 4, 8, ... seconds between attempts.
            sleep_time = backoff_factor ** retries
            print(f"Rate limited. Retrying in {sleep_time} seconds...")
            time.sleep(sleep_time)
            retries += 1
        else:
            raise RuntimeError(f"API Error: {response.status_code}")
    raise RuntimeError("Max retries exceeded")
```
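The quota headers can complement the fixed backoff schedule: when the window is exhausted, a client can wait until the reset time instead of guessing. The sketch below assumes `X-RateLimit-Reset` is expressed in seconds since the Unix epoch and that the headers are passed as a plain mapping with the exact casing shown above; `seconds_until_reset` is a hypothetical helper name.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how long to wait before the quota window resets.

    Returns 0.0 if requests remain in the current window or the reset
    time has already passed.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "0"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)
```

A caller could sleep for the returned value after a 429 response and fall back to exponential backoff when the headers are missing.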
Because SWE-bench evaluations and Human-in-the-Loop SME validations can take anywhere from 10 minutes to 72 hours, synchronous API responses are impractical. Acadify uses a Webhook streaming architecture to notify your systems when an evaluation is complete.
Register your receiving HTTPS endpoint in the Acadify Dashboard. Your endpoint must be capable of receiving POST requests and must return a 200 OK status code within 3 seconds. If your system takes longer than 3 seconds to process the payload, Acadify will assume a timeout and will retry the webhook up to 5 times using exponential backoff.
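One way to stay inside the 3-second window is to acknowledge the request immediately and do the slow work on a background thread. The sketch below uses only the standard library; `accept`, `worker`, `WebhookHandler`, and the in-memory queue are illustrative choices, not part of the Acadify API, and a production receiver would likely use a durable queue.

```python
import queue
import threading
from http.server import BaseHTTPRequestHandler

events = queue.Queue()
processed: list[bytes] = []


def accept(payload: bytes) -> int:
    """Queue the raw payload and return the HTTP status to send right away."""
    events.put(payload)
    return 200


def worker() -> None:
    """Drain the queue in the background; slow work belongs here."""
    while True:
        payload = events.get()
        processed.append(payload)  # placeholder for real processing
        events.task_done()


class WebhookHandler(BaseHTTPRequestHandler):
    """Responds as soon as the body has been read and queued."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status = accept(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


threading.Thread(target=worker, daemon=True).start()
```

Serving `WebhookHandler` with `http.server.HTTPServer` (behind TLS termination) would expose the endpoint; the handler never waits on the worker, so response time does not depend on processing time.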
To ensure webhook payloads actually originate from Acadify and have not been tampered with, every request includes an Acadify-Signature header. This is an HMAC-SHA256 signature generated using your Webhook Secret.
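```python
import hashlib
import hmac


def verify_signature(payload_body, secret_token, signature_header):
    """Return True if signature_header matches the HMAC-SHA256 of payload_body.

    payload_body must be the raw request body as bytes, exactly as received.
    """
    expected_signature = hmac.new(
        key=secret_token.encode(),
        msg=payload_body,
        digestmod=hashlib.sha256,
    ).hexdigest()
    # compare_digest runs in constant time, which avoids timing side channels.
    return hmac.compare_digest(expected_signature, signature_header)
```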
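To see why the raw bytes matter, the following sketch signs a sample body the way a sender would and then checks it. The secret, the payload, and the `sign` helper are hypothetical, and it assumes the header carries the bare hex digest, as the verification function above implies.

```python
import hashlib
import hmac


def sign(payload_body: bytes, secret_token: str) -> str:
    """Compute the hex HMAC-SHA256 of the payload, as a sender would."""
    return hmac.new(secret_token.encode(), payload_body, hashlib.sha256).hexdigest()


secret = "example_webhook_secret"  # hypothetical secret
body = b'{"event": "eval.swe_bench.completed"}'  # hypothetical payload
signature = sign(body, secret)

# The untouched body verifies; a body changed by a single byte does not.
assert hmac.compare_digest(sign(body, secret), signature)
assert not hmac.compare_digest(sign(body + b" ", secret), signature)
```

Re-serializing parsed JSON before verifying can change whitespace or key order, so the check should run on the bytes exactly as they arrived.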
| Event String | Description |
|---|---|
| eval.sandbox.initialized | Emitted when the Kubernetes pod successfully boots and clones the target GitHub repo. |
| eval.swe_bench.completed | Emitted when the model finishes its trajectory and a final pass/fail score is calculated. |
| data.sme.verified | Emitted when a Human Subject Matter Expert completes grading an RLHF JSONL row. |
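A receiver typically routes each event string to its own handler. The sketch below shows one way to do that with a small registry. It assumes the event name arrives in a top-level `event` field of the JSON payload, which is not confirmed here; the handler names and return values are placeholders.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]
HANDLERS: Dict[str, Handler] = {}


def on(event_name: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one event string."""
    def register(func: Handler) -> Handler:
        HANDLERS[event_name] = func
        return func
    return register


@on("eval.sandbox.initialized")
def sandbox_ready(payload: dict) -> str:
    return "sandbox ready"


@on("eval.swe_bench.completed")
def run_scored(payload: dict) -> str:
    return "run scored"


@on("data.sme.verified")
def row_graded(payload: dict) -> str:
    return "row graded"


def dispatch(payload: dict) -> str:
    # Assumption: the event name lives in a top-level "event" field.
    handler = HANDLERS.get(payload.get("event", ""))
    if handler is None:
        return "ignored"  # unknown events are skipped, not treated as errors
    return handler(payload)
```

Ignoring unknown event strings rather than raising keeps the receiver working if new event types are introduced later.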
Now that you understand the architecture, authentication, and webhooks, let's initiate your first API request to check the health status of the execution engine.
```shell
curl https://api.acadifysolution.com/v1/system/health \
  -H "Authorization: Bearer aca_live_abc123def456"
```
```json
{
  "status": "operational",
  "components": {
    "execution_engine": "online",
    "sme_routing_node": "online",
    "webhook_dispatcher": "online"
  },
  "latency_ms": 12
}
```
If you receive this response, your authentication is correct, your network can reach our API gateway, and you are ready to begin deploying models into the execution sandboxes. Navigate to the Evaluation Protocols page to begin your first SWE-bench run.
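A script can apply the same readiness check to the health response before starting a run. This is a minimal sketch based only on the sample response shown above; `all_components_online` is a hypothetical helper, and it assumes `"operational"` and `"online"` are the healthy values since no other values are documented here.

```python
import json

SAMPLE_RESPONSE = """
{
  "status": "operational",
  "components": {
    "execution_engine": "online",
    "sme_routing_node": "online",
    "webhook_dispatcher": "online"
  },
  "latency_ms": 12
}
"""


def all_components_online(body: str) -> bool:
    """Return True if the overall status and every component look healthy."""
    data = json.loads(body)
    return data.get("status") == "operational" and all(
        state == "online" for state in data.get("components", {}).values()
    )
```

Any component reporting a value other than `"online"` makes the function return `False`, which a deployment script could treat as a signal to wait and retry.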