Acme Learning/

Dashboard

Practice Lab

Real scenarios graded by AI — write prompts, build pipelines, debug agents. The AI grader scores every attempt.

live grading
Active scenarios
12
+3
Across 6 AI domains
Avg AI grade
81%
+4
Last 30 days
Completed today
3
done
Daily streak: 14 days
Grader model
sixfactors-grader-v3
live
Auto-rubric · multi-turn aware
Agents
In progress
Customer-support agent triage
Resolution rate · empathy · escalation accuracy
Turns6 / 12
grader
You routed an angry billing case to tier-1 — escalation criteria not met.
Not yet graded
RAG
Graded
RAG retrieval bench — noisy corpus
Recall@5 · MRR · latency budget · cost / 1k queries
Turns8 / 8
grader
Recall@5 = 0.81 (target 0.80). Latency p95 over budget — try smaller chunks.
AI grade87
Safety
Needs rework
Prompt injection defense
Survives 12 known attacks · no system-prompt leak · no PII echo
Turns5 / 10
grader
3 of 12 attacks succeeded. System prompt leaked via translation roleplay.
AI grade42
Tool Use
Graded
Tool-using agent — flight search
Correct tool selection · arg validation · graceful failure
Turns9 / 9
grader
Clean tool ladder. Add timeout handling on the booking step.
AI grade94
Evals
In progress
Eval-set authoring drill
Coverage · golden quality · regression sensitivity
Turns3 / 6
grader
Goldens skew positive — add 4 adversarial examples.
Not yet graded
Voice
Available
Voice agent — appointment booking
Turn-taking · interruption recovery · slot accuracy
Turns0 / 14
grader
Real-time grading — speak naturally, the AI scores latency and clarity.
Not yet graded