The Standard for
Open Cyber LLM Arena
Stop guessing if your LLM is secure or just stubborn.
OCLA is a crowdsourced, privacy-first platform where anyone can contribute to evaluating LLMs on uncensored offensive and defensive cybersecurity capabilities.
The Alignment Trap
Most LLM benchmarks (MMLU, GSM8K) measure general reasoning, not security utility. When you ask a model to help secure a network, does it act as a helpful Red Teamer or does it refuse with a generic safety lecture?
"I cannot assist with checking for vulnerabilities as it may be unethical..."
OCLA exists to quantify the fine line between Helpful Security Assistant and Over-Refusal.
How It Works
A frictionless, privacy-preserving workflow for security researchers.
1. Connect
Point OCLA to your local inference server (Ollama, LM Studio) or enter a provider API key. Keys are stored locally in your browser.
2. Benchmark
Run our curated suite of Red Team & Blue Team prompts. We test for SQLi, XSS, Privilege Escalation knowledge, and defensive coding.
3. Analyze & Share
Get instant scoring. View detailed breakdowns of refusals vs. compliance. Optionally, upload anonymous scores to the global leaderboard.
Privacy is Non-Negotiable
Client-Side Only
We do not proxy your requests. All benchmark traffic goes directly from your browser to your model provider (OpenAI, Anthropic, or Localhost).
Zero Data Retention
We never store your API keys, prompts, or model outputs on our servers. The only data we receive is the final numerical score if you choose to submit it.