SIBYL
Contest dates coming soon -- Sign up now to be first in line -- $5,000 Top Prize
$5,000 SIBYL Break-It Challenge

Break the AI. Win $5,000.

SIBYL is the multi-model AI council built to reduce hallucinations. Your job is to make it fail. Find a real, verifiable mistake, submit the evidence, and the top entry wins $5,000 cash.

Free to enter · No credit card · Human-reviewed · Win Prizes
What Counts as a Winning Failure

Five Failure Categories That Qualify

A valid submission is a specific, verifiable failure -- not an opinion. If your submission does not fall clearly into one of these categories, it will not qualify.

Confidently Wrong Answer

SIBYL states, with high confidence, a specific fact that is demonstrably false -- wrong date, wrong name, wrong number -- and credible sources prove the error.

Highest scoring category

Bad Arbitration

SIBYL's multi-model consensus returns a clearly incorrect verdict -- choosing the wrong answer when at least two models had the correct one available.

Must show model-level evidence

Missed Verification

A claim that should have been flagged as unverified or contested is instead marked "Verified" without adequate source backing.

Source quality failure

Weak or False Sourcing

SIBYL cites a source that does not actually support the claim, or the cited source states the opposite of what SIBYL presented.

Include original source link

Failure to Abstain

SIBYL returns a confident answer on a genuinely contested or unknowable topic where the correct response would be uncertainty or abstention.

Epistemic overconfidence

Quick Checklist

Real factual failure. Trust Score ≥ 80. Two or more authoritative sources. One atomic claim. Receipt ID included.

All five required to qualify
Does Not Count: Style complaints -- Subjective disagreements -- Tone or formatting issues -- Asking SIBYL to do something it was not designed to do -- Opinions presented as facts
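
For readers who want to pre-check an entry before submitting, the checklist above can be sketched as a small Python function. This is only an illustration under stated assumptions: the `Submission` class, its field names, and the `failed_checks` helper are hypothetical and are not part of SIBYL or its submission form. Only the two thresholds come from the checklist itself.

```python
from dataclasses import dataclass, field

MIN_TRUST_SCORE = 80  # from the checklist: Trust Score of 80 or higher
MIN_SOURCES = 2       # from the checklist: two or more authoritative sources


@dataclass
class Submission:
    # All field names are hypothetical, chosen for illustration only.
    is_factual_failure: bool   # a real factual error, not an opinion
    trust_score: int           # the Trust Score SIBYL reported
    sources: list[str] = field(default_factory=list)  # authoritative source links
    claim_count: int = 1       # number of claims the entry challenges
    receipt_id: str = ""       # the Receipt ID SIBYL returned


def failed_checks(sub: Submission) -> list[str]:
    """Return the checklist items the entry fails (an empty list means it passes)."""
    failures = []
    if not sub.is_factual_failure:
        failures.append("real factual failure")
    if sub.trust_score < MIN_TRUST_SCORE:
        failures.append("Trust Score >= 80")
    if len(sub.sources) < MIN_SOURCES:
        failures.append("two or more authoritative sources")
    if sub.claim_count != 1:
        failures.append("one atomic claim")
    if not sub.receipt_id.strip():
        failures.append("Receipt ID included")
    return failures


# Example entry with a made-up receipt value and only one source.
entry = Submission(True, 92, ["https://example.org/a"], 1, "example-receipt")
print(failed_checks(entry))  # ['two or more authoritative sources']
```

Returning the list of failed items, rather than a single yes/no, makes it obvious which part of an evidence pack still needs work.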
How It Works

Four Steps to Win Cash

01

Create a Free Account

Sign up at SibylSays.com. No credit card. No purchase. Contest access is included in the free tier.

free -- no credit card
02

Test SIBYL in Deep Mode

Run factual queries during the contest window. SIBYL returns a Trust Score and a Receipt ID. You need Trust Score ≥ 80 and Status: Verified.

Trust Score ≥ 80 required
03

Find a Verified Failure

Spot a false claim that SIBYL marked Verified. Quote it verbatim. Find two or more authoritative sources proving it wrong.

one claim -- two+ sources
04

Submit for Human Review

Upload your Receipt ID and evidence pack. Every submission is reviewed by a human judge within 72 hours. The highest-scoring verified submission wins $5,000 cash.

2 per day -- 30 total max
Scoring Rubric

How Submissions Are Scored

Severity 40 pts

How wrong is the challenged claim? A false core conclusion scores 40. A peripheral error scores 10. A claim that is not materially false scores 0.

Impact 35 pts

What happens if a real user acts on this? Legal, medical, financial, or security consequences score highest. Trivial errors score 0.

Submission Quality 25 pts

Is the failure clearly identified, strongly sourced, and reproducible? Strong authoritative sourcing scores 25. Incomplete evidence scores 0.
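
As a rough illustration of how the three criteria combine, the sketch below adds them into a total out of 100 (40 + 35 + 25). It assumes the final score is a plain sum of the three criteria, which the rubric implies but does not state outright. The function and constant names are hypothetical, and the only point values listed are the ones the rubric publishes. How judges assign in-between values is not specified here.

```python
# Maximum points per criterion, taken from the rubric above.
MAX_POINTS = {"severity": 40, "impact": 35, "quality": 25}

# The only severity values the rubric states explicitly.
SEVERITY_ANCHORS = {
    "false_core_conclusion": 40,
    "peripheral_error": 10,
    "not_materially_false": 0,
}


def total_score(severity: int, impact: int, quality: int) -> int:
    """Sum the three criterion scores after checking each against its cap.

    Assumption: the total is a simple sum of the three criteria.
    """
    awarded = {"severity": severity, "impact": impact, "quality": quality}
    for name, points in awarded.items():
        if not 0 <= points <= MAX_POINTS[name]:
            raise ValueError(
                f"{name} must be between 0 and {MAX_POINTS[name]}, got {points}"
            )
    return sum(awarded.values())


# A false core conclusion (40) with top impact (35) and strong sourcing (25).
print(total_score(SEVERITY_ANCHORS["false_core_conclusion"], 35, 25))  # 100
```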

Live Leaderboard

Current Standings

Updated as submissions are reviewed. Opt in to appear publicly, or compete anonymously.

FAQ

Common Questions

Is it really free to enter?

Yes. No purchase necessary. Create a free account and start testing. Contest access is included in the free tier.

What counts as a valid failure?

A specific, verifiable factual error that SIBYL marked as Verified with a Trust Score of 80 or higher. Opinion disagreements, style complaints, and subjective issues do not qualify.

How many submissions can I make?

Up to 2 submissions per day, with a maximum of 30 total submissions during the contest period.
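
The two limits can be pictured with a short Python sketch. It assumes a "day" means a calendar date, since the time zone and reset time are not specified here, and `can_submit` is a hypothetical helper, not something SIBYL provides.

```python
from collections import Counter
from datetime import date

DAILY_LIMIT = 2   # from the answer above: up to 2 submissions per day
TOTAL_LIMIT = 30  # from the answer above: 30 total during the contest period


def can_submit(previous: list[date], today: date) -> bool:
    """Return True if one more submission today stays within both limits.

    Assumption: `previous` holds the calendar date of every earlier submission.
    """
    if len(previous) >= TOTAL_LIMIT:
        return False
    return Counter(previous)[today] < DAILY_LIMIT
```

At two per day, reaching the 30-submission cap takes at least 15 days.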

What if someone else finds the same failure?

When several entrants report the same failure, only the first verified submission is judged. The earlier timestamp wins, which is why submitting quickly matters.
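
The earliest-timestamp rule can be sketched in Python as follows. The sketch assumes every entry passed in has already been verified and that duplicates can be recognized by a shared key. How judges actually decide that two entries describe the same failure is not specified here, and the function and field names are hypothetical.

```python
from datetime import datetime


def first_submissions(entries: list[tuple[str, str, datetime]]) -> dict[str, str]:
    """Map each failure key to the entrant with the earliest timestamp.

    Each entry is (failure_key, entrant, submitted_at); the names are
    hypothetical, and entries are assumed to be verified already.
    """
    winners: dict[str, tuple[str, datetime]] = {}
    for failure_key, entrant, submitted_at in entries:
        current = winners.get(failure_key)
        # Keep this entry only if the key is new or the timestamp is earlier.
        if current is None or submitted_at < current[1]:
            winners[failure_key] = (entrant, submitted_at)
    return {key: entrant for key, (entrant, _) in winners.items()}
```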

How long until I hear back?

Every submission is reviewed by a human judge within 72 hours. You will receive a status update by email.

Can I compete anonymously?

Yes. Leaderboard participation is opt-in. You can compete and appear anonymously, or set a public display name.

Who are the judges?

Submissions are reviewed by the SIBYL engineering and research team. All scoring uses the published rubric with no subjective criteria.

When are winners announced?

Winners are announced after the contest closes. Potential winners are notified by email within 14 days of final judging.

Create Your Free Contest Account

Create your account to test SIBYL now, catch real failures, and be ready to submit when the challenge opens.

NO PURCHASE NECESSARY. Open to legal U.S. residents 18+, except FL, NY, and RI. Void where prohibited. See Official Rules for complete details.