Hacker News: convexly's comments

I ran this analysis using Convexly, a calibration-tracking tool I'm working on. The Brier scoring math in the post is the same math that runs in the product. Disclosure out of the way.

What I did: pulled trade data for the top 100 Polymarket profit wallets from the public API, resolved 3,651 unique positions, and computed per-wallet Brier scores. Then tested the correlation between calibration and profit using Spearman rank correlation (not Pearson, because the profit distribution is fat-tailed, Hill alpha ~1.6).
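For anyone who wants to sanity-check the math, here's a minimal pure-Python sketch of the two core computations (illustrative only, not the production pipeline; the API pulls and data wrangling aren't shown):

```python
from statistics import mean

def brier_score(probs, outcomes):
    # Mean squared error between stated probabilities (0..1) and
    # binary outcomes (0 or 1). Lower is better-calibrated:
    # 0.0 is perfect, 0.25 is what always answering 50% scores.
    return mean((p - o) ** 2 for p, o in zip(probs, outcomes))

def spearman(xs, ys):
    # Rank correlation: Pearson computed on ranks, which is why it's
    # robust to the fat-tailed profit distribution. Ties get the
    # average rank of their group.
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0.0] * len(vs)
        i = 0
        while i < len(order):
            j = i
            while j + 1 < len(order) and vs[order[j + 1]] == vs[order[i]]:
                j += 1
            avg = (i + j) / 2 + 1  # 1-based average rank for the tie group
            for k in range(i, j + 1):
                r[order[k]] = avg
            i = j + 1
        return r
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den
```

In the actual analysis I use scipy.stats.spearmanr, which does the same thing.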

The headline finding: Spearman r = +0.608 between Brier score (higher = worse calibrated) and realized profit, i.e. worse calibration predicts bigger profits. The correlation gets stronger when you drop the top 10 wallets by profit (+0.72), so it isn't outlier-driven. The worst-calibrated whales earn 4.66x the median profit of the better-calibrated ones.

The leaderboard is effectively a concentration ranking, not a skill ranking. Median single-event concentration: 69.8%. Twenty of the top 100 made their biggest money on the 2024 election.
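The concentration metric itself is simple. Roughly (a sketch assuming positive per-event profits; the real computation handles losses too, which isn't shown here):

```python
def single_event_concentration(event_profits):
    # Share of a wallet's total realized profit that came from its
    # single most profitable event. A value of 0.698 means ~70% of
    # the wallet's profit came from one market.
    # Illustrative sketch: assumes all per-event profits are positive.
    total = sum(event_profits)
    return max(event_profits) / total
```

So a wallet that made 70 on the election and 30 everywhere else scores 0.7.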

After separating the four wallets Chainalysis publicly attributed to "Theo" (the French trader who commissioned private YouGov polls), 8 wallets remain in a narrow cluster on the popular-vote markets in a 3-week window around election day. Chainalysis did not link BetTom42 or alexmulti to Theo.

Full 47-column CSV is downloadable from the post. Happy to answer methodology questions.


Quick update: 1,934 quiz completions, 44.5% scored as overconfident. The most interesting finding: the quiz itself got more engagement than the product behind it. Added educational tooltips, a public roadmap with voting, and UTM tracking based on feedback here and from users who reached out directly!


Both really good points. The research does suggest the core skill transfers, and the quiz can help with long-horizon predictions; the mechanism seems to be awareness of overconfidence itself rather than domain-specific knowledge. That said, the gap between the quiz and real-world application is real, and tracking both over time is part of why I built the decision-logging side. On your question about teams: that's a built-in feature already! Submissions are "sealed," so you submit before seeing others'. The team feature also has believability-weighted aggregation based on each submitter's track record, and there's an IC mode for investment committees. The problem you describe, one calibrated person in a room with two uncalibrated ones, is exactly what the sealed model prevents: everyone draws their own conclusion first, then they compare!
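To make "believability-weighted" concrete, here's one plausible scheme, weighting each sealed submission by the inverse of the submitter's historical Brier score (this is an illustrative sketch, not the product's exact formula):

```python
def believability_weighted_estimate(submissions):
    # submissions: list of (probability, track_record_brier) pairs.
    # Submitters with a better track record (lower Brier) count for
    # more. Inverse-Brier weighting is one plausible choice here,
    # shown for illustration only.
    eps = 1e-6  # avoid division by zero for a perfect track record
    weights = [1.0 / (brier + eps) for _, brier in submissions]
    total = sum(weights)
    return sum(w * p for w, (p, _) in zip(weights, submissions)) / total
```

With equal track records this reduces to a plain average; otherwise the better-calibrated submitter pulls the group estimate toward their number.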


Thank you! Happy to hear how it compares!


That's all true. I'm a solo founder and have been using Claude heavily to build this. It definitely shows in many places, and I'll make sure to clean those up. I did not expect this many visits from a Show HN (almost 1,600 quiz takers in the last few hours alone). The core math is sound, but I agree the presentation needs more care. Appreciate the honest feedback!


The change to buttons was based on feedback I got today. The slider disappearing is a bug. Pushing a fix now!


Made a few changes based on feedback from this thread: full results are now shown immediately with no email gate, the UX now uses true/false/uncertain buttons plus a confidence slider, the quiz result page is cleaned up, and the die-probability question is fixed. Thanks for all the honest feedback!


Update at 2 hours: 1,350+ quiz takers! 50% overconfident, 40% well-calibrated, and 10% underconfident. The average score is around 0.228, with the best score still at 0.007 (nearly perfect). The pattern so far: people are most overconfident in the 70-90% confidence range, where they're right only ~55% of the time.
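The overconfidence pattern falls out of a simple bucketing of answers by stated confidence. A rough sketch (the function and band width are illustrative, not the quiz's actual scoring code):

```python
def calibration_buckets(answers, bands=10):
    # answers: list of (stated_confidence, was_correct) pairs,
    # confidence in [0, 1]. Returns {band_index: hit_rate}, where
    # band 8 covers confidences in [0.8, 0.9). Overconfidence shows
    # up as a hit rate well below the band's stated confidence,
    # e.g. ~0.55 accuracy in the 0.7-0.9 bands.
    buckets = {}
    for conf, correct in answers:
        # small epsilon keeps exact band edges from landing one
        # band low due to float representation
        band = min(int(conf * bands + 1e-9), bands - 1)
        buckets.setdefault(band, []).append(1.0 if correct else 0.0)
    return {b: sum(v) / len(v) for b, v in sorted(buckets.items())}
```

A well-calibrated taker's hit rate tracks the band midpoint; the typical pattern here is bands 7-9 sitting near 0.55.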


Great recommendation. That was one of the biggest influences for starting to write my decisions down and then building this.


That's fair, I'll flag those or maybe even add regional context. Nice score, well above average!

