MedicalBenchmark
o3 Deep Research

Provider: OpenAI

Rank: #23 of 319 models in the general ranking

Cumulative performance across 3 MIR exams

Net score: 576.00 pts
Accuracy: 97.0%
Correct / Incorrect: 582 / 18
Total cost: $502.80

Overall Performance (vs. average)

| Metric | o3 Deep Research | Average |
|---|---|---|
| Accuracy | 97.0% | 80.6% |
| Net score | 576.00 pts | 453.30 pts |
| Correct | 582 | 483 |
| Incorrect | 18 | 90 |
| Total cost | $502.80 | $9.58 |
| Average response time | 172.3 s | 17.9 s |
| Output tokens | 5.5M | 1.3M |
| Reasoning tokens | 5.0M | 898K |
| Average confidence | 100.0% | 95.4% |
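To put the cost and latency figures in proportion, the short sketch below derives a per-question cost and the ratios to the reported averages. It assumes that the total cost covers exactly the 600 answered questions (582 correct plus 18 incorrect) and nothing else; the page does not state this, and the variable names are illustrative only.

```python
# Figures copied from the overall performance section of this page.
TOTAL_COST_USD = 502.80
AVG_COST_USD = 9.58
RESPONSE_TIME_S = 172.3
AVG_RESPONSE_TIME_S = 17.9
CORRECT, INCORRECT = 582, 18

# Assumption: every billed question was answered, so the question
# count is simply correct + incorrect.
questions = CORRECT + INCORRECT

cost_per_question = TOTAL_COST_USD / questions
cost_ratio = TOTAL_COST_USD / AVG_COST_USD
time_ratio = RESPONSE_TIME_S / AVG_RESPONSE_TIME_S

print(f"Questions answered:    {questions}")
print(f"Cost per question:     ${cost_per_question:.3f}")
print(f"Cost vs. average:      {cost_ratio:.1f}x")
print(f"Response time vs. avg: {time_ratio:.1f}x")
```

Under these assumptions, the accuracy gain over the average comes at a cost and latency that are each a large multiple of the average model's.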

Breakdown by Exam

| Exam | Rank | Correct | Incorrect | Accuracy | Net score |
|---|---|---|---|---|---|
| MIR 2024 | 14 | 195 | 5 | 97.5% | 193.33 |
| MIR 2025 | 37 | 189 | 11 | 94.5% | 185.33 |
| MIR 2026 | 16 | 198 | 2 | 99.0% | 197.33 |
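The page does not say how the net score is computed. The per-exam figures above are consistent with one point per correct answer minus one third of a point per incorrect answer, so the sketch below uses that rule as an assumption and recomputes the net scores and accuracies. The function names and the `penalty` parameter are hypothetical, and accuracy is taken as correct / (correct + incorrect), which assumes no question was left unanswered.

```python
from fractions import Fraction

# Per-exam (correct, incorrect) counts copied from the breakdown above.
EXAMS = {
    "MIR 2024": (195, 5),
    "MIR 2025": (189, 11),
    "MIR 2026": (198, 2),
}


def net_score(correct, incorrect, penalty=Fraction(1, 3)):
    """Assumed rule: +1 per correct answer, -penalty per incorrect answer."""
    return correct - incorrect * penalty


def accuracy(correct, incorrect):
    """Percentage of answered questions that were correct."""
    return 100 * correct / (correct + incorrect)


per_exam = {name: net_score(c, i) for name, (c, i) in EXAMS.items()}
total_correct = sum(c for c, _ in EXAMS.values())
total_incorrect = sum(i for _, i in EXAMS.values())
total_net = net_score(total_correct, total_incorrect)

for name, (c, i) in EXAMS.items():
    print(f"{name}: net {float(per_exam[name]):.2f}, accuracy {accuracy(c, i):.1f}%")
print(
    f"Total: net {float(total_net):.2f}, "
    f"accuracy {accuracy(total_correct, total_incorrect):.1f}%"
)
```

Using `Fraction` keeps the one-third penalty exact, so the three per-exam scores sum to the cumulative net score without floating-point drift.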