MedicalBenchmark
Microsoft: Phi 4 provider

Phi 4

#265 of 319 models in the general ranking

Cumulative performance across 3 MIR exams

Net score: 317.66 pts
Accuracy: 64.3%
Correct / Incorrect: 386 / 205
Total Cost: $0.07

Overall Performance (vs. average)

Metric                  Phi 4        Average
Accuracy                64.3%        80.6%
Net score               317.66 pts   453.30 pts
Correct                 386          483
Incorrect               205          90
Total Cost              $0.07        $9.58
Average response time   13.2s        17.9s
Output Tokens           392K         1.3M
Reasoning Tokens        0            898K
Average confidence      96.2%        95.4%

Breakdown by Exam

Exam       Rank   Correct   Incorrect   Accuracy   Net score
MIR 2024   #271   125       73          62.5%      100.66
MIR 2025   #265   124       73          62.0%      99.66
MIR 2026   #265   137       59          68.5%      117.33
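
The per-exam figures above appear consistent with a simple scoring rule, sketched below under two assumptions that the page does not state: each exam has 200 scored questions (inferred from 125 correct giving 62.5%), and each incorrect answer subtracts one third of a point while unanswered questions count as zero. The names `ExamResult`, `QUESTIONS_PER_EXAM`, and `PENALTY` are illustrative only.

```python
from dataclasses import dataclass

# Assumption: 200 scored questions per exam, inferred from the reported accuracies.
QUESTIONS_PER_EXAM = 200
# Assumption: each incorrect answer costs one third of a point.
PENALTY = 1 / 3


@dataclass
class ExamResult:
    name: str
    correct: int
    incorrect: int

    def accuracy(self) -> float:
        # Accuracy over all questions, so unanswered ones count against it.
        return self.correct / QUESTIONS_PER_EXAM

    def net_score(self) -> float:
        # Correct answers minus the penalty for incorrect ones.
        return self.correct - self.incorrect * PENALTY


# Figures copied from the breakdown by exam.
RESULTS = [
    ExamResult("MIR 2024", 125, 73),
    ExamResult("MIR 2025", 124, 73),
    ExamResult("MIR 2026", 137, 59),
]


def totals(results):
    """Return cumulative (correct, incorrect, accuracy, net score)."""
    correct = sum(r.correct for r in results)
    incorrect = sum(r.incorrect for r in results)
    accuracy = correct / (QUESTIONS_PER_EXAM * len(results))
    net = sum(r.net_score() for r in results)
    return correct, incorrect, accuracy, net


if __name__ == "__main__":
    for r in RESULTS:
        print(f"{r.name}: accuracy {r.accuracy():.1%}, net score {r.net_score():.2f}")
    correct, incorrect, accuracy, net = totals(RESULTS)
    print(f"Total: {correct} / {incorrect}, accuracy {accuracy:.1%}, net score {net:.2f}")
```

Under these assumptions the computed values agree with the reported ones to within 0.01 point; the small differences in the last digit suggest the page truncates net scores rather than rounding them.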