MedicalBenchmark
Microsoft: Phi 4 provider

Phi 4

#265 of 319 models

MIR 2025

Net score: 99.66 pts
Accuracy: 62.0%
Correct / Incorrect: 124 / 73
Total Cost: $0.02

Overall Performance (vs. average)

| Metric | Phi 4 | Average |
| --- | --- | --- |
| Accuracy | 62.0% | 77.9% |
| Net score | 99.66 pts | 143.96 pts |
| Correct | 124 | 156 |
| Incorrect | 73 | 35 |
| Total Cost | $0.02 | $3.36 |
| Average response time | 14.4s | 19.0s |
| Output Tokens | 133K | 430K |
| Reasoning Tokens | 0 | 306K |
| Average confidence | 96.7% | 95.2% |
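
The net score and accuracy reported for Phi 4 are consistent with a simple penalty rule: one point per correct answer minus a third of a point per incorrect answer, with accuracy taken over 200 scored questions. Neither the rule nor the 200-question total is stated on this page; both are assumptions inferred from the figures (124 − 73/3 ≈ 99.67, and 124/200 = 62.0%). The displayed 99.66 would then be a truncated rather than rounded value, which is also an assumption. A minimal Python sketch under those assumptions, with hypothetical function names:

```python
import math
from fractions import Fraction


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> Fraction:
    """Correct answers minus a fractional penalty per incorrect answer.

    The 1/3 penalty is an assumption inferred from the page's numbers.
    """
    return correct - penalty * incorrect


def accuracy_pct(correct: int, scored_questions: int) -> float:
    """Share of scored questions answered correctly, as a percentage."""
    return 100.0 * correct / scored_questions


def truncate_2dp(value: Fraction) -> float:
    """Cut a value to two decimals without rounding up."""
    return math.floor(value * 100) / 100


score = net_score(correct=124, incorrect=73)
print(truncate_2dp(score))      # 99.66
print(accuracy_pct(124, 200))   # 62.0 (200 scored questions is an assumption)
```

Using `Fraction` keeps the one-third penalty exact, so the only loss of precision happens at display time.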

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
| --- | --- | --- | --- | --- | --- |
| Allergology | 2 | 2 | 0 | 50.0% | 87.9% |
| Anesthesiology and Resuscitation | 3 | 3 | 0 | 50.0% | 82.3% |
| Cardiology | 14 | 8 | 0 | 63.6% | 78.6% |
| Dermatology | 7 | 5 | 0 | 58.3% | 69.4% |
| Endocrinology and Nutrition | 12 | 3 | 1 | 75.0% | 83.5% |
| ENT | 5 | 3 | 0 | 62.5% | 74.8% |
| Epidemiology | 2 | 5 | 0 | 28.6% | 69.1% |
| Gastroenterology | 11 | 10 | 0 | 52.4% | 74.1% |
| Genetics | 3 | 3 | 0 | 50.0% | 69.5% |
| Geriatrics | 6 | 5 | 0 | 54.5% | 77.5% |
| Gynecology and Obstetrics | 16 | 2 | 1 | 84.2% | 86.7% |
| Health Planning and Management | 2 | 0 | 0 | 100.0% | 82.6% |
| Hematology | 8 | 3 | 0 | 72.7% | 82.7% |
| Immunology | 6 | 2 | 1 | 66.7% | 83.3% |
| Infectious Diseases | 18 | 9 | 0 | 66.7% | 74.9% |
| Legal Medicine and Bioethics | 2 | 3 | 0 | 40.0% | 68.4% |
| Medical Oncology | 18 | 6 | 1 | 72.0% | 87.2% |
| Nephrology | 11 | 3 | 0 | 78.6% | 84.8% |
| Neurology | 13 | 7 | 0 | 65.0% | 77.3% |
| Ophthalmology | 2 | 3 | 0 | 40.0% | 74.2% |
| Palliative Care | 1 | 3 | 0 | 25.0% | 78.6% |
| Pediatrics | 15 | 11 | 0 | 57.7% | 71.9% |
| Pharmacology | 11 | 6 | 0 | 64.7% | 74.1% |
| Psychiatry | 5 | 3 | 0 | 62.5% | 83.0% |
| Pulmonology | 9 | 4 | 1 | 64.3% | 80.4% |
| Radiology-Emergency | 10 | 4 | 0 | 71.4% | 69.4% |
| Rheumatology | 9 | 6 | 0 | 60.0% | 76.6% |
| Statistics | 1 | 2 | 0 | 33.3% | 76.6% |
| Traumatology | 11 | 7 | 0 | 61.1% | 79.3% |
| Urology | 5 | 2 | 0 | 71.4% | 80.7% |
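
The per-subject accuracy values appear to count unanswered questions in the denominator: Endocrinology and Nutrition, for example, lists 12 correct, 3 incorrect and 1 unanswered with 75.0% accuracy, which matches 12/16 rather than 12/15. The page does not state this formula, so the sketch below is an inference from the listed figures, and `category_accuracy` is a hypothetical helper name:

```python
def category_accuracy(correct: int, incorrect: int, unanswered: int) -> float:
    """Percentage of all questions in a category answered correctly.

    Unanswered questions count toward the total (assumed, inferred
    from the figures on the page).
    """
    total = correct + incorrect + unanswered
    if total == 0:
        raise ValueError("category has no questions")
    return round(100.0 * correct / total, 1)


# Rows taken from the subject breakdown: (correct, incorrect, unanswered)
rows = {
    "Endocrinology and Nutrition": (12, 3, 1),
    "Immunology": (6, 2, 1),
    "Cardiology": (14, 8, 0),
}

for subject, counts in rows.items():
    print(f"{subject}: {category_accuracy(*counts)}%")
```

The same calculation reproduces the question-type figures, for instance Pathophysiology (12 correct, 14 incorrect, 1 unanswered) at 44.4%.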

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average |
| --- | --- | --- | --- | --- | --- |
| Anatomy | 5 | 2 | 0 | 71.4% | 78.6% |
| Biostatistics | 2 | 2 | 0 | 50.0% | 79.8% |
| Diagnosis | 60 | 28 | 0 | 68.2% | 79.9% |
| Epidemiology | 1 | 4 | 0 | 20.0% | 76.7% |
| Ethics | 1 | 2 | 0 | 33.3% | 74.1% |
| Interpretation | 24 | 18 | 0 | 57.1% | 70.7% |
| Legal | 2 | 2 | 0 | 50.0% | 64.6% |
| Pathophysiology | 12 | 14 | 1 | 44.4% | 76.1% |
| Pharmacology | 10 | 3 | 0 | 76.9% | 83.3% |
| Prevention | 6 | 5 | 1 | 50.0% | 75.6% |
| Prognosis | 5 | 2 | 0 | 71.4% | 80.8% |
| Risk | 3 | 2 | 0 | 60.0% | 85.2% |
| Tests | 19 | 7 | 1 | 70.4% | 77.9% |
| Treatment | 49 | 30 | 2 | 60.5% | 77.3% |

# Answer Correct Status
1 B B
2 D A
3 C C
4 A B
5 A A
6 C C
7 D C
8 C A
9 A A
10 B D
11 D D
12 D D
13 A B
14 B D
15 C Annulled
16 B B
17 A B
18 A A
19 C C
20 A A
21 C B
22 D D
23 A C
24 D D
25 C C
26 D Annulled
27 D C
28 B Annulled
29 D D
30 B B
31 A D
32 A A
33 D D
34 A D
35 B B
36 D D
37 A C
38 C C
39 D D
40 B A
41 D D
42 C C
43 A B
44 D D
45 B D
46 A A
47 A A
48 C A
49 D D
50 B B
51 D C
52 A B
53 D D
54 D B
55 C A
56 B Annulled
57 C C
58 B B
59 D D
60 A A
61 A A
62 D D
63 B B
64 D D
65 B A
66 A A
67 B B
68 B B
69 A B
70 C A
71 D D
72 C A
73 D D
74 B C
75 A A
76 B B
77 B B
78 B B
79 D C
80 C C
81 C C
82 D D
83 C B
84 D D
85 A C
86 C C
87 A A
88 D D
89 B B
90 D A
91 D B
92 D C
93 B B
94 C C
95 C A
96 C C
97 A D
98 A C
99 B A
100 C C
101 B B
102 A D
103 B A
104 C C
105 A A
106 C C
107 B B
108 D D
109 B B
110 C C
111 A A
112 C C
113 A B
114 A D
115 D
116 D C
117 A A
118 A D
119 A C
120 D B
121 A D
122 A C
123 C C
124 C C
125 D D
126 B D
127 A B
128 D D
129 B A
130 D D
131 D D
132 B A
133 B B
134 C C
135 C B
136 C C
137 A A
138 D D
139 D D
140 B B
141 A A
142 A A
143 B B
144 B B
145 D D
146 C C
147 A B
148 B A
149 C A
150 A D
151 A A
152 B A
153 B
154 B B
155 B B
156 C C
157 A A
158 D C
159 C C
160 C A
161 C A
162 A Annulled
163 D D
164 C C
165 A A
166 B B
167 C C
168 D D
169 B B
170 B B
171 C C
172 B A
173 A A
174 B B
175 C B
176 C C
177 C C
178 B A
179 D D
180 A A
181 D B
182 B C
183 D B
184 B B
185 D B
186 D Annulled
187 D C
188 D D
189 B D
190 A A
191 B B
192 A A
193 C C
194 A A
195 A A
196 A A
197 B B
198 C C
199 D D
200 C C
201 B B
202 B A
203 A D
204 C
205 D B
206 C D
207 A A
208 B C
209 C C
210 B B