MedicalBenchmark
Llama 3 8B Instruct (provider: Meta)

Rank: #300 of 319 models, MIR 2025

Net score: 38.33 pts
Accuracy: 36.0%
Correct / Incorrect: 72 / 101
Total cost: $0.01
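The net score and accuracy above are consistent with a penalty-based scoring rule. The sketch below is not taken from this page: it assumes that each incorrect answer costs one third of a point, that unanswered questions score zero, and that accuracy is computed over 200 scored questions. Under those assumptions it reproduces the reported 38.33 pts and 36.0%. The function names and the `PENALTY` constant are illustrative.

```python
from fractions import Fraction

# Assumption: each incorrect answer subtracts 1/3 of a point.
# The page does not state the rule; it is inferred from the reported totals.
PENALTY = Fraction(1, 3)


def net_score(correct: int, incorrect: int) -> float:
    """Correct answers minus the penalty for incorrect ones; blanks score zero."""
    return float(correct - PENALTY * incorrect)


def accuracy(correct: int, scored_questions: int) -> float:
    """Share of scored questions answered correctly, as a percentage."""
    return 100 * correct / scored_questions


if __name__ == "__main__":
    # 72 correct and 101 incorrect are the figures reported above;
    # 200 scored questions is an assumption that matches the 36.0% accuracy.
    print(f"Net score: {net_score(72, 101):.2f} pts")
    print(f"Accuracy:  {accuracy(72, 200):.1f}%")
```

Using `Fraction` for the penalty keeps the subtraction exact until the final conversion to a float.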

Overall Performance (vs. average)

| Metric | This model | Average |
|---|---|---|
| Accuracy | 36.0% | 77.9% |
| Net score | 38.33 pts | 143.96 pts |
| Correct | 72 | 156 |
| Incorrect | 101 | 35 |
| Total cost | $0.01 | $3.36 |
| Average response time | 10.7s | 19.0s |
| Output tokens | 139K | 430K |
| Reasoning tokens | 0 | 306K |
| Average confidence | 86.7% | 95.2% |

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Allergology | 1 | 2 | 1 | 25.0% | 87.9% |
| Anesthesiology and Resuscitation | 3 | 3 | 0 | 50.0% | 82.3% |
| Cardiology | 8 | 12 | 2 | 36.4% | 78.6% |
| Dermatology | 5 | 6 | 1 | 41.7% | 69.4% |
| Endocrinology and Nutrition | 11 | 5 | 0 | 68.8% | 83.5% |
| ENT | 3 | 3 | 2 | 37.5% | 74.8% |
| Epidemiology | 1 | 4 | 2 | 14.3% | 69.1% |
| Gastroenterology | 6 | 10 | 5 | 28.6% | 74.1% |
| Genetics | 3 | 3 | 0 | 50.0% | 69.5% |
| Geriatrics | 3 | 8 | 0 | 27.3% | 77.5% |
| Gynecology and Obstetrics | 9 | 7 | 3 | 47.4% | 86.7% |
| Health Planning and Management | 1 | 1 | 0 | 50.0% | 82.6% |
| Hematology | 9 | 2 | 0 | 81.8% | 82.7% |
| Immunology | 3 | 4 | 2 | 33.3% | 83.3% |
| Infectious Diseases | 12 | 10 | 5 | 44.4% | 74.9% |
| Legal Medicine and Bioethics | 1 | 4 | 0 | 20.0% | 68.4% |
| Medical Oncology | 10 | 12 | 3 | 40.0% | 87.2% |
| Nephrology | 9 | 4 | 1 | 64.3% | 84.8% |
| Neurology | 7 | 9 | 4 | 35.0% | 77.3% |
| Ophthalmology | 1 | 2 | 2 | 20.0% | 74.2% |
| Palliative Care | 0 | 4 | 0 | 0.0% | 78.6% |
| Pediatrics | 9 | 14 | 3 | 34.6% | 71.9% |
| Pharmacology | 7 | 9 | 1 | 41.2% | 74.1% |
| Psychiatry | 3 | 3 | 2 | 37.5% | 83.0% |
| Pulmonology | 3 | 9 | 2 | 21.4% | 80.4% |
| Radiology-Emergency | 3 | 9 | 2 | 21.4% | 69.4% |
| Rheumatology | 6 | 8 | 1 | 40.0% | 76.6% |
| Statistics | 0 | 1 | 2 | 0.0% | 76.6% |
| Traumatology | 5 | 11 | 2 | 27.8% | 79.3% |
| Urology | 4 | 2 | 1 | 57.1% | 80.7% |
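The per-subject accuracy figures appear to be computed as correct / (correct + incorrect + unanswered); for example, Allergology gives 1 / 4 = 25.0%. The sketch below assumes that formula, which the page does not state, and ranks a few subjects by their gap to the all-model average. `SubjectResult` and `rank_by_gap` are illustrative names, and the four rows are copied from the breakdown above.

```python
from typing import List, NamedTuple


class SubjectResult(NamedTuple):
    name: str
    correct: int
    incorrect: int
    unanswered: int
    average: float  # average accuracy across all models, in percent

    @property
    def accuracy(self) -> float:
        # Assumption: unanswered questions count in the denominator.
        total = self.correct + self.incorrect + self.unanswered
        return 100 * self.correct / total if total else 0.0

    @property
    def gap(self) -> float:
        # Negative values mean the model is below the all-model average.
        return self.accuracy - self.average


ROWS = [
    SubjectResult("Hematology", 9, 2, 0, 82.7),
    SubjectResult("Endocrinology and Nutrition", 11, 5, 0, 83.5),
    SubjectResult("Cardiology", 8, 12, 2, 78.6),
    SubjectResult("Palliative Care", 0, 4, 0, 78.6),
]


def rank_by_gap(rows: List[SubjectResult]) -> List[SubjectResult]:
    """Order subjects from the smallest shortfall to the largest."""
    return sorted(rows, key=lambda r: r.gap, reverse=True)


if __name__ == "__main__":
    for row in rank_by_gap(ROWS):
        print(f"{row.name:30s} {row.accuracy:5.1f}%  gap {row.gap:+.1f}")
```

Because the per-subject correct counts sum to more than the 72 correct answers reported overall, a question seems to be able to count toward more than one subject, so these rows should not be added up to recover the overall totals.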

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Anatomy | 3 | 2 | 2 | 42.9% | 78.6% |
| Biostatistics | 1 | 1 | 2 | 25.0% | 79.8% |
| Diagnosis | 34 | 42 | 12 | 38.6% | 79.9% |
| Epidemiology | 1 | 2 | 2 | 20.0% | 76.7% |
| Ethics | 1 | 2 | 0 | 33.3% | 74.1% |
| Interpretation | 9 | 26 | 7 | 21.4% | 70.7% |
| Legal | 1 | 3 | 0 | 25.0% | 64.6% |
| Pathophysiology | 8 | 15 | 4 | 29.6% | 76.1% |
| Pharmacology | 6 | 5 | 2 | 46.2% | 83.3% |
| Prevention | 5 | 5 | 2 | 41.7% | 75.6% |
| Prognosis | 2 | 3 | 2 | 28.6% | 80.8% |
| Risk | 3 | 2 | 0 | 60.0% | 85.2% |
| Tests | 14 | 10 | 3 | 51.9% | 77.9% |
| Treatment | 31 | 41 | 9 | 38.3% | 77.3% |

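
| # | Answer | Correct | Status |
|---|---|---|---|
| 1 | C | B | |
| 2 | B | A | |
| 3 | C | C | |
| 4 | A | B | |
| 5 | A | A | |
| 6 | | C | |
| 7 | D | C | |
| 8 | C | A | |
| 9 | | A | |
| 10 | A | D | |
| 11 | C | D | |
| 12 | | D | |
| 13 | A | B | |
| 14 | | D | |
| 15 | D | | Annulled |
| 16 | C | B | |
| 17 | A | B | |
| 18 | C | A | |
| 19 | C | C | |
| 20 | D | A | |
| 21 | C | B | |
| 22 | C | D | |
| 23 | A | C | |
| 24 | | D | |
| 25 | C | C | |
| 26 | A | | Annulled |
| 27 | D | C | |
| 28 | D | | Annulled |
| 29 | C | D | |
| 30 | B | B | |
| 31 | B | D | |
| 32 | | A | |
| 33 | D | D | |
| 34 | A | D | |
| 35 | | B | |
| 36 | D | D | |
| 37 | C | C | |
| 38 | B | C | |
| 39 | C | D | |
| 40 | | A | |
| 41 | | D | |
| 42 | C | C | |
| 43 | A | B | |
| 44 | D | D | |
| 45 | C | D | |
| 46 | A | A | |
| 47 | A | A | |
| 48 | A | A | |
| 49 | B | D | |
| 50 | B | B | |
| 51 | C | C | |
| 52 | | B | |
| 53 | | D | |
| 54 | D | B | |
| 55 | C | A | |
| 56 | C | | Annulled |
| 57 | C | C | |
| 58 | B | B | |
| 59 | D | D | |
| 60 | B | A | |
| 61 | C | A | |
| 62 | D | D | |
| 63 | C | B | |
| 64 | D | D | |
| 65 | B | A | |
| 66 | A | A | |
| 67 | B | B | |
| 68 | C | B | |
| 69 | A | B | |
| 70 | | A | |
| 71 | A | D | |
| 72 | C | A | |
| 73 | | D | |
| 74 | D | C | |
| 75 | A | A | |
| 76 | B | B | |
| 77 | B | B | |
| 78 | B | B | |
| 79 | A | C | |
| 80 | C | C | |
| 81 | C | C | |
| 82 | | D | |
| 83 | | B | |
| 84 | B | D | |
| 85 | D | C | |
| 86 | B | C | |
| 87 | A | A | |
| 88 | C | D | |
| 89 | B | B | |
| 90 | D | A | |
| 91 | D | B | |
| 92 | A | C | |
| 93 | B | B | |
| 94 | | C | |
| 95 | D | A | |
| 96 | D | C | |
| 97 | A | D | |
| 98 | A | C | |
| 99 | C | A | |
| 100 | B | C | |
| 101 | C | B | |
| 102 | B | D | |
| 103 | B | A | |
| 104 | C | C | |
| 105 | C | A | |
| 106 | D | C | |
| 107 | B | B | |
| 108 | C | D | |
| 109 | A | B | |
| 110 | A | C | |
| 111 | B | A | |
| 112 | A | C | |
| 113 | B | B | |
| 114 | A | D | |
| 115 | B | D | |
| 116 | B | C | |
| 117 | A | A | |
| 118 | D | D | |
| 119 | | C | |
| 120 | A | B | |
| 121 | | D | |
| 122 | C | C | |
| 123 | C | C | |
| 124 | B | C | |
| 125 | A | D | |
| 126 | B | D | |
| 127 | A | B | |
| 128 | D | D | |
| 129 | B | A | |
| 130 | D | D | |
| 131 | B | D | |
| 132 | A | A | |
| 133 | B | B | |
| 134 | C | C | |
| 135 | D | B | |
| 136 | C | C | |
| 137 | A | A | |
| 138 | A | D | |
| 139 | | D | |
| 140 | C | B | |
| 141 | B | A | |
| 142 | A | A | |
| 143 | C | B | |
| 144 | B | B | |
| 145 | D | D | |
| 146 | A | C | |
| 147 | B | B | |
| 148 | A | A | |
| 149 | C | A | |
| 150 | A | D | |
| 151 | C | A | |
| 152 | D | A | |
| 153 | B | B | |
| 154 | B | B | |
| 155 | B | B | |
| 156 | C | C | |
| 157 | A | A | |
| 158 | A | C | |
| 159 | B | C | |
| 160 | A | A | |
| 161 | D | A | |
| 162 | B | | Annulled |
| 163 | D | D | |
| 164 | C | C | |
| 165 | A | A | |
| 166 | B | B | |
| 167 | C | C | |
| 168 | | D | |
| 169 | | B | |
| 170 | A | B | |
| 171 | C | C | |
| 172 | B | A | |
| 173 | B | A | |
| 174 | B | B | |
| 175 | D | B | |
| 176 | C | C | |
| 177 | C | C | |
| 178 | B | A | |
| 179 | B | D | |
| 180 | C | A | |
| 181 | D | B | |
| 182 | B | C | |
| 183 | D | B | |
| 184 | B | B | |
| 185 | C | B | |
| 186 | A | | Annulled |
| 187 | D | C | |
| 188 | D | D | |
| 189 | B | D | |
| 190 | B | A | |
| 191 | C | B | |
| 192 | | A | |
| 193 | C | C | |
| 194 | A | A | |
| 195 | | A | |
| 196 | A | A | |
| 197 | C | B | |
| 198 | | C | |
| 199 | B | D | |
| 200 | | C | |
| 201 | | B | |
| 202 | B | A | |
| 203 | D | D | |
| 204 | | C | |
| 205 | B | B | |
| 206 | C | D | |
| 207 | A | A | |
| 208 | B | C | |
| 209 | C | C | |
| 210 | D | B | |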
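
Each row above pairs the model's answer with the answer key, so the correct, incorrect, and unanswered totals can be rebuilt by classifying rows one at a time. The sketch below is a hypothetical tally, not the benchmark's own code. It assumes that a row with a single letter and no "Annulled" mark is an unanswered question whose letter is the key, and that annulled questions are excluded from scoring. The sample holds rows 1, 3, 5, 6, and 15.

```python
from collections import Counter
from typing import Optional


def classify(answer: Optional[str], correct: Optional[str], annulled: bool = False) -> str:
    """Label one question as annulled, unanswered, correct, or incorrect."""
    if annulled:
        # Assumption: annulled questions do not count toward any total.
        return "annulled"
    if not answer:
        # Assumption: a missing model answer means the question was left blank.
        return "unanswered"
    return "correct" if answer == correct else "incorrect"


# (question number, model answer, answer key, annulled flag)
SAMPLE = [
    (1, "C", "B", False),
    (3, "C", "C", False),
    (5, "A", "A", False),
    (6, None, "C", False),
    (15, "D", None, True),
]

tally = Counter(classify(answer, key, annulled) for _, answer, key, annulled in SAMPLE)

if __name__ == "__main__":
    for label, count in sorted(tally.items()):
        print(f"{label}: {count}")
```

Checking the annulled flag first keeps a missing answer key from being scored as an incorrect answer.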