MedicalBenchmark
Provider: Meta

Llama 3.1 8B Instruct

#290 of 319 models (MIR 2025)

Net score: 53.00 pts
Accuracy: 39.5%
Correct / Incorrect: 79 / 78
Total cost: $0.02

Overall Performance (vs. average)

| Metric | Llama 3.1 8B Instruct | Average |
|---|---|---|
| Accuracy | 39.5% | 77.9% |
| Net score | 53.00 pts | 143.96 pts |
| Correct | 79 | 156 |
| Incorrect | 78 | 35 |
| Total cost | $0.02 | $3.36 |
| Average response time | 20.8s | 19.0s |
| Output tokens | 234K | 430K |
| Reasoning tokens | 0 | 306K |
| Average confidence | 77.5% | 95.2% |
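
The page does not say how the net score is computed. The reported figures are consistent with the usual MIR marking rule: one point per correct answer, one third of a point deducted per incorrect answer, and nothing for unanswered questions. The sketch below checks that arithmetic. The function names are hypothetical, and the 200-question denominator is an assumption inferred from 79 correct answers at 39.5% accuracy.

```python
from fractions import Fraction


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> float:
    """Net score assuming +1 per correct answer and -penalty per incorrect one.

    Unanswered questions contribute nothing (assumed, not stated on the page).
    """
    return float(correct - incorrect * penalty)


def accuracy_pct(correct: int, scored_questions: int) -> float:
    """Accuracy as a percentage of all scored questions."""
    return 100 * correct / scored_questions


# Figures reported for Llama 3.1 8B Instruct on MIR 2025.
print(net_score(79, 78))      # 79 - 78/3
print(accuracy_pct(79, 200))  # 200 scored questions is an assumption
```

Under these assumptions the two calls reproduce the 53.00 pts and 39.5% shown for this model.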

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average accuracy |
|---|---|---|---|---|---|
| Allergology | 1 | 3 | 0 | 25.0% | 87.9% |
| Anesthesiology and Resuscitation | 1 | 5 | 0 | 16.7% | 82.3% |
| Cardiology | 8 | 8 | 6 | 36.4% | 78.6% |
| Dermatology | 9 | 2 | 1 | 75.0% | 69.4% |
| Endocrinology and Nutrition | 6 | 8 | 2 | 37.5% | 83.5% |
| ENT | 4 | 1 | 3 | 50.0% | 74.8% |
| Epidemiology | 2 | 5 | 0 | 28.6% | 69.1% |
| Gastroenterology | 7 | 9 | 5 | 33.3% | 74.1% |
| Genetics | 0 | 3 | 3 | 0.0% | 69.5% |
| Geriatrics | 7 | 3 | 1 | 63.6% | 77.5% |
| Gynecology and Obstetrics | 9 | 7 | 3 | 47.4% | 86.7% |
| Health Planning and Management | 1 | 1 | 0 | 50.0% | 82.6% |
| Hematology | 5 | 2 | 4 | 45.5% | 82.7% |
| Immunology | 5 | 2 | 2 | 55.6% | 83.3% |
| Infectious Diseases | 11 | 9 | 7 | 40.7% | 74.9% |
| Legal Medicine and Bioethics | 4 | 1 | 0 | 80.0% | 68.4% |
| Medical Oncology | 16 | 6 | 3 | 64.0% | 87.2% |
| Nephrology | 6 | 4 | 4 | 42.9% | 84.8% |
| Neurology | 8 | 6 | 6 | 40.0% | 77.3% |
| Ophthalmology | 2 | 2 | 1 | 40.0% | 74.2% |
| Palliative Care | 2 | 2 | 0 | 50.0% | 78.6% |
| Pediatrics | 14 | 5 | 7 | 53.8% | 71.9% |
| Pharmacology | 8 | 6 | 3 | 47.1% | 74.1% |
| Psychiatry | 5 | 3 | 0 | 62.5% | 83.0% |
| Pulmonology | 2 | 8 | 4 | 14.3% | 80.4% |
| Radiology-Emergency | 2 | 9 | 3 | 14.3% | 69.4% |
| Rheumatology | 7 | 6 | 2 | 46.7% | 76.6% |
| Statistics | 1 | 2 | 0 | 33.3% | 76.6% |
| Traumatology | 6 | 8 | 4 | 33.3% | 79.3% |
| Urology | 4 | 2 | 1 | 57.1% | 80.7% |

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average accuracy |
|---|---|---|---|---|---|
| Anatomy | 3 | 3 | 1 | 42.9% | 78.6% |
| Biostatistics | 2 | 2 | 0 | 50.0% | 79.8% |
| Diagnosis | 32 | 35 | 21 | 36.4% | 79.9% |
| Epidemiology | 2 | 3 | 0 | 40.0% | 76.7% |
| Ethics | 2 | 1 | 0 | 66.7% | 74.1% |
| Interpretation | 10 | 18 | 14 | 23.8% | 70.7% |
| Legal | 3 | 1 | 0 | 75.0% | 64.6% |
| Pathophysiology | 9 | 9 | 9 | 33.3% | 76.1% |
| Pharmacology | 7 | 6 | 0 | 53.8% | 83.3% |
| Prevention | 5 | 5 | 2 | 41.7% | 75.6% |
| Prognosis | 2 | 4 | 1 | 28.6% | 80.8% |
| Risk | 2 | 2 | 1 | 40.0% | 85.2% |
| Tests | 8 | 15 | 4 | 29.6% | 77.9% |
| Treatment | 37 | 29 | 15 | 45.7% | 77.3% |
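
| # | Answer | Correct | Status |
|---|---|---|---|
| 1 |  | B |  |
| 2 |  | A |  |
| 3 |  | C |  |
| 4 |  | B |  |
| 5 |  | A |  |
| 6 |  | C |  |
| 7 |  | C |  |
| 8 | A | A |  |
| 9 | C | A |  |
| 10 |  | D |  |
| 11 | B | D |  |
| 12 | D | D |  |
| 13 | A | B |  |
| 14 | D | D |  |
| 15 |  |  | Annulled |
| 16 | C | B |  |
| 17 |  | B |  |
| 18 |  | A |  |
| 19 | C | C |  |
| 20 | B | A |  |
| 21 |  | B |  |
| 22 | C | D |  |
| 23 | A | C |  |
| 24 | D | D |  |
| 25 | C | C |  |
| 26 | C |  | Annulled |
| 27 | C | C |  |
| 28 | D |  | Annulled |
| 29 | C | D |  |
| 30 |  | B |  |
| 31 |  | D |  |
| 32 |  | A |  |
| 33 |  | D |  |
| 34 | A | D |  |
| 35 | B | B |  |
| 36 | D | D |  |
| 37 | D | C |  |
| 38 | B | C |  |
| 39 | D | D |  |
| 40 | B | A |  |
| 41 | D | D |  |
| 42 | C | C |  |
| 43 | C | B |  |
| 44 |  | D |  |
| 45 | B | D |  |
| 46 | A | A |  |
| 47 | A | A |  |
| 48 | D | A |  |
| 49 | D | D |  |
| 50 | B | B |  |
| 51 | D | C |  |
| 52 | C | B |  |
| 53 | D | D |  |
| 54 | D | B |  |
| 55 | A | A |  |
| 56 |  |  | Annulled |
| 57 | C | C |  |
| 58 | A | B |  |
| 59 | D | D |  |
| 60 | B | A |  |
| 61 | C | A |  |
| 62 | D | D |  |
| 63 | B | B |  |
| 64 | D | D |  |
| 65 |  | A |  |
| 66 |  | A |  |
| 67 | B | B |  |
| 68 |  | B |  |
| 69 | C | B |  |
| 70 | A | A |  |
| 71 | D | D |  |
| 72 | C | A |  |
| 73 | D | D |  |
| 74 | D | C |  |
| 75 | A | A |  |
| 76 | B | B |  |
| 77 | B | B |  |
| 78 | B | B |  |
| 79 |  | C |  |
| 80 | C | C |  |
| 81 | C | C |  |
| 82 | C | D |  |
| 83 | D | B |  |
| 84 | D | D |  |
| 85 | A | C |  |
| 86 | B | C |  |
| 87 | B | A |  |
| 88 | D | D |  |
| 89 | B | B |  |
| 90 | D | A |  |
| 91 | D | B |  |
| 92 |  | C |  |
| 93 | B | B |  |
| 94 | D | C |  |
| 95 | B | A |  |
| 96 | C | C |  |
| 97 | A | D |  |
| 98 |  | C |  |
| 99 |  | A |  |
| 100 |  | C |  |
| 101 |  | B |  |
| 102 | C | D |  |
| 103 | B | A |  |
| 104 | C | C |  |
| 105 | A | A |  |
| 106 | D | C |  |
| 107 | B | B |  |
| 108 |  | D |  |
| 109 | D | B |  |
| 110 |  | C |  |
| 111 | B | A |  |
| 112 | C | C |  |
| 113 | A | B |  |
| 114 |  | D |  |
| 115 |  | D |  |
| 116 |  | C |  |
| 117 | C | A |  |
| 118 | C | D |  |
| 119 | D | C |  |
| 120 | A | B |  |
| 121 | C | D |  |
| 122 | C | C |  |
| 123 | C | C |  |
| 124 | C | C |  |
| 125 | C | D |  |
| 126 | B | D |  |
| 127 | A | B |  |
| 128 | D | D |  |
| 129 | B | A |  |
| 130 |  | D |  |
| 131 |  | D |  |
| 132 | C | A |  |
| 133 | B | B |  |
| 134 |  | C |  |
| 135 | C | B |  |
| 136 | C | C |  |
| 137 | C | A |  |
| 138 | D | D |  |
| 139 | D | D |  |
| 140 | C | B |  |
| 141 | A | A |  |
| 142 | A | A |  |
| 143 | B | B |  |
| 144 | B | B |  |
| 145 | B | D |  |
| 146 |  | C |  |
| 147 | B | B |  |
| 148 |  | A |  |
| 149 | C | A |  |
| 150 |  | D |  |
| 151 | A | A |  |
| 152 | A | A |  |
| 153 |  | B |  |
| 154 | A | B |  |
| 155 | B | B |  |
| 156 | C | C |  |
| 157 | B | A |  |
| 158 | D | C |  |
| 159 | A | C |  |
| 160 |  | A |  |
| 161 | D | A |  |
| 162 |  |  | Annulled |
| 163 | D | D |  |
| 164 |  | C |  |
| 165 | C | A |  |
| 166 | B | B |  |
| 167 | A | C |  |
| 168 | B | D |  |
| 169 |  | B |  |
| 170 | A | B |  |
| 171 | B | C |  |
| 172 |  | A |  |
| 173 | B | A |  |
| 174 | B | B |  |
| 175 | B | B |  |
| 176 | C | C |  |
| 177 | C | C |  |
| 178 | B | A |  |
| 179 | C | D |  |
| 180 | A | A |  |
| 181 | B | B |  |
| 182 | C | C |  |
| 183 | D | B |  |
| 184 | B | B |  |
| 185 | B | B |  |
| 186 | D |  | Annulled |
| 187 | C | C |  |
| 188 | C | D |  |
| 189 | B | D |  |
| 190 | B | A |  |
| 191 | C | B |  |
| 192 | B | A |  |
| 193 | A | C |  |
| 194 | B | A |  |
| 195 | A | A |  |
| 196 | A | A |  |
| 197 | B | B |  |
| 198 | C | C |  |
| 199 | D | D |  |
| 200 | C | C |  |
| 201 |  | B |  |
| 202 |  | A |  |
| 203 | D | D |  |
| 204 | B | C |  |
| 205 | D | B |  |
| 206 | D | D |  |
| 207 | A | A |  |
| 208 |  | C |  |
| 209 | C | C |  |
| 210 |  | B |  |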