MedicalBenchmark
Provider: Meta

Llama 3 8B Instruct

Rank: #260 of 291 models (MIR 2024)

Net score: 65.33 pts
Accuracy: 46.5%
Correct / Incorrect: 93 / 83
Total cost: $0.01

Overall Performance (vs. average)

| Metric | Llama 3 8B Instruct | Average |
|---|---|---|
| Accuracy | 46.5% | 80.5% |
| Net score | 65.33 pts | 150.85 pts |
| Correct | 93 | 161 |
| Incorrect | 83 | 30 |
| Total cost | $0.01 | $3.32 |
| Average response time | 13.1s | 16.4s |
| Output tokens | 100K | 427K |
| Reasoning tokens | 0 | 310K |
| Average confidence | 84.2% | 95.4% |
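
The headline figures are arithmetically consistent with a simple negative-marking rule: 93 - 83/3 is about 65.33, and 93 correct answers out of 200 scored questions gives 46.5%. The page does not state the formula, so the one-third penalty and the 200-question denominator in this sketch are assumptions inferred from the numbers, and `net_score` and `accuracy` are illustrative helper names.

```python
from fractions import Fraction


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> float:
    """Correct answers minus a fixed penalty per wrong answer.

    The 1/3 penalty is an assumption that matches the reported figures;
    it is not stated on the page itself.
    """
    return float(correct - penalty * incorrect)


def accuracy(correct: int, total_scored: int) -> float:
    """Percentage of scored questions answered correctly."""
    return 100 * correct / total_scored


correct, incorrect = 93, 83
total_scored = 200  # assumption: implied by 93 correct = 46.5%

print(round(net_score(correct, incorrect), 2))    # 65.33
print(round(accuracy(correct, total_scored), 1))  # 46.5
```

Under the same assumptions, unanswered questions (200 - 93 - 83 = 24) neither add to nor subtract from the net score.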

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Allergology | 2 | 0 | 1 | 66.7% | 90.5% |
| Anesthesiology and Resuscitation | 2 | 1 | 1 | 50.0% | 87.1% |
| Cardiology | 10 | 7 | 4 | 47.6% | 79.7% |
| Dermatology | 3 | 7 | 4 | 21.4% | 80.2% |
| Endocrinology and Nutrition | 8 | 7 | 4 | 42.1% | 84.2% |
| ENT | 3 | 4 | 0 | 42.9% | 74.4% |
| Epidemiology | 5 | 3 | 0 | 62.5% | 89.3% |
| Gastroenterology | 9 | 10 | 3 | 40.9% | 70.5% |
| Genetics | 5 | 0 | 2 | 71.4% | 86.5% |
| Geriatrics | 6 | 3 | 1 | 60.0% | 86.9% |
| Gynecology and Obstetrics | 9 | 5 | 0 | 64.3% | 81.2% |
| Health Planning and Management | 1 | 0 | 1 | 50.0% | 73.2% |
| Hematology | 7 | 4 | 2 | 53.8% | 81.5% |
| Immunology | 4 | 3 | 1 | 50.0% | 89.1% |
| Infectious Diseases | 7 | 11 | 5 | 30.4% | 81.8% |
| Legal Medicine and Bioethics | 1 | 1 | 0 | 50.0% | 91.7% |
| Medical Oncology | 12 | 8 | 1 | 57.1% | 80.2% |
| Nephrology | 4 | 7 | 2 | 30.8% | 80.8% |
| Neurology | 12 | 9 | 1 | 54.5% | 83.7% |
| Ophthalmology | 2 | 3 | 0 | 40.0% | 80.0% |
| Palliative Care | 1 | 3 | 0 | 25.0% | 88.2% |
| Pediatrics | 8 | 7 | 2 | 47.1% | 82.0% |
| Pharmacology | 12 | 6 | 5 | 52.2% | 85.4% |
| Psychiatry | 8 | 1 | 1 | 80.0% | 89.5% |
| Pulmonology | 11 | 7 | 1 | 57.9% | 80.6% |
| Radiology-Emergency | 9 | 5 | 0 | 64.3% | 64.9% |
| Rheumatology | 6 | 5 | 3 | 42.9% | 81.4% |
| Statistics | 1 | 2 | 0 | 33.3% | 91.1% |
| Traumatology | 4 | 10 | 1 | 26.7% | 74.5% |
| Urology | 2 | 4 | 0 | 33.3% | 78.2% |
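
The per-subject accuracies appear to count unanswered questions in the denominator, for example Cardiology: 10 / (10 + 7 + 4) is about 47.6%. A minimal sketch under that assumption, using three rows copied from the breakdown; the function name `category_accuracy` and the tuple layout are hypothetical.

```python
def category_accuracy(correct: int, incorrect: int, unanswered: int) -> float:
    """Accuracy in percent, with unanswered questions counted as misses.

    Assumption: the denominator is correct + incorrect + unanswered,
    which reproduces the percentages shown in the breakdown.
    """
    total = correct + incorrect + unanswered
    return 100 * correct / total if total else 0.0


# (correct, incorrect, unanswered, reported average accuracy in %)
rows = {
    "Allergology": (2, 0, 1, 90.5),
    "Cardiology": (10, 7, 4, 79.7),
    "Psychiatry": (8, 1, 1, 89.5),
}

for subject, (c, i, u, avg) in rows.items():
    acc = category_accuracy(c, i, u)
    # Gap to the reported average, in percentage points.
    print(f"{subject}: {acc:.1f}% ({acc - avg:+.1f} pts vs. average)")
```

Because a question can apparently carry more than one subject label, the per-subject counts do not sum to the overall totals.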

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Anatomy | 2 | 4 | 0 | 33.3% | 79.8% |
| Biostatistics | 2 | 2 | 1 | 40.0% | 90.7% |
| Diagnosis | 39 | 29 | 5 | 53.4% | 79.2% |
| Epidemiology | 6 | 5 | 1 | 50.0% | 81.2% |
| Ethics | 0 | 1 | 0 | 0.0% | 94.5% |
| Interpretation | 15 | 19 | 3 | 40.5% | 69.6% |
| Pathophysiology | 12 | 13 | 8 | 36.4% | 85.4% |
| Pharmacology | 10 | 8 | 7 | 40.0% | 84.0% |
| Prevention | 6 | 4 | 2 | 50.0% | 89.8% |
| Prognosis | 7 | 0 | 0 | 100.0% | 83.9% |
| Risk | 8 | 2 | 3 | 61.5% | 83.6% |
| Tests | 12 | 7 | 2 | 57.1% | 73.9% |
| Treatment | 30 | 34 | 7 | 42.3% | 81.3% |
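| # | Answer | Correct | Status |
|---|---|---|---|
| 1 | B | B | |
| 2 | D | D | |
| 3 | B | B | |
| 4 | | C | |
| 5 | C | C | |
| 6 | B | B | |
| 7 | D | D | |
| 8 | C | C | |
| 9 | C | A | |
| 10 | D | D | |
| 11 | D | D | |
| 12 | A | A | |
| 13 | | C | |
| 14 | D | A | |
| 15 | D | B | |
| 16 | | A | |
| 17 | C | C | |
| 18 | A | A | |
| 19 | C | B | |
| 20 | A | C | |
| 21 | C | D | |
| 22 | C | B | |
| 23 | B | A | |
| 24 | B | A | |
| 25 | A | C | |
| 26 | | B | |
| 27 | A | C | |
| 28 | | A | |
| 29 | A | B | |
| 30 | B | C | |
| 31 | C | D | |
| 32 | C | A | |
| 33 | | C | |
| 34 | B | B | |
| 35 | D | D | |
| 36 | B | D | |
| 37 | C | A | |
| 38 | | A | |
| 39 | C | C | |
| 40 | B | B | |
| 41 | B | C | |
| 42 | | D | |
| 43 | D | A | |
| 44 | A | D | |
| 45 | A | D | |
| 46 | B | B | |
| 47 | C | C | |
| 48 | C | C | |
| 49 | B | B | |
| 50 | | C | |
| 51 | A | A | |
| 52 | D | D | |
| 53 | | C | |
| 54 | B | B | |
| 55 | C | C | |
| 56 | D | D | |
| 57 | D | A | |
| 58 | C | A | |
| 59 | B | A | |
| 60 | A | A | |
| 61 | B | A | |
| 62 | C | D | |
| 63 | C | D | |
| 64 | B | | Annulled |
| 65 | D | D | |
| 66 | C | C | |
| 67 | C | B | |
| 68 | C | | Annulled |
| 69 | A | A | |
| 70 | B | B | |
| 71 | B | B | |
| 72 | C | D | |
| 73 | D | B | |
| 74 | C | C | |
| 75 | B | B | |
| 76 | B | A | |
| 77 | B | D | |
| 78 | B | C | |
| 79 | C | B | |
| 80 | A | A | |
| 81 | C | C | |
| 82 | C | C | |
| 83 | B | B | |
| 84 | C | C | |
| 85 | | A | |
| 86 | A | A | |
| 87 | C | B | |
| 88 | D | D | |
| 89 | B | B | |
| 90 | A | A | |
| 91 | D | D | |
| 92 | D | A | |
| 93 | | C | |
| 94 | B | B | |
| 95 | C | D | |
| 96 | B | B | |
| 97 | A | B | |
| 98 | C | B | |
| 99 | A | A | |
| 100 | D | B | |
| 101 | D | A | |
| 102 | D | D | |
| 103 | B | B | |
| 104 | C | D | |
| 105 | D | B | |
| 106 | C | C | |
| 107 | | C | |
| 108 | B | B | |
| 109 | A | D | |
| 110 | D | D | |
| 111 | | B | |
| 112 | B | C | |
| 113 | A | | Annulled |
| 114 | D | D | |
| 115 | | D | |
| 116 | | A | |
| 117 | D | D | |
| 118 | D | D | |
| 119 | C | A | |
| 120 | C | C | |
| 121 | A | A | |
| 122 | B | B | |
| 123 | C | D | |
| 124 | D | D | |
| 125 | C | B | |
| 126 | A | D | |
| 127 | A | A | |
| 128 | C | B | |
| 129 | D | D | |
| 130 | | C | |
| 131 | B | C | |
| 132 | D | D | |
| 133 | A | A | |
| 134 | C | C | |
| 135 | B | A | |
| 136 | C | D | |
| 137 | C | A | |
| 138 | C | C | |
| 139 | A | A | |
| 140 | B | C | |
| 141 | B | B | |
| 142 | D | C | |
| 143 | C | A | |
| 144 | C | D | |
| 145 | D | C | |
| 146 | B | C | |
| 147 | C | C | |
| 148 | A | A | |
| 149 | C | C | |
| 150 | D | D | |
| 151 | A | A | |
| 152 | A | A | |
| 153 | B | C | |
| 154 | B | B | |
| 155 | D | D | |
| 156 | B | C | |
| 157 | A | C | |
| 158 | D | D | |
| 159 | D | D | |
| 160 | A | B | |
| 161 | B | B | |
| 162 | B | B | |
| 163 | B | B | |
| 164 | A | B | |
| 165 | C | A | |
| 166 | D | C | |
| 167 | D | A | |
| 168 | | B | |
| 169 | A | C | |
| 170 | C | A | |
| 171 | | D | |
| 172 | B | B | |
| 173 | | A | |
| 174 | B | B | |
| 175 | A | A | |
| 176 | B | C | |
| 177 | A | C | |
| 178 | B | B | |
| 179 | C | C | |
| 180 | B | | Annulled |
| 181 | B | B | |
| 182 | B | D | |
| 183 | B | C | |
| 184 | B | A | |
| 185 | C | C | |
| 186 | D | D | |
| 187 | B | A | |
| 188 | | C | |
| 189 | A | D | |
| 190 | D | D | |
| 191 | B | B | |
| 192 | | B | |
| 193 | D | C | |
| 194 | D | C | |
| 195 | A | C | |
| 196 | B | B | |
| 197 | C | A | |
| 198 | B | B | |
| 199 | C | D | |
| 200 | B | A | |
| 201 | | B | |
| 202 | D | D | |
| 203 | B | B | |
| 204 | | D | |
| 205 | B | D | |
| 206 | C | | Annulled |
| 207 | A | A | |
| 208 | C | A | |
| 209 | B | B | |
| 210 | C | D | |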
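
Per-question rows like those above can be tallied into correct, incorrect, unanswered and annulled counts. The sketch below is illustrative only: it assumes that a row showing a single letter is an unanswered question whose letter is the answer key, and that annulled rows carry only the model's answer. Neither assumption is stated on the page, and the tuple layout and the `classify` helper are hypothetical.

```python
from collections import Counter
from typing import Optional


def classify(model_answer: Optional[str], answer_key: Optional[str],
             annulled: bool = False) -> str:
    """Label one question as annulled, unanswered, correct or incorrect."""
    if annulled:
        return "annulled"
    if not model_answer:
        return "unanswered"
    return "correct" if model_answer == answer_key else "incorrect"


# (question number, model answer, answer key, annulled flag)
# A small sample of rows; column placement for rows 4 and 64 is assumed.
sample = [
    (1, "B", "B", False),
    (4, None, "C", False),
    (9, "C", "A", False),
    (64, "B", None, True),
]

tally = Counter(classify(answer, key, annulled)
                for _, answer, key, annulled in sample)
print(dict(tally))
```

Annulled questions are checked first so that they never count toward the correct or incorrect totals, whatever the model answered.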