MedicalBenchmark
Provider: Meta

Llama 3 70B Instruct

#239 of 320 models
MIR 2024

Net score: 144.66 pts
Accuracy: 79.0%
Correct / Incorrect: 158 / 40
Total Cost: $0.10

Overall Performance (vs. average)

Metric                  Model        Average
Accuracy                79.0%        81.3%
Net score               144.66 pts   153.08 pts
Correct                 158          163
Incorrect               40           29
Total Cost              $0.10        $3.09
Average response time   13.5s        17.7s
Output Tokens           72K          414K
Reasoning Tokens        0            296K
Average confidence      98.6%        95.7%
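
The headline figures above appear to be related by simple arithmetic. The sketch below rests on two assumptions that the page does not state: that accuracy is computed over 200 scored questions (158 correct, 40 incorrect, 2 unanswered), and that the net score awards one point per correct answer and deducts one third of a point per incorrect answer. The function names and the `scored_questions` value are illustrative only.

```python
from fractions import Fraction


def accuracy(correct: int, scored_questions: int) -> float:
    """Percentage of scored questions answered correctly."""
    return 100 * correct / scored_questions


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> float:
    """Correct answers minus a fractional penalty per incorrect answer.

    The 1/3 penalty is an assumption, not something the page states.
    """
    return float(correct - penalty * incorrect)


correct, incorrect = 158, 40
scored_questions = 200  # assumption: not given explicitly on the page

print(f"Accuracy:  {accuracy(correct, scored_questions):.1f}%")
print(f"Net score: {net_score(correct, incorrect):.4f} pts")
```

Under these assumptions the accuracy comes out at 79.0% and the net score at 144.6667, which agrees with the reported 144.66 pts if the page truncates rather than rounds.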

Subject Breakdown

Subject                           Correct  Incorrect  Unanswered  Accuracy  Average
Allergology                       3        0          0           100.0%    90.8%
Anesthesiology and Resuscitation  3        1          0           75.0%     87.7%
Cardiology                        19       2          0           90.5%     80.4%
Dermatology                       11       3          0           78.6%     81.0%
Endocrinology and Nutrition       15       3          1           78.9%     85.1%
ENT                               6        1          0           85.7%     75.1%
Epidemiology                      8        0          0           100.0%    89.7%
Gastroenterology                  16       6          0           72.7%     71.5%
Genetics                          6        1          0           85.7%     87.1%
Geriatrics                        7        3          0           70.0%     87.7%
Gynecology and Obstetrics         11       3          0           78.6%     82.0%
Health Planning and Management    1        0          1           50.0%     75.1%
Hematology                        9        4          0           69.2%     82.4%
Immunology                        6        1          1           75.0%     89.7%
Infectious Diseases               19       4          0           82.6%     82.5%
Legal Medicine and Bioethics      2        0          0           100.0%    91.8%
Medical Oncology                  17       4          0           81.0%     80.9%
Nephrology                        9        4          0           69.2%     81.8%
Neurology                         20       2          0           90.9%     84.5%
Ophthalmology                     4        1          0           80.0%     81.3%
Palliative Care                   3        1          0           75.0%     88.6%
Pediatrics                        15       2          0           88.2%     82.9%
Pharmacology                      20       3          0           87.0%     85.8%
Psychiatry                        9        1          0           90.0%     90.0%
Pulmonology                       13       5          1           68.4%     81.6%
Radiology-Emergency               9        5          0           64.3%     66.0%
Rheumatology                      10       4          0           71.4%     82.4%
Statistics                        3        0          0           100.0%    91.6%
Traumatology                      11       4          0           73.3%     75.4%
Urology                           5        1          0           83.3%     79.0%
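
The per-subject accuracies are consistent with unanswered questions counting in the denominator, that is, accuracy = correct / (correct + incorrect + unanswered). This formula is inferred from the figures rather than stated by the source, and the function name is illustrative. A minimal sketch that checks it against the four subjects with an unanswered question:

```python
def category_accuracy(correct: int, incorrect: int, unanswered: int) -> float:
    """Accuracy in percent, rounded to one decimal place.

    Assumes unanswered questions count against accuracy (inferred, not
    documented by the source).
    """
    total = correct + incorrect + unanswered
    if total == 0:
        raise ValueError("category has no questions")
    return round(100 * correct / total, 1)


# (correct, incorrect, unanswered, accuracy reported on the page)
rows = {
    "Endocrinology and Nutrition": (15, 3, 1, 78.9),
    "Health Planning and Management": (1, 0, 1, 50.0),
    "Immunology": (6, 1, 1, 75.0),
    "Pulmonology": (13, 5, 1, 68.4),
}

for name, (c, i, u, reported) in rows.items():
    computed = category_accuracy(c, i, u)
    print(f"{name}: computed {computed}%, reported {reported}%")
```

Dropping the unanswered question from the denominator would give, for example, 15 / 18 = 83.3% for Endocrinology and Nutrition, which does not match the reported 78.9%.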

Question Type Breakdown

Question type    Correct  Incorrect  Unanswered  Accuracy  Average
Anatomy          5        1          0           83.3%     81.1%
Biostatistics    4        0          1           80.0%     91.3%
Diagnosis        61       12         0           83.6%     80.0%
Epidemiology     11       1          0           91.7%     82.1%
Ethics           1        0          0           100.0%    94.0%
Interpretation   25       12         0           67.6%     70.5%
Pathophysiology  28       4          1           84.8%     86.1%
Pharmacology     20       5          0           80.0%     84.7%
Prevention       11       0          1           91.7%     90.3%
Prognosis        6        1          0           85.7%     84.6%
Risk             10       3          0           76.9%     84.5%
Tests            13       8          0           61.9%     75.0%
Treatment        53       18         0           74.6%     82.1%

# Answer Correct Status
1 B B
2 C D
3 D B
4 C C
5 D C
6 B B
7 D D
8 C C
9 B A
10 D D
11 D D
12 A A
13 C C
14 D A
15 B B
16 B A
17 C C
18 A A
19 B B
20 C C
21 D D
22 B B
23 A A
24 C A
25 A C
26 B B
27 D C
28 D A
29 B B
30 C C
31 D D
32 A A
33 C C
34 D B
35 D D
36 D D
37 A A
38 A
39 C C
40 B B
41 C C
42 D D
43 A A
44 D D
45 D D
46 B B
47 C C
48 C C
49 B B
50 C C
51 A A
52 D D
53 C C
54 C B
55 C C
56 D D
57 D A
58 A A
59 C A
60 A A
61 A A
62 D D
63 D D
64 B Annulled
65 D D
66 C C
67 A B
68 C Annulled
69 A A
70 B B
71 B B
72 D D
73 C B
74 C C
75 B B
76 A A
77 D D
78 C C
79 B B
80 A A
81 C C
82 A C
83 B B
84 C C
85 A A
86 A A
87 B B
88 D D
89 B B
90 A A
91 D D
92 A A
93 C C
94 B B
95 D D
96 B B
97 B B
98 D B
99 A A
100 B B
101 A A
102 D D
103 B B
104 D D
105 B B
106 C C
107 C C
108 B B
109 A D
110 D D
111 B B
112 C C
113 B Annulled
114 D D
115 D D
116 A A
117 D D
118 D D
119 C A
120 C C
121 A A
122 B B
123 C D
124 A D
125 A B
126 D D
127 A A
128 B B
129 D D
130 C C
131 A C
132 C D
133 C A
134 C C
135 A A
136 D D
137 A A
138 C C
139 A A
140 C C
141 B B
142 C C
143 B A
144 D D
145 D C
146 B C
147 C C
148 A A
149 A C
150 D D
151 A A
152 A A
153 D C
154 B B
155 D D
156 C C
157 C C
158 D D
159 D D
160 C B
161 B B
162 A B
163 B B
164 D B
165 A A
166 C C
167 D A
168 B B
169 D C
170 A A
171 D D
172 B B
173 D A
174 B B
175 A A
176 C C
177 C C
178 B B
179 C C
180 B Annulled
181 A B
182 D D
183 C C
184 C A
185 C C
186 D D
187 A A
188 C C
189 B D
190 D D
191 B B
192 B
193 C C
194 C C
195 C C
196 B B
197 A A
198 B B
199 D D
200 A A
201 B B
202 D D
203 B B
204 D D
205 B D
206 C Annulled
207 D A
208 A A
209 D B
210 A D