MedicalBenchmark
Llama 3.1 Euryale 70B v2.2 (provider: Sao10K)

Rank: #232 of 291 models (MIR 2024)

Net score: 125.00 pts
Accuracy: 70.0%
Correct / Incorrect: 140 / 45
Total Cost: $0.18

Overall Performance (vs. average)

Metric                 Value       Average
Accuracy               70.0%       80.5%
Net score              125.00 pts  150.85 pts
Correct                140         161
Incorrect              45          30
Total Cost             $0.18       $3.32
Average response time  21.1s       16.4s
Output Tokens          117K        427K
Reasoning Tokens       0           310K
Average confidence     92.7%       95.4%
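
The net score and accuracy reported above are consistent with the correct and incorrect counts under a simple penalty rule. The sketch below assumes +1 per correct answer, -1/3 per incorrect answer, nothing for unanswered questions, and 200 scored questions. Neither the penalty nor the question count is stated on this page; both are inferred from the figures (140 - 45/3 = 125 and 140/200 = 70.0%). The function names are illustrative.

```python
from fractions import Fraction

# Assumption: each incorrect answer costs 1/3 of a point (not stated on the page).
ASSUMED_PENALTY = Fraction(1, 3)
# Assumption: 200 questions count toward the score (not stated on the page).
ASSUMED_SCORED_QUESTIONS = 200


def net_score(correct: int, incorrect: int, penalty: Fraction = ASSUMED_PENALTY) -> float:
    """Return correct answers minus the penalty for incorrect ones."""
    return float(correct - incorrect * penalty)


def accuracy(correct: int, scored_questions: int = ASSUMED_SCORED_QUESTIONS) -> float:
    """Return the percentage of scored questions answered correctly."""
    return 100.0 * correct / scored_questions


print(f"{net_score(140, 45):.2f} pts")  # 125.00 pts
print(f"{accuracy(140):.1f}%")          # 70.0%
```

Under the same assumptions, the 15 scored questions that are neither correct nor incorrect (200 - 140 - 45) contribute nothing to the net score.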

Subject Breakdown

Subject                           Correct  Incorrect  Unanswered  Accuracy  Average
Allergology                       2        1          0           66.7%     90.5%
Anesthesiology and Resuscitation  4        0          0           100.0%    87.1%
Cardiology                        15       4          2           71.4%     79.7%
Dermatology                       10       3          1           71.4%     80.2%
Endocrinology and Nutrition       16       3          0           84.2%     84.2%
ENT                               5        1          1           71.4%     74.4%
Epidemiology                      6        0          2           75.0%     89.3%
Gastroenterology                  12       6          4           54.5%     70.5%
Genetics                          6        1          0           85.7%     86.5%
Geriatrics                        8        2          0           80.0%     86.9%
Gynecology and Obstetrics         9        4          1           64.3%     81.2%
Health Planning and Management    1        0          1           50.0%     73.2%
Hematology                        7        6          0           53.8%     81.5%
Immunology                        8        0          0           100.0%    89.1%
Infectious Diseases               16       6          1           69.6%     81.8%
Legal Medicine and Bioethics      1        1          0           50.0%     91.7%
Medical Oncology                  15       3          3           71.4%     80.2%
Nephrology                        9        4          0           69.2%     80.8%
Neurology                         17       5          0           77.3%     83.7%
Ophthalmology                     5        0          0           100.0%    80.0%
Palliative Care                   3        1          0           75.0%     88.2%
Pediatrics                        11       4          2           64.7%     82.0%
Pharmacology                      16       6          1           69.6%     85.4%
Psychiatry                        10       0          0           100.0%    89.5%
Pulmonology                       12       5          2           63.2%     80.6%
Radiology-Emergency               6        4          4           42.9%     64.9%
Rheumatology                      12       2          0           85.7%     81.4%
Statistics                        1        0          2           33.3%     91.1%
Traumatology                      9        3          3           60.0%     74.5%
Urology                           5        1          0           83.3%     78.2%
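
The per-subject accuracy figures above match correct answers divided by all questions in the subject, with unanswered questions counted in the denominator. The sketch below reproduces three rows from the breakdown and shows the gap to the listed average. The helper name and the choice of rows are illustrative, and the formula is inferred from the numbers rather than documented on the page.

```python
def subject_accuracy(correct: int, incorrect: int, unanswered: int) -> float:
    """Percentage of correct answers over all questions in the subject.

    Assumption: unanswered questions count in the denominator, which is
    what the figures in the breakdown imply.
    """
    total = correct + incorrect + unanswered
    return 100.0 * correct / total


# (correct, incorrect, unanswered, listed average accuracy), copied from the breakdown
rows = {
    "Cardiology": (15, 4, 2, 79.7),
    "Hematology": (7, 6, 0, 81.5),
    "Ophthalmology": (5, 0, 0, 80.0),
}

for name, (c, i, u, avg) in rows.items():
    acc = subject_accuracy(c, i, u)
    print(f"{name}: {acc:.1f}% ({acc - avg:+.1f} points vs. average)")
# Cardiology: 71.4% (-8.3 points vs. average)
# Hematology: 53.8% (-27.7 points vs. average)
# Ophthalmology: 100.0% (+20.0 points vs. average)
```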

Question Type Breakdown

Question Type    Correct  Incorrect  Unanswered  Accuracy  Average
Anatomy          4        1          1           66.7%     79.8%
Biostatistics    2        0          3           40.0%     90.7%
Diagnosis        54       13         6           74.0%     79.2%
Epidemiology     10       1          1           83.3%     81.2%
Ethics           0        1          0           0.0%      94.5%
Interpretation   18       13         6           48.6%     69.6%
Pathophysiology  27       6          0           81.8%     85.4%
Pharmacology     18       7          0           72.0%     84.0%
Prevention       9        1          2           75.0%     89.8%
Prognosis        6        1          0           85.7%     83.9%
Risk             11       2          0           84.6%     83.6%
Tests            12       5          4           57.1%     73.9%
Treatment        47       20         4           66.2%     81.3%

#    Answer  Correct  Status
1    B       B
2    B       D
3    D       B
4    C       C
5    -       C
6    B       B
7    D       D
8    C       C
9    C       A
10   D       D
11   -       D
12   A       A
13   D       C
14   B       A
15   -       B
16   B       A
17   C       C
18   B       A
19   B       B
20   D       C
21   -       D
22   -       B
23   A       A
24   -       A
25   C       C
26   B       B
27   C       C
28   D       A
29   A       B
30   -       C
31   D       D
32   A       A
33   C       C
34   D       B
35   D       D
36   D       D
37   A       A
38   A       A
39   C       C
40   B       B
41   -       C
42   B       D
43   A       A
44   -       D
45   -       D
46   B       B
47   C       C
48   C       C
49   B       B
50   A       C
51   A       A
52   C       D
53   C       C
54   B       B
55   C       C
56   D       D
57   A       A
58   A       A
59   A       A
60   A       A
61   C       A
62   D       D
63   D       D
64   D       -        Annulled
65   D       D
66   C       C
67   C       B
68   B       -        Annulled
69   A       A
70   B       B
71   B       B
72   C       D
73   C       B
74   C       C
75   -       B
76   A       A
77   D       D
78   C       C
79   A       B
80   A       A
81   C       C
82   C       C
83   B       B
84   C       C
85   A       A
86   A       A
87   B       B
88   D       D
89   B       B
90   A       A
91   D       D
92   D       A
93   A       C
94   B       B
95   D       D
96   B       B
97   B       B
98   B       B
99   -       A
100  B       B
101  A       A
102  D       D
103  B       B
104  D       D
105  C       B
106  C       C
107  C       C
108  B       B
109  D       D
110  D       D
111  B       B
112  C       C
113  D       -        Annulled
114  D       D
115  A       D
116  B       A
117  D       D
118  D       D
119  A       A
120  C       C
121  A       A
122  B       B
123  C       D
124  A       D
125  C       B
126  D       D
127  A       A
128  D       B
129  D       D
130  C       C
131  A       C
132  -       D
133  A       A
134  C       C
135  -       A
136  D       D
137  A       A
138  C       C
139  A       A
140  C       C
141  B       B
142  C       C
143  B       A
144  D       D
145  A       C
146  A       C
147  C       C
148  A       A
149  A       C
150  D       D
151  A       A
152  A       A
153  D       C
154  B       B
155  B       D
156  C       C
157  C       C
158  D       D
159  D       D
160  B       B
161  B       B
162  B       B
163  B       B
164  D       B
165  A       A
166  C       C
167  A       A
168  B       B
169  C       C
170  C       A
171  A       D
172  B       B
173  C       A
174  B       B
175  A       A
176  C       C
177  A       C
178  D       B
179  C       C
180  B       -        Annulled
181  C       B
182  B       D
183  C       C
184  A       A
185  C       C
186  D       D
187  A       A
188  C       C
189  D       D
190  D       D
191  B       B
192  -       B
193  D       C
194  C       C
195  C       C
196  B       B
197  A       A
198  B       B
199  C       D
200  A       A
201  B       B
202  D       D
203  B       B
204  D       D
205  B       D
206  C       -        Annulled
207  A       A
208  A       A
209  A       B
210  D       D
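
The per-question rows above can be tallied into the same categories used in the summary. The sketch below classifies a handful of rows under two assumptions that the page does not state: a row with a single letter and no status is an unanswered question whose letter is the official key, and in an Annulled row the letter is the model's answer. The function name and the tuple layout are illustrative.

```python
from collections import Counter
from typing import Optional


def classify(answer: Optional[str], correct: Optional[str], status: str = "") -> str:
    """Label one row as annulled, unanswered, correct, or incorrect."""
    if status == "Annulled":
        return "annulled"
    if answer is None:  # assumption: a missing model answer means unanswered
        return "unanswered"
    return "correct" if answer == correct else "incorrect"


# (question number, model answer, official key, status) for a few rows of the table
sample = [
    (1, "B", "B", ""),
    (2, "B", "D", ""),
    (5, None, "C", ""),          # assumed unanswered, key C
    (64, "D", None, "Annulled"), # assumed model answer D, no key
]

tally = Counter(classify(answer, correct, status) for _, answer, correct, status in sample)
print(dict(tally))
# {'correct': 1, 'incorrect': 1, 'unanswered': 1, 'annulled': 1}
```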