MedicalBenchmark
Sao10K: Llama 3 8B Lunaris provider

Llama 3 8B Lunaris

#269 of 291 models · MIR 2024

Net score: 53.33 pts
Accuracy: 44.0%
Correct / Incorrect: 88 / 104
Total Cost: $0.01

Overall Performance (vs. average)

| Metric | This model | Average |
|---|---|---|
| Accuracy | 44.0% | 80.5% |
| Net score | 53.33 pts | 150.85 pts |
| Correct | 88 | 161 |
| Incorrect | 104 | 30 |
| Total Cost | $0.01 | $3.32 |
| Average response time | 4.5s | 16.4s |
| Output Tokens | 69K | 427K |
| Reasoning Tokens | 0 | 310K |
| Average confidence | 92.8% | 95.4% |
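
The headline figures above are mutually consistent under two assumptions that this page does not state: a penalty of one third of a point per incorrect answer (the usual MIR convention) and 200 scored questions (inferred from 88 / 0.44). The sketch below only checks that arithmetic; `net_score` and `accuracy` are hypothetical helper names, not part of the benchmark.

```python
from fractions import Fraction


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> float:
    """Correct answers minus a fractional penalty per incorrect answer.

    The 1/3 penalty is an assumption; it is not stated on the page.
    """
    return float(correct - penalty * incorrect)


def accuracy(correct: int, scored: int) -> float:
    """Percentage of scored questions answered correctly."""
    return 100.0 * correct / scored


# 200 scored questions is an inference (88 correct + 104 incorrect + 8 unanswered).
score = net_score(88, 104)
acc = accuracy(88, 200)
print(f"{score:.2f} pts, {acc:.1f}%")  # prints "53.33 pts, 44.0%"
```

Under these assumptions the reported 53.33 pts and 44.0% are reproduced exactly, which suggests that unanswered questions count against accuracy but carry no penalty in the net score.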

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Allergology | 3 | 0 | 0 | 100.0% | 90.5% |
| Anesthesiology and Resuscitation | 2 | 2 | 0 | 50.0% | 87.1% |
| Cardiology | 12 | 9 | 0 | 57.1% | 79.7% |
| Dermatology | 8 | 5 | 1 | 57.1% | 80.2% |
| Endocrinology and Nutrition | 7 | 11 | 1 | 36.8% | 84.2% |
| ENT | 1 | 4 | 2 | 14.3% | 74.4% |
| Epidemiology | 3 | 4 | 1 | 37.5% | 89.3% |
| Gastroenterology | 9 | 11 | 2 | 40.9% | 70.5% |
| Genetics | 4 | 3 | 0 | 57.1% | 86.5% |
| Geriatrics | 3 | 6 | 1 | 30.0% | 86.9% |
| Gynecology and Obstetrics | 4 | 10 | 0 | 28.6% | 81.2% |
| Health Planning and Management | 1 | 1 | 0 | 50.0% | 73.2% |
| Hematology | 5 | 7 | 1 | 38.5% | 81.5% |
| Immunology | 4 | 4 | 0 | 50.0% | 89.1% |
| Infectious Diseases | 10 | 12 | 1 | 43.5% | 81.8% |
| Legal Medicine and Bioethics | 1 | 1 | 0 | 50.0% | 91.7% |
| Medical Oncology | 12 | 7 | 2 | 57.1% | 80.2% |
| Nephrology | 5 | 8 | 0 | 38.5% | 80.8% |
| Neurology | 5 | 16 | 1 | 22.7% | 83.7% |
| Ophthalmology | 2 | 3 | 0 | 40.0% | 80.0% |
| Palliative Care | 2 | 2 | 0 | 50.0% | 88.2% |
| Pediatrics | 9 | 8 | 0 | 52.9% | 82.0% |
| Pharmacology | 13 | 8 | 2 | 56.5% | 85.4% |
| Psychiatry | 7 | 3 | 0 | 70.0% | 89.5% |
| Pulmonology | 10 | 8 | 1 | 52.6% | 80.6% |
| Radiology-Emergency | 6 | 7 | 1 | 42.9% | 64.9% |
| Rheumatology | 9 | 5 | 0 | 64.3% | 81.4% |
| Statistics | 1 | 2 | 0 | 33.3% | 91.1% |
| Traumatology | 6 | 9 | 0 | 40.0% | 74.5% |
| Urology | 4 | 2 | 0 | 66.7% | 78.2% |
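
The per-subject accuracies above appear to count unanswered questions in the denominator: Dermatology, for example, is 8 / (8 + 5 + 1) = 57.1%. The following is a minimal sketch of that calculation, assuming this is how the page derives the column; `category_accuracy` is a hypothetical helper name.

```python
def category_accuracy(correct: int, incorrect: int, unanswered: int) -> float:
    """Accuracy in percent, with unanswered questions counted as not correct.

    Returns 0.0 for an empty category to avoid dividing by zero.
    """
    total = correct + incorrect + unanswered
    if total == 0:
        return 0.0
    return round(100.0 * correct / total, 1)


# Counts copied from the Subject Breakdown rows.
print(category_accuracy(8, 5, 1))   # Dermatology
print(category_accuracy(1, 4, 2))   # ENT
print(category_accuracy(5, 16, 1))  # Neurology
```

The three calls return 57.1, 14.3 and 22.7, matching the Accuracy values listed for those subjects.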

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Anatomy | 2 | 4 | 0 | 33.3% | 79.8% |
| Biostatistics | 2 | 2 | 1 | 40.0% | 90.7% |
| Diagnosis | 32 | 39 | 2 | 43.8% | 79.2% |
| Epidemiology | 3 | 7 | 2 | 25.0% | 81.2% |
| Ethics | 0 | 1 | 0 | 0.0% | 94.5% |
| Interpretation | 10 | 26 | 1 | 27.0% | 69.6% |
| Pathophysiology | 11 | 20 | 2 | 33.3% | 85.4% |
| Pharmacology | 13 | 11 | 1 | 52.0% | 84.0% |
| Prevention | 10 | 2 | 0 | 83.3% | 89.8% |
| Prognosis | 3 | 4 | 0 | 42.9% | 83.9% |
| Risk | 6 | 6 | 1 | 46.2% | 83.6% |
| Tests | 10 | 11 | 0 | 47.6% | 73.9% |
| Treatment | 38 | 31 | 2 | 53.5% | 81.3% |
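
| # | Answer | Correct | Status |
|---|---|---|---|
| 1 | A | B | |
| 2 | A | D | |
| 3 | B | B | |
| 4 | A | C | |
| 5 | B | C | |
| 6 | D | B | |
| 7 | | D | |
| 8 | C | C | |
| 9 | B | A | |
| 10 | D | D | |
| 11 | D | D | |
| 12 | A | A | |
| 13 | A | C | |
| 14 | B | A | |
| 15 | A | B | |
| 16 | D | A | |
| 17 | A | C | |
| 18 | D | A | |
| 19 | B | B | |
| 20 | D | C | |
| 21 | B | D | |
| 22 | A | B | |
| 23 | A | A | |
| 24 | C | A | |
| 25 | C | C | |
| 26 | | B | |
| 27 | A | C | |
| 28 | D | A | |
| 29 | A | B | |
| 30 | A | C | |
| 31 | B | D | |
| 32 | C | A | |
| 33 | A | C | |
| 34 | C | B | |
| 35 | D | D | |
| 36 | B | D | |
| 37 | C | A | |
| 38 | B | A | |
| 39 | C | C | |
| 40 | B | B | |
| 41 | | C | |
| 42 | D | D | |
| 43 | A | A | |
| 44 | A | D | |
| 45 | A | D | |
| 46 | B | B | |
| 47 | | C | |
| 48 | C | C | |
| 49 | B | B | |
| 50 | C | C | |
| 51 | D | A | |
| 52 | D | D | |
| 53 | A | C | |
| 54 | B | B | |
| 55 | | C | |
| 56 | D | D | |
| 57 | A | A | |
| 58 | D | A | |
| 59 | A | A | |
| 60 | D | A | |
| 61 | C | A | |
| 62 | C | D | |
| 63 | C | D | |
| 64 | C | | Annulled |
| 65 | D | D | |
| 66 | D | C | |
| 67 | C | B | |
| 68 | B | | Annulled |
| 69 | B | A | |
| 70 | C | B | |
| 71 | A | B | |
| 72 | C | D | |
| 73 | C | B | |
| 74 | C | C | |
| 75 | B | B | |
| 76 | D | A | |
| 77 | C | D | |
| 78 | A | C | |
| 79 | B | B | |
| 80 | A | A | |
| 81 | C | C | |
| 82 | C | C | |
| 83 | A | B | |
| 84 | C | C | |
| 85 | A | A | |
| 86 | C | A | |
| 87 | D | B | |
| 88 | D | D | |
| 89 | B | B | |
| 90 | A | A | |
| 91 | D | D | |
| 92 | C | A | |
| 93 | C | C | |
| 94 | D | B | |
| 95 | B | D | |
| 96 | B | B | |
| 97 | D | B | |
| 98 | C | B | |
| 99 | A | A | |
| 100 | B | B | |
| 101 | C | A | |
| 102 | D | D | |
| 103 | B | B | |
| 104 | C | D | |
| 105 | D | B | |
| 106 | B | C | |
| 107 | C | C | |
| 108 | D | B | |
| 109 | A | D | |
| 110 | D | D | |
| 111 | C | B | |
| 112 | C | C | |
| 113 | A | | Annulled |
| 114 | D | D | |
| 115 | D | D | |
| 116 | A | A | |
| 117 | D | D | |
| 118 | D | D | |
| 119 | C | A | |
| 120 | C | C | |
| 121 | A | A | |
| 122 | A | B | |
| 123 | C | D | |
| 124 | C | D | |
| 125 | C | B | |
| 126 | B | D | |
| 127 | A | A | |
| 128 | C | B | |
| 129 | D | D | |
| 130 | | C | |
| 131 | C | C | |
| 132 | C | D | |
| 133 | C | A | |
| 134 | B | C | |
| 135 | B | A | |
| 136 | D | D | |
| 137 | D | A | |
| 138 | C | C | |
| 139 | A | A | |
| 140 | B | C | |
| 141 | B | B | |
| 142 | A | C | |
| 143 | B | A | |
| 144 | B | D | |
| 145 | A | C | |
| 146 | B | C | |
| 147 | C | C | |
| 148 | A | A | |
| 149 | A | C | |
| 150 | D | D | |
| 151 | A | A | |
| 152 | D | A | |
| 153 | A | C | |
| 154 | A | B | |
| 155 | B | D | |
| 156 | C | C | |
| 157 | D | C | |
| 158 | A | D | |
| 159 | D | D | |
| 160 | A | B | |
| 161 | B | B | |
| 162 | A | B | |
| 163 | B | B | |
| 164 | A | B | |
| 165 | C | A | |
| 166 | C | C | |
| 167 | A | A | |
| 168 | C | B | |
| 169 | C | C | |
| 170 | A | A | |
| 171 | C | D | |
| 172 | B | B | |
| 173 | A | A | |
| 174 | B | B | |
| 175 | A | A | |
| 176 | B | C | |
| 177 | C | C | |
| 178 | B | B | |
| 179 | A | C | |
| 180 | D | | Annulled |
| 181 | | B | |
| 182 | B | D | |
| 183 | C | C | |
| 184 | C | A | |
| 185 | C | C | |
| 186 | D | D | |
| 187 | C | A | |
| 188 | | C | |
| 189 | A | D | |
| 190 | D | D | |
| 191 | B | B | |
| 192 | B | B | |
| 193 | D | C | |
| 194 | D | C | |
| 195 | D | C | |
| 196 | B | B | |
| 197 | C | A | |
| 198 | B | B | |
| 199 | C | D | |
| 200 | A | A | |
| 201 | A | B | |
| 202 | D | D | |
| 203 | B | B | |
| 204 | D | D | |
| 205 | A | D | |
| 206 | | | Annulled |
| 207 | A | A | |
| 208 | B | A | |
| 209 | C | B | |
| 210 | D | D | |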
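
The per-question rows above can be tallied into the same categories the summary reports. The sketch below assumes each row is (number, model answer, correct answer, status), that a row showing only one letter is an unanswered question whose letter is the correct option, and that annulled questions are excluded from scoring. None of this is stated on the page, and `classify` is a hypothetical helper.

```python
from collections import Counter

# Sample rows taken from the list above: (number, answer, correct, status).
# An empty answer is read as "unanswered" (an assumption).
rows = [
    (1, "A", "B", ""),
    (3, "B", "B", ""),
    (7, "", "D", ""),
    (206, "", "", "Annulled"),
]


def classify(answer: str, correct: str, status: str) -> str:
    """Map one row to annulled, unanswered, correct or incorrect."""
    if status == "Annulled":
        return "annulled"
    if not answer:
        return "unanswered"
    return "correct" if answer == correct else "incorrect"


tally = Counter(classify(a, c, s) for _, a, c, s in rows)
print(dict(tally))
```

Checking the annulled status first keeps question 206, which shows neither an answer nor a correct option, out of the unanswered count.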