MedicalBenchmark
EleutherAI: Llemma 7b provider

Llemma 7b

#291 of 291 models, MIR 2024

Net score: 0.00 pts
Accuracy: 3.5%
Correct / Incorrect: 7 / 35
Total Cost: $0.63

Overall Performance (vs. average)

| Metric | Llemma 7b | Average |
|---|---|---|
| Accuracy | 3.5% | 80.5% |
| Net score | 0.00 pts | 150.85 pts |
| Correct | 7 | 161 |
| Incorrect | 35 | 30 |
| Total Cost | $0.63 | $3.32 |
| Average response time | 66.5s | 16.4s |
| Output Tokens | 444K | 427K |
| Reasoning Tokens | 0 | 310K |
| Average confidence | 22.7% | 95.4% |

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Allergology | 0 | 1 | 2 | 0.0% | 90.5% |
| Anesthesiology and Resuscitation | 0 | 0 | 4 | 0.0% | 87.1% |
| Cardiology | 1 | 6 | 14 | 4.8% | 79.7% |
| Dermatology | 1 | 1 | 12 | 7.1% | 80.2% |
| Endocrinology and Nutrition | 1 | 2 | 16 | 5.3% | 84.2% |
| ENT | 0 | 2 | 5 | 0.0% | 74.4% |
| Epidemiology | 1 | 1 | 6 | 12.5% | 89.3% |
| Gastroenterology | 1 | 4 | 17 | 4.5% | 70.5% |
| Genetics | 0 | 1 | 6 | 0.0% | 86.5% |
| Geriatrics | 0 | 0 | 10 | 0.0% | 86.9% |
| Gynecology and Obstetrics | 0 | 2 | 12 | 0.0% | 81.2% |
| Health Planning and Management | 0 | 1 | 1 | 0.0% | 73.2% |
| Hematology | 0 | 2 | 11 | 0.0% | 81.5% |
| Immunology | 0 | 1 | 7 | 0.0% | 89.1% |
| Infectious Diseases | 0 | 5 | 18 | 0.0% | 81.8% |
| Legal Medicine and Bioethics | 0 | 1 | 1 | 0.0% | 91.7% |
| Medical Oncology | 1 | 5 | 15 | 4.8% | 80.2% |
| Nephrology | 0 | 1 | 12 | 0.0% | 80.8% |
| Neurology | 2 | 3 | 17 | 9.1% | 83.7% |
| Ophthalmology | 2 | 0 | 3 | 40.0% | 80.0% |
| Palliative Care | 0 | 1 | 3 | 0.0% | 88.2% |
| Pediatrics | 0 | 4 | 13 | 0.0% | 82.0% |
| Pharmacology | 1 | 3 | 19 | 4.3% | 85.4% |
| Psychiatry | 0 | 2 | 8 | 0.0% | 89.5% |
| Pulmonology | 1 | 1 | 17 | 5.3% | 80.6% |
| Radiology-Emergency | 0 | 1 | 13 | 0.0% | 64.9% |
| Rheumatology | 0 | 2 | 12 | 0.0% | 81.4% |
| Statistics | 0 | 1 | 2 | 0.0% | 91.1% |
| Traumatology | 0 | 5 | 10 | 0.0% | 74.5% |
| Urology | 0 | 1 | 5 | 0.0% | 78.2% |
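
The Accuracy figures in the subject breakdown appear to equal Correct divided by all questions in the category (Correct + Incorrect + Unanswered), rounded to one decimal place. The sketch below checks that relationship against three of the listed subjects. It is an inference from the published numbers, not the benchmark's documented formula, and `accuracy_pct` is a hypothetical helper name.

```python
def accuracy_pct(correct: int, incorrect: int, unanswered: int) -> float:
    """Percentage of correct answers over all questions in a category.

    Assumption: unanswered questions count in the denominator, which is
    what the published per-subject figures suggest.
    """
    total = correct + incorrect + unanswered
    if total == 0:
        return 0.0
    return round(100 * correct / total, 1)


# (correct, incorrect, unanswered) counts copied from the subject breakdown
rows = {
    "Cardiology": (1, 6, 14),
    "Epidemiology": (1, 1, 6),
    "Ophthalmology": (2, 0, 3),
}

for subject, counts in rows.items():
    print(f"{subject}: {accuracy_pct(*counts)}%")
```

Under this assumption the three rows reproduce the listed values of 4.8%, 12.5% and 40.0%.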

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Anatomy | 0 | 1 | 5 | 0.0% | 79.8% |
| Biostatistics | 0 | 1 | 4 | 0.0% | 90.7% |
| Diagnosis | 3 | 8 | 62 | 4.1% | 79.2% |
| Epidemiology | 1 | 2 | 9 | 8.3% | 81.2% |
| Ethics | 0 | 1 | 0 | 0.0% | 94.5% |
| Interpretation | 1 | 3 | 33 | 2.7% | 69.6% |
| Pathophysiology | 0 | 9 | 24 | 0.0% | 85.4% |
| Pharmacology | 1 | 2 | 22 | 4.0% | 84.0% |
| Prevention | 1 | 2 | 9 | 8.3% | 89.8% |
| Prognosis | 1 | 0 | 6 | 14.3% | 83.9% |
| Risk | 0 | 1 | 12 | 0.0% | 83.6% |
| Tests | 0 | 4 | 17 | 0.0% | 73.9% |
| Treatment | 4 | 14 | 53 | 5.6% | 81.3% |
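
| # | Answer | Correct |
|---|---|---|
| 1 |  | B |
| 2 |  | D |
| 3 |  | B |
| 4 |  | C |
| 5 |  | C |
| 6 |  | B |
| 7 |  | D |
| 8 |  | C |
| 9 |  | A |
| 10 |  | D |
| 11 |  | D |
| 12 | D | A |
| 13 |  | C |
| 14 |  | A |
| 15 |  | B |
| 16 |  | A |
| 17 | D | C |
| 18 |  | A |
| 19 |  | B |
| 20 |  | C |
| 21 |  | D |
| 22 |  | B |
| 23 | A | A |
| 24 |  | A |
| 25 |  | C |
| 26 | C | B |
| 27 |  | C |
| 28 |  | A |
| 29 | D | B |
| 30 |  | C |
| 31 | A | D |
| 32 | B | A |
| 33 |  | C |
| 34 |  | B |
| 35 |  | D |
| 36 |  | D |
| 37 |  | A |
| 38 |  | A |
| 39 |  | C |
| 40 |  | B |
| 41 |  | C |
| 42 |  | D |
| 43 |  | A |
| 44 |  | D |
| 45 | C | D |
| 46 |  | B |
| 47 |  | C |
| 48 |  | C |
| 49 |  | B |
| 50 |  | C |
| 51 | A | A |
| 52 | B | D |
| 53 |  | C |
| 54 |  | B |
| 55 | A | C |
| 56 |  | D |
| 57 |  | A |
| 58 | A | A |
| 59 |  | A |
| 60 |  | A |
| 61 |  | A |
| 62 |  | D |
| 63 | A | D |
| 64 |  | Annulled |
| 65 |  | D |
| 66 |  | C |
| 67 |  | B |
| 68 |  | Annulled |
| 69 |  | A |
| 70 |  | B |
| 71 |  | B |
| 72 | B | D |
| 73 | D | B |
| 74 |  | C |
| 75 |  | B |
| 76 |  | A |
| 77 |  | D |
| 78 |  | C |
| 79 |  | B |
| 80 |  | A |
| 81 |  | C |
| 82 |  | C |
| 83 |  | B |
| 84 | D | C |
| 85 |  | A |
| 86 |  | A |
| 87 |  | B |
| 88 |  | D |
| 89 | C | B |
| 90 |  | A |
| 91 | A | D |
| 92 |  | A |
| 93 | B | C |
| 94 |  | B |
| 95 | D | D |
| 96 |  | B |
| 97 |  | B |
| 98 | C | B |
| 99 |  | A |
| 100 | C | B |
| 101 |  | A |
| 102 |  | D |
| 103 |  | B |
| 104 | C | D |
| 105 | A | B |
| 106 |  | C |
| 107 |  | C |
| 108 |  | B |
| 109 |  | D |
| 110 |  | D |
| 111 |  | B |
| 112 |  | C |
| 113 | A | Annulled |
| 114 |  | D |
| 115 | A | D |
| 116 | C | A |
| 117 |  | D |
| 118 |  | D |
| 119 |  | A |
| 120 |  | C |
| 121 |  | A |
| 122 |  | B |
| 123 |  | D |
| 124 |  | D |
| 125 |  | B |
| 126 |  | D |
| 127 | A | A |
| 128 |  | B |
| 129 | B | D |
| 130 |  | C |
| 131 |  | C |
| 132 |  | D |
| 133 | C | A |
| 134 | A | C |
| 135 | C | A |
| 136 |  | D |
| 137 |  | A |
| 138 |  | C |
| 139 | A | A |
| 140 |  | C |
| 141 |  | B |
| 142 |  | C |
| 143 |  | A |
| 144 |  | D |
| 145 |  | C |
| 146 |  | C |
| 147 | A | C |
| 148 |  | A |
| 149 |  | C |
| 150 |  | D |
| 151 |  | A |
| 152 |  | A |
| 153 |  | C |
| 154 |  | B |
| 155 | B | D |
| 156 |  | C |
| 157 |  | C |
| 158 |  | D |
| 159 |  | D |
| 160 |  | B |
| 161 |  | B |
| 162 |  | B |
| 163 | A | B |
| 164 | D | B |
| 165 |  | A |
| 166 |  | C |
| 167 |  | A |
| 168 |  | B |
| 169 |  | C |
| 170 |  | A |
| 171 |  | D |
| 172 | A | B |
| 173 |  | A |
| 174 |  | B |
| 175 | A | A |
| 176 |  | C |
| 177 |  | C |
| 178 |  | B |
| 179 |  | C |
| 180 |  | Annulled |
| 181 |  | B |
| 182 | A | D |
| 183 |  | C |
| 184 |  | A |
| 185 |  | C |
| 186 |  | D |
| 187 | B | A |
| 188 |  | C |
| 189 | C | D |
| 190 |  | D |
| 191 |  | B |
| 192 |  | B |
| 193 |  | C |
| 194 |  | C |
| 195 |  | C |
| 196 |  | B |
| 197 |  | A |
| 198 | A | B |
| 199 |  | D |
| 200 |  | A |
| 201 |  | B |
| 202 |  | D |
| 203 |  | B |
| 204 |  | D |
| 205 |  | D |
| 206 |  | Annulled |
| 207 |  | A |
| 208 |  | A |
| 209 |  | B |
| 210 | A | D |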