MedicalBenchmark
Provider: EleutherAI

Llemma 7b

#316 of 319 models, MIR 2025

Net score: 3.33 pts
Accuracy: 8.5%
Correct / Incorrect: 17 / 41
Total Cost: $0.64

Overall Performance

(vs. average)

| Metric | Llemma 7b | Average |
|---|---|---|
| Accuracy | 8.5% | 77.9% |
| Net score | 3.33 pts | 143.96 pts |
| Correct | 17 | 156 |
| Incorrect | 41 | 35 |
| Total Cost | $0.64 | $3.36 |
| Average response time | 65.0s | 19.0s |
| Output Tokens | 446K | 430K |
| Reasoning Tokens | 0 | 306K |
| Average confidence | 29.0% | 95.2% |
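The net score and accuracy figures above can be reproduced from the correct and incorrect counts. The following is a minimal sketch, not the site's actual scoring code: it assumes a penalty of one third of a point per incorrect answer and 200 scored questions, neither of which is stated on this page, although both are consistent with the reported 3.33 pts and 8.5%. The function names are hypothetical.

```python
from fractions import Fraction


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> float:
    """Net score under an ASSUMED rule: +1 per correct, -penalty per incorrect.

    Unanswered questions are assumed to contribute nothing.
    """
    return float(correct - incorrect * penalty)


def accuracy(correct: int, scored_questions: int) -> float:
    """Accuracy as a percentage of all scored questions (ASSUMED to be 200)."""
    return 100.0 * correct / scored_questions


# Counts taken from the summary above: 17 correct, 41 incorrect.
print(round(net_score(17, 41), 2))  # 3.33
print(round(accuracy(17, 200), 1))  # 8.5
```

Under these assumptions, each incorrect answer cancels a third of a correct one, which is why 41 incorrect answers reduce 17 correct answers to a net score of about 3.33.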

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Allergology | 0 | 1 | 3 | 0.0% | 87.9% |
| Anesthesiology and Resuscitation | 2 | 1 | 3 | 33.3% | 82.3% |
| Cardiology | 0 | 3 | 19 | 0.0% | 78.6% |
| Dermatology | 2 | 4 | 6 | 16.7% | 69.4% |
| Endocrinology and Nutrition | 1 | 3 | 12 | 6.3% | 83.5% |
| ENT | 1 | 1 | 6 | 12.5% | 74.8% |
| Epidemiology | 1 | 1 | 5 | 14.3% | 69.1% |
| Gastroenterology | 1 | 4 | 16 | 4.8% | 74.1% |
| Genetics | 0 | 1 | 5 | 0.0% | 69.5% |
| Geriatrics | 0 | 3 | 8 | 0.0% | 77.5% |
| Gynecology and Obstetrics | 1 | 4 | 14 | 5.3% | 86.7% |
| Health Planning and Management | 0 | 0 | 2 | 0.0% | 82.6% |
| Hematology | 2 | 3 | 6 | 18.2% | 82.7% |
| Immunology | 2 | 1 | 6 | 22.2% | 83.3% |
| Infectious Diseases | 5 | 4 | 18 | 18.5% | 74.9% |
| Legal Medicine and Bioethics | 2 | 0 | 3 | 40.0% | 68.4% |
| Medical Oncology | 4 | 6 | 15 | 16.0% | 87.2% |
| Nephrology | 0 | 3 | 11 | 0.0% | 84.8% |
| Neurology | 0 | 4 | 16 | 0.0% | 77.3% |
| Ophthalmology | 0 | 3 | 2 | 0.0% | 74.2% |
| Palliative Care | 1 | 0 | 3 | 25.0% | 78.6% |
| Pediatrics | 2 | 3 | 21 | 7.7% | 71.9% |
| Pharmacology | 2 | 5 | 10 | 11.8% | 74.1% |
| Psychiatry | 1 | 1 | 6 | 12.5% | 83.0% |
| Pulmonology | 0 | 4 | 10 | 0.0% | 80.4% |
| Radiology-Emergency | 0 | 3 | 11 | 0.0% | 69.4% |
| Rheumatology | 1 | 5 | 9 | 6.7% | 76.6% |
| Statistics | 0 | 1 | 2 | 0.0% | 76.6% |
| Traumatology | 2 | 4 | 12 | 11.1% | 79.3% |
| Urology | 0 | 3 | 4 | 0.0% | 80.7% |

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Anatomy | 0 | 1 | 6 | 0.0% | 78.6% |
| Biostatistics | 0 | 1 | 3 | 0.0% | 79.8% |
| Diagnosis | 7 | 18 | 63 | 8.0% | 79.9% |
| Epidemiology | 1 | 0 | 4 | 20.0% | 76.7% |
| Ethics | 0 | 0 | 3 | 0.0% | 74.1% |
| Interpretation | 2 | 7 | 33 | 4.8% | 70.7% |
| Legal | 2 | 0 | 2 | 50.0% | 64.6% |
| Pathophysiology | 2 | 6 | 19 | 7.4% | 76.1% |
| Pharmacology | 1 | 5 | 7 | 7.7% | 83.3% |
| Prevention | 2 | 2 | 8 | 16.7% | 75.6% |
| Prognosis | 1 | 1 | 5 | 14.3% | 80.8% |
| Risk | 0 | 1 | 4 | 0.0% | 85.2% |
| Tests | 1 | 5 | 21 | 3.7% | 77.9% |
| Treatment | 6 | 19 | 56 | 7.4% | 77.3% |
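
| # | Answer | Correct | Status |
|---|---|---|---|
| 1 | D | B | |
| 2 | D | A | |
| 3 | | C | |
| 4 | | B | |
| 5 | | A | |
| 6 | | C | |
| 7 | | C | |
| 8 | A | A | |
| 9 | | A | |
| 10 | A | D | |
| 11 | | D | |
| 12 | | D | |
| 13 | | B | |
| 14 | | D | |
| 15 | | | Annulled |
| 16 | | B | |
| 17 | | B | |
| 18 | | A | |
| 19 | D | C | |
| 20 | | A | |
| 21 | | B | |
| 22 | | D | |
| 23 | | C | |
| 24 | | D | |
| 25 | | C | |
| 26 | | | Annulled |
| 27 | | C | |
| 28 | A | | Annulled |
| 29 | | D | |
| 30 | D | B | |
| 31 | | D | |
| 32 | | A | |
| 33 | | D | |
| 34 | | D | |
| 35 | B | B | |
| 36 | | D | |
| 37 | | C | |
| 38 | | C | |
| 39 | | D | |
| 40 | | A | |
| 41 | | D | |
| 42 | | C | |
| 43 | | B | |
| 44 | | D | |
| 45 | B | D | |
| 46 | B | A | |
| 47 | | A | |
| 48 | A | A | |
| 49 | | D | |
| 50 | | B | |
| 51 | | C | |
| 52 | A | B | |
| 53 | C | D | |
| 54 | | B | |
| 55 | A | A | |
| 56 | | | Annulled |
| 57 | | C | |
| 58 | | B | |
| 59 | A | D | |
| 60 | A | A | |
| 61 | D | A | |
| 62 | | D | |
| 63 | | B | |
| 64 | | D | |
| 65 | | A | |
| 66 | | A | |
| 67 | | B | |
| 68 | | B | |
| 69 | | B | |
| 70 | | A | |
| 71 | | D | |
| 72 | A | A | |
| 73 | | D | |
| 74 | | C | |
| 75 | | A | |
| 76 | | B | |
| 77 | A | B | |
| 78 | | B | |
| 79 | | C | |
| 80 | A | C | |
| 81 | | C | |
| 82 | | D | |
| 83 | | B | |
| 84 | | D | |
| 85 | | C | |
| 86 | | C | |
| 87 | | A | |
| 88 | | D | |
| 89 | | B | |
| 90 | | A | |
| 91 | | B | |
| 92 | A | C | |
| 93 | D | B | |
| 94 | A | C | |
| 95 | A | A | |
| 96 | | C | |
| 97 | | D | |
| 98 | | C | |
| 99 | | A | |
| 100 | | C | |
| 101 | | B | |
| 102 | | D | |
| 103 | C | A | |
| 104 | | C | |
| 105 | | A | |
| 106 | | C | |
| 107 | | B | |
| 108 | | D | |
| 109 | | B | |
| 110 | D | C | |
| 111 | D | A | |
| 112 | D | C | |
| 113 | | B | |
| 114 | | D | |
| 115 | | D | |
| 116 | | C | |
| 117 | | A | |
| 118 | | D | |
| 119 | | C | |
| 120 | | B | |
| 121 | | D | |
| 122 | | C | |
| 123 | | C | |
| 124 | | C | |
| 125 | | D | |
| 126 | A | D | |
| 127 | | B | |
| 128 | D | D | |
| 129 | | A | |
| 130 | A | D | |
| 131 | | D | |
| 132 | | A | |
| 133 | | B | |
| 134 | | C | |
| 135 | D | B | |
| 136 | | C | |
| 137 | | A | |
| 138 | | D | |
| 139 | | D | |
| 140 | A | B | |
| 141 | A | A | |
| 142 | A | A | |
| 143 | A | B | |
| 144 | | B | |
| 145 | B | D | |
| 146 | | C | |
| 147 | A | B | |
| 148 | | A | |
| 149 | | A | |
| 150 | | D | |
| 151 | | A | |
| 152 | D | A | |
| 153 | D | B | |
| 154 | | B | |
| 155 | | B | |
| 156 | C | C | |
| 157 | | A | |
| 158 | | C | |
| 159 | A | C | |
| 160 | | A | |
| 161 | A | A | |
| 162 | C | | Annulled |
| 163 | | D | |
| 164 | B | C | |
| 165 | D | A | |
| 166 | | B | |
| 167 | | C | |
| 168 | | D | |
| 169 | | B | |
| 170 | | B | |
| 171 | B | C | |
| 172 | A | A | |
| 173 | | A | |
| 174 | | B | |
| 175 | | B | |
| 176 | | C | |
| 177 | | C | |
| 178 | | A | |
| 179 | | D | |
| 180 | A | A | |
| 181 | B | B | |
| 182 | | C | |
| 183 | A | B | |
| 184 | | B | |
| 185 | B | B | |
| 186 | | | Annulled |
| 187 | D | C | |
| 188 | A | D | |
| 189 | | D | |
| 190 | | A | |
| 191 | | B | |
| 192 | B | A | |
| 193 | | C | |
| 194 | A | A | |
| 195 | | A | |
| 196 | | A | |
| 197 | A | B | |
| 198 | B | C | |
| 199 | B | D | |
| 200 | | C | |
| 201 | | B | |
| 202 | C | A | |
| 203 | | D | |
| 204 | | C | |
| 205 | | B | |
| 206 | | D | |
| 207 | D | A | |
| 208 | A | C | |
| 209 | | C | |
| 210 | A | B | |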