MedicalBenchmark
EleutherAI: Llemma 7b provider

Llemma 7b

#288 of 290 models

MIR 2025

Net score: 3.33 pts
Accuracy: 8.5%
Correct / Incorrect: 17 / 41
Total Cost: $0.63

Overall Performance (vs. average)

Metric                 Llemma 7b   Average
Accuracy               8.5%        75.9%
Net score              3.33 pts    138.99 pts
Correct                17          152
Incorrect              41          38
Total Cost             $0.63       $3.59
Average response time  63.6s       18.1s
Output Tokens          437K        443K
Reasoning Tokens       0           320K
Average confidence     28.6%       94.7%
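
The reported net score and accuracy are consistent with a rule of +1 per correct answer and -1/3 per incorrect answer over 200 scored questions. Neither the penalty nor the question count is stated on this page; both are assumptions that happen to reproduce the figures above, and the function names are illustrative. A minimal Python sketch under those assumptions:

```python
from fractions import Fraction


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> float:
    """Net score, ASSUMING +1 per correct and -penalty per incorrect answer."""
    return float(correct - penalty * incorrect)


def accuracy(correct: int, total_questions: int) -> float:
    """Accuracy in percent over all scored questions (answered or not)."""
    return 100 * correct / total_questions


# 17 correct and 41 incorrect are taken from the page;
# 200 scored questions is an assumption.
score = net_score(17, 41)
acc = accuracy(17, 200)
print(f"{score:.2f} pts, {acc:.1f}%")  # prints "3.33 pts, 8.5%"
```

Unanswered questions contribute nothing under this rule, which would explain how a model with far more incorrect than correct answers still ends up with a small positive score.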

Subject Breakdown

Subject                           Correct  Incorrect  Unanswered  Accuracy  Average
Allergology                       0        1          3           0.0%      86.9%
Anesthesiology and Resuscitation  2        1          3           33.3%     81.3%
Cardiology                        0        3          19          0.0%      77.4%
Dermatology                       2        5          6           15.4%     62.8%
Endocrinology and Nutrition       1        3          12          6.3%      82.5%
ENT                               1        1          6           12.5%     73.8%
Epidemiology                      1        1          5           14.3%     67.1%
Gastroenterology                  1        4          16          4.8%      72.9%
Genetics                          0        1          5           0.0%      68.2%
Geriatrics                        0        3          8           0.0%      71.2%
Gynecology and Obstetrics         1        4          14          5.3%      85.9%
Health Planning and Management    0        0          2           0.0%      81.6%
Hematology                        2        2          7           18.2%     81.8%
Immunology                        2        1          6           22.2%     82.5%
Infectious Diseases               5        5          18          17.9%     71.1%
Legal Medicine and Bioethics      2        0          3           40.0%     67.2%
Medical Oncology                  4        6          15          16.0%     86.3%
Nephrology                        0        3          12          0.0%      78.2%
Neurology                         0        4          16          0.0%      76.2%
Ophthalmology                     0        3          2           0.0%      72.6%
Palliative Care                   1        0          3           25.0%     77.2%
Pediatrics                        2        3          20          8.0%      72.7%
Pharmacology                      2        5          10          11.8%     73.1%
Psychiatry                        1        1          6           12.5%     82.0%
Pulmonology                       0        4          10          0.0%      73.0%
Radiology-Emergency               0        3          11          0.0%      67.9%
Rheumatology                      1        4          9           7.1%      74.6%
Statistics                        0        1          2           0.0%      74.9%
Traumatology                      2        4          12          11.1%     78.2%
Urology                           0        3          4           0.0%      79.5%
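
The per-subject Accuracy values appear to divide correct answers by all questions in the subject, unanswered ones included (for example, 2 of 13 Dermatology questions gives 15.4%). This is inferred from the figures rather than stated on the page, and `subject_accuracy` is a hypothetical helper. A short Python sketch of that reading:

```python
def subject_accuracy(correct: int, incorrect: int, unanswered: int) -> float:
    """Percent correct over ALL questions in a subject, unanswered included."""
    total = correct + incorrect + unanswered
    return 100 * correct / total if total else 0.0


# (correct, incorrect, unanswered) copied from three rows of the breakdown.
rows = {
    "Anesthesiology and Resuscitation": (2, 1, 3),
    "Dermatology": (2, 5, 6),
    "Cardiology": (0, 3, 19),
}

for name, counts in rows.items():
    print(f"{name}: {subject_accuracy(*counts):.1f}%")
```

Dividing by answered questions only would give different values (2 of 7 for Dermatology, about 28.6%), so the listed percentages match the all-questions denominator.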

Question Type Breakdown

Question type    Correct  Incorrect  Unanswered  Accuracy  Average
Anatomy          0        1          6           0.0%      77.1%
Biostatistics    0        1          3           0.0%      78.4%
Diagnosis        7        17         65          7.9%      77.9%
Epidemiology     1        0          4           20.0%     75.0%
Ethics           0        0          3           0.0%      72.0%
Interpretation   2        7          33          4.8%      69.3%
Legal            2        0          2           50.0%     63.6%
Pathophysiology  2        6          19          7.4%      72.6%
Pharmacology     1        5          7           7.7%      82.4%
Prevention       2        2          8           16.7%     74.5%
Prognosis        1        1          4           16.7%     77.8%
Risk             0        1          4           0.0%      84.3%
Tests            1        4          21          3.8%      76.3%
Treatment        6        20         56          7.3%      75.2%
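
#    Answer  Correct  Status
1    D       B
2    D       A
3    -       C
4    -       B
5    -       A
6    -       C
7    -       C
8    A       A
9    -       A
10   A       D
11   -       D
12   -       D
13   -       B
14   -       D
15   -       -
16   -       B
17   -       B
18   -       A
19   D       C
20   -       A
21   -       B
22   -       D
23   -       C
24   -       D
25   -       C
26   -       -        Annulled
27   -       C
28   -       -        Annulled
29   -       D
30   D       B
31   -       D
32   -       A
33   -       D
34   -       D
35   B       B
36   -       D
37   -       C
38   -       C
39   -       D
40   -       A
41   -       D
42   -       C
43   -       B
44   -       D
45   B       D
46   B       A
47   -       A
48   A       A
49   -       D
50   -       B
51   -       C
52   A       B
53   C       D
54   -       B
55   A       A
56   -       -        Annulled
57   -       C
58   -       B
59   A       D
60   A       A
61   D       A
62   -       D
63   -       B
64   -       D
65   -       A
66   -       A
67   -       B
68   -       B
69   -       B
70   -       A
71   -       D
72   A       A
73   -       D
74   -       C
75   -       A
76   -       B
77   A       B
78   -       B
79   -       C
80   A       C
81   -       C
82   -       D
83   -       B
84   -       D
85   -       C
86   -       C
87   -       A
88   -       D
89   -       B
90   -       A
91   -       B
92   A       C
93   D       B
94   A       C
95   A       A
96   -       C
97   -       D
98   -       C
99   -       A
100  -       C
101  -       B
102  -       D
103  C       A
104  -       C
105  -       A
106  -       C
107  -       B
108  -       D
109  -       B
110  D       C
111  D       A
112  D       C
113  -       B
114  -       D
115  -       D
116  -       C
117  -       A
118  -       D
119  -       C
120  -       B
121  -       D
122  -       C
123  -       C
124  -       C
125  -       D
126  A       D
127  -       B
128  D       D
129  -       A
130  A       D
131  -       D
132  -       A
133  -       B
134  -       C
135  D       B
136  -       C
137  -       A
138  -       D
139  -       D
140  A       B
141  A       A
142  A       A
143  A       B
144  -       B
145  B       D
146  -       C
147  A       B
148  -       A
149  -       A
150  -       A
151  -       A
152  D       A
153  D       B
154  -       B
155  -       B
156  C       C
157  -       A
158  -       C
159  A       C
160  -       A
161  A       A
162  -       C
163  -       D
164  B       C
165  -       A
166  -       B
167  -       C
168  -       D
169  -       B
170  -       B
171  B       C
172  A       A
173  -       A
174  -       B
175  -       B
176  -       C
177  -       C
178  -       A
179  -       D
180  A       A
181  B       B
182  -       C
183  A       B
184  -       B
185  B       B
186  -       -        Annulled
187  D       C
188  A       D
189  -       D
190  -       A
191  -       B
192  B       A
193  -       C
194  A       A
195  -       A
196  -       A
197  A       B
198  B       C
199  B       D
200  -       C
201  -       B
202  C       A
203  -       D
204  -       C
205  -       B
206  -       D
207  D       A
208  A       C
209  -       C
210  A       B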
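
One way to tally the per-question rows is sketched below. It assumes that a row listing only one letter gives the correct answer for a question the model left unanswered, and that a row listing two letters gives the model's answer followed by the correct one. The page does not spell this out, and `classify` is a hypothetical helper. The sample rows are questions 1, 3, 8 and 26 from the table above.

```python
from collections import Counter


def classify(answer: str, correct: str, status: str = "") -> str:
    """Label one row; an empty answer is ASSUMED to mean 'unanswered'."""
    if status == "Annulled":
        return "annulled"
    if not answer:
        return "unanswered"
    return "correct" if answer == correct else "incorrect"


# (question number, model answer, correct answer, status)
sample = [
    (1, "D", "B", ""),
    (3, "", "C", ""),
    (8, "A", "A", ""),
    (26, "", "", "Annulled"),
]

tally = Counter(classify(a, c, s) for _, a, c, s in sample)
print(dict(tally))
```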