MedicalBenchmark
EleutherAI: Llemma 7b provider

Llemma 7b

Rank: #320 of 320 models
MIR 2024

Net score: 0.00 pts
Accuracy: 3.5%
Correct / Incorrect: 7 / 37
Total Cost: $0.65

Overall Performance (vs. average)

| Metric | Llemma 7b | Average |
| --- | --- | --- |
| Accuracy | 3.5% | 81.3% |
| Net score | 0.00 pts | 153.08 pts |
| Correct | 7 | 163 |
| Incorrect | 37 | 29 |
| Total Cost | $0.65 | $3.09 |
| Average response time | 66.2s | 17.7s |
| Output Tokens | 455K | 414K |
| Reasoning Tokens | 0 | 296K |
| Average confidence | 22.9% | 95.7% |

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
| --- | --- | --- | --- | --- | --- |
| Allergology | 0 | 1 | 2 | 0.0% | 90.8% |
| Anesthesiology and Resuscitation | 0 | 0 | 4 | 0.0% | 87.7% |
| Cardiology | 1 | 7 | 13 | 4.8% | 80.4% |
| Dermatology | 1 | 2 | 11 | 7.1% | 81.0% |
| Endocrinology and Nutrition | 1 | 2 | 16 | 5.3% | 85.1% |
| ENT | 0 | 2 | 5 | 0.0% | 75.1% |
| Epidemiology | 1 | 1 | 6 | 12.5% | 89.7% |
| Gastroenterology | 1 | 4 | 17 | 4.5% | 71.5% |
| Genetics | 0 | 1 | 6 | 0.0% | 87.1% |
| Geriatrics | 0 | 0 | 10 | 0.0% | 87.7% |
| Gynecology and Obstetrics | 0 | 2 | 12 | 0.0% | 82.0% |
| Health Planning and Management | 0 | 1 | 1 | 0.0% | 75.1% |
| Hematology | 0 | 2 | 11 | 0.0% | 82.4% |
| Immunology | 0 | 1 | 7 | 0.0% | 89.7% |
| Infectious Diseases | 0 | 6 | 17 | 0.0% | 82.5% |
| Legal Medicine and Bioethics | 0 | 1 | 1 | 0.0% | 91.8% |
| Medical Oncology | 1 | 5 | 15 | 4.8% | 80.9% |
| Nephrology | 0 | 1 | 12 | 0.0% | 81.8% |
| Neurology | 2 | 3 | 17 | 9.1% | 84.5% |
| Ophthalmology | 2 | 0 | 3 | 40.0% | 81.3% |
| Palliative Care | 0 | 1 | 3 | 0.0% | 88.6% |
| Pediatrics | 0 | 4 | 13 | 0.0% | 82.9% |
| Pharmacology | 1 | 3 | 19 | 4.3% | 85.8% |
| Psychiatry | 0 | 2 | 8 | 0.0% | 90.0% |
| Pulmonology | 1 | 2 | 16 | 5.3% | 81.6% |
| Radiology-Emergency | 0 | 1 | 13 | 0.0% | 66.0% |
| Rheumatology | 0 | 2 | 12 | 0.0% | 82.4% |
| Statistics | 0 | 1 | 2 | 0.0% | 91.6% |
| Traumatology | 0 | 5 | 10 | 0.0% | 75.4% |
| Urology | 0 | 1 | 5 | 0.0% | 79.0% |

Question Type Breakdown

| Question Type | Correct | Incorrect | Unanswered | Accuracy | Average |
| --- | --- | --- | --- | --- | --- |
| Anatomy | 0 | 1 | 5 | 0.0% | 81.1% |
| Biostatistics | 0 | 1 | 4 | 0.0% | 91.3% |
| Diagnosis | 3 | 9 | 61 | 4.1% | 80.0% |
| Epidemiology | 1 | 2 | 9 | 8.3% | 82.1% |
| Ethics | 0 | 1 | 0 | 0.0% | 94.0% |
| Interpretation | 1 | 3 | 33 | 2.7% | 70.5% |
| Pathophysiology | 0 | 9 | 24 | 0.0% | 86.1% |
| Pharmacology | 1 | 2 | 22 | 4.0% | 84.7% |
| Prevention | 1 | 2 | 9 | 8.3% | 90.3% |
| Prognosis | 1 | 0 | 6 | 14.3% | 84.6% |
| Risk | 0 | 1 | 12 | 0.0% | 84.5% |
| Tests | 0 | 4 | 17 | 0.0% | 75.0% |
| Treatment | 4 | 16 | 51 | 5.6% | 82.1% |
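
Answers by Question

| # | Answer | Correct | Status |
| --- | --- | --- | --- |
| 1 | | B | |
| 2 | | D | |
| 3 | | B | |
| 4 | | C | |
| 5 | | C | |
| 6 | | B | |
| 7 | | D | |
| 8 | | C | |
| 9 | | A | |
| 10 | | D | |
| 11 | | D | |
| 12 | D | A | |
| 13 | | C | |
| 14 | | A | |
| 15 | | B | |
| 16 | | A | |
| 17 | D | C | |
| 18 | | A | |
| 19 | | B | |
| 20 | | C | |
| 21 | | D | |
| 22 | | B | |
| 23 | A | A | |
| 24 | | A | |
| 25 | | C | |
| 26 | C | B | |
| 27 | | C | |
| 28 | | A | |
| 29 | D | B | |
| 30 | | C | |
| 31 | A | D | |
| 32 | B | A | |
| 33 | | C | |
| 34 | | B | |
| 35 | | D | |
| 36 | | D | |
| 37 | | A | |
| 38 | | A | |
| 39 | | C | |
| 40 | | B | |
| 41 | | C | |
| 42 | | D | |
| 43 | | A | |
| 44 | | D | |
| 45 | C | D | |
| 46 | | B | |
| 47 | | C | |
| 48 | | C | |
| 49 | | B | |
| 50 | | C | |
| 51 | A | A | |
| 52 | B | D | |
| 53 | | C | |
| 54 | | B | |
| 55 | A | C | |
| 56 | | D | |
| 57 | | A | |
| 58 | A | A | |
| 59 | | A | |
| 60 | | A | |
| 61 | | A | |
| 62 | | D | |
| 63 | A | D | |
| 64 | | | Annulled |
| 65 | | D | |
| 66 | | C | |
| 67 | | B | |
| 68 | | | Annulled |
| 69 | | A | |
| 70 | | B | |
| 71 | | B | |
| 72 | B | D | |
| 73 | D | B | |
| 74 | | C | |
| 75 | | B | |
| 76 | | A | |
| 77 | | D | |
| 78 | | C | |
| 79 | | B | |
| 80 | | A | |
| 81 | | C | |
| 82 | | C | |
| 83 | | B | |
| 84 | D | C | |
| 85 | | A | |
| 86 | | A | |
| 87 | | B | |
| 88 | | D | |
| 89 | C | B | |
| 90 | | A | |
| 91 | A | D | |
| 92 | | A | |
| 93 | B | C | |
| 94 | | B | |
| 95 | D | D | |
| 96 | | B | |
| 97 | | B | |
| 98 | C | B | |
| 99 | | A | |
| 100 | C | B | |
| 101 | | A | |
| 102 | | D | |
| 103 | | B | |
| 104 | C | D | |
| 105 | A | B | |
| 106 | | C | |
| 107 | | C | |
| 108 | | B | |
| 109 | | D | |
| 110 | | D | |
| 111 | | B | |
| 112 | | C | |
| 113 | A | | Annulled |
| 114 | | D | |
| 115 | A | D | |
| 116 | C | A | |
| 117 | | D | |
| 118 | | D | |
| 119 | | A | |
| 120 | | C | |
| 121 | | A | |
| 122 | D | B | |
| 123 | | D | |
| 124 | | D | |
| 125 | | B | |
| 126 | | D | |
| 127 | A | A | |
| 128 | | B | |
| 129 | B | D | |
| 130 | | C | |
| 131 | | C | |
| 132 | | D | |
| 133 | C | A | |
| 134 | A | C | |
| 135 | C | A | |
| 136 | | D | |
| 137 | | A | |
| 138 | | C | |
| 139 | A | A | |
| 140 | | C | |
| 141 | | B | |
| 142 | | C | |
| 143 | | A | |
| 144 | | D | |
| 145 | | C | |
| 146 | | C | |
| 147 | A | C | |
| 148 | | A | |
| 149 | | C | |
| 150 | | D | |
| 151 | | A | |
| 152 | | A | |
| 153 | | C | |
| 154 | | B | |
| 155 | B | D | |
| 156 | | C | |
| 157 | | C | |
| 158 | | D | |
| 159 | | D | |
| 160 | | B | |
| 161 | | B | |
| 162 | | B | |
| 163 | A | B | |
| 164 | D | B | |
| 165 | | A | |
| 166 | | C | |
| 167 | | A | |
| 168 | | B | |
| 169 | | C | |
| 170 | | A | |
| 171 | B | D | |
| 172 | A | B | |
| 173 | | A | |
| 174 | | B | |
| 175 | A | A | |
| 176 | | C | |
| 177 | | C | |
| 178 | | B | |
| 179 | | C | |
| 180 | | | Annulled |
| 181 | | B | |
| 182 | A | D | |
| 183 | | C | |
| 184 | | A | |
| 185 | | C | |
| 186 | | D | |
| 187 | B | A | |
| 188 | | C | |
| 189 | C | D | |
| 190 | | D | |
| 191 | | B | |
| 192 | | B | |
| 193 | | C | |
| 194 | | C | |
| 195 | | C | |
| 196 | | B | |
| 197 | | A | |
| 198 | A | B | |
| 199 | | D | |
| 200 | | A | |
| 201 | | B | |
| 202 | | D | |
| 203 | | B | |
| 204 | | D | |
| 205 | | D | |
| 206 | | | Annulled |
| 207 | | A | |
| 208 | | A | |
| 209 | | B | |
| 210 | A | D | |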