MedicalBenchmark
Provider: AlfredPros

CodeLLaMa 7B Instruct Solidity

Rank: #319 of 319 models (MIR 2025)

Net score: 0.00 pts
Accuracy: 3.5%
Correct / Incorrect: 7 / 22
Total Cost: $0.37

Overall Performance (vs. average)

Metric                 Model      Average
Accuracy               3.5%       77.9%
Net score              0.00 pts   143.96 pts
Correct                7          156
Incorrect              22         35
Total Cost             $0.37      $3.36
Average response time  33.4s      19.0s
Output Tokens          226K       430K
Reasoning Tokens       0          306K
Average confidence     16.4%      95.2%

Subject Breakdown

Subject                           Correct  Incorrect  Unanswered  Accuracy  Average
Allergology                       0        0          4           0.0%      87.9%
Anesthesiology and Resuscitation  0        0          6           0.0%      82.3%
Cardiology                        0        4          18          0.0%      78.6%
Dermatology                       0        0          12          0.0%      69.4%
Endocrinology and Nutrition       1        3          12          6.3%      83.5%
ENT                               0        1          7           0.0%      74.8%
Epidemiology                      1        0          6           14.3%     69.1%
Gastroenterology                  0        3          18          0.0%      74.1%
Genetics                          0        0          6           0.0%      69.5%
Geriatrics                        1        0          10          9.1%      77.5%
Gynecology and Obstetrics         1        3          15          5.3%      86.7%
Health Planning and Management    0        0          2           0.0%      82.6%
Hematology                        0        1          10          0.0%      82.7%
Immunology                        1        0          8           11.1%     83.3%
Infectious Diseases               0        4          23          0.0%      74.9%
Legal Medicine and Bioethics      1        0          4           20.0%     68.4%
Medical Oncology                  2        2          21          8.0%      87.2%
Nephrology                        1        4          9           7.1%      84.8%
Neurology                         0        2          18          0.0%      77.3%
Ophthalmology                     0        0          5           0.0%      74.2%
Palliative Care                   2        0          2           50.0%     78.6%
Pediatrics                        1        6          19          3.8%      71.9%
Pharmacology                      3        1          13          17.6%     74.1%
Psychiatry                        0        0          8           0.0%      83.0%
Pulmonology                       0        0          14          0.0%      80.4%
Radiology-Emergency               0        2          12          0.0%      69.4%
Rheumatology                      0        2          13          0.0%      76.6%
Statistics                        1        0          2           33.3%     76.6%
Traumatology                      1        1          16          5.6%      79.3%
Urology                           0        0          7           0.0%      80.7%
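
The per-subject Accuracy figures are consistent with dividing the correct answers by all questions in the subject, with unanswered questions counted as misses. The following is a minimal Python sketch that reproduces a few rows under that assumption; the function name `accuracy_pct` is illustrative and not part of the benchmark.

```python
def accuracy_pct(correct: int, incorrect: int, unanswered: int) -> float:
    """Share of correct answers over all questions in a category, in percent.

    Assumption: unanswered questions count in the denominator.
    """
    total = correct + incorrect + unanswered
    if total == 0:
        return 0.0
    return 100.0 * correct / total


# Counts copied from the subject breakdown: (correct, incorrect, unanswered).
rows = {
    "Epidemiology": (1, 0, 6),
    "Pediatrics": (1, 6, 19),
    "Pharmacology": (3, 1, 13),
    "Palliative Care": (2, 0, 2),
}

for subject, counts in rows.items():
    print(f"{subject}: {accuracy_pct(*counts):.1f}%")
```

Under this reading, a model that declines to answer most questions scores low on accuracy even when few of its submitted answers are wrong.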

Question Type Breakdown

Question Type    Correct  Incorrect  Unanswered  Accuracy  Average
Anatomy          0        0          7           0.0%      78.6%
Biostatistics    1        0          3           25.0%     79.8%
Diagnosis        3        12         73          3.4%      79.9%
Epidemiology     0        1          4           0.0%      76.7%
Ethics           1        0          2           33.3%     74.1%
Interpretation   2        8          32          4.8%      70.7%
Legal            0        0          4           0.0%      64.6%
Pathophysiology  0        4          23          0.0%      76.1%
Pharmacology     2        1          10          15.4%     83.3%
Prevention       0        0          12          0.0%      75.6%
Prognosis        0        0          7           0.0%      80.8%
Risk             0        1          4           0.0%      85.2%
Tests            0        2          25          0.0%      77.9%
Treatment        3        8          70          3.7%      77.3%

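#    Answer  Correct  Status
1    -       B        Unanswered
2    -       A        Unanswered
3    -       C        Unanswered
4    A       B        Incorrect
5    D       A        Incorrect
6    A       C        Incorrect
7    -       C        Unanswered
8    A       A        Correct
9    -       A        Unanswered
10   -       D        Unanswered
11   -       D        Unanswered
12   -       D        Unanswered
13   D       B        Incorrect
14   -       D        Unanswered
15   -       -        Annulled
16   -       B        Unanswered
17   -       B        Unanswered
18   -       A        Unanswered
19   -       C        Unanswered
20   C       A        Incorrect
21   -       B        Unanswered
22   -       D        Unanswered
23   -       C        Unanswered
24   -       D        Unanswered
25   -       C        Unanswered
26   -       -        Annulled
27   D       C        Incorrect
28   -       -        Annulled
29   -       D        Unanswered
30   -       B        Unanswered
31   -       D        Unanswered
32   -       A        Unanswered
33   -       D        Unanswered
34   -       D        Unanswered
35   -       B        Unanswered
36   -       D        Unanswered
37   -       C        Unanswered
38   -       C        Unanswered
39   -       D        Unanswered
40   -       A        Unanswered
41   -       D        Unanswered
42   -       C        Unanswered
43   -       B        Unanswered
44   -       D        Unanswered
45   D       D        Correct
46   -       A        Unanswered
47   -       A        Unanswered
48   -       A        Unanswered
49   -       D        Unanswered
50   -       B        Unanswered
51   -       C        Unanswered
52   -       B        Unanswered
53   -       D        Unanswered
54   -       B        Unanswered
55   -       A        Unanswered
56   -       -        Annulled
57   -       C        Unanswered
58   -       B        Unanswered
59   -       D        Unanswered
60   A       A        Correct
61   -       A        Unanswered
62   B       D        Incorrect
63   -       B        Unanswered
64   -       D        Unanswered
65   -       A        Unanswered
66   -       A        Unanswered
67   A       B        Incorrect
68   A       B        Incorrect
69   -       B        Unanswered
70   B       A        Incorrect
71   -       D        Unanswered
72   -       A        Unanswered
73   -       D        Unanswered
74   -       C        Unanswered
75   -       A        Unanswered
76   -       B        Unanswered
77   -       B        Unanswered
78   -       B        Unanswered
79   -       C        Unanswered
80   -       C        Unanswered
81   -       C        Unanswered
82   -       D        Unanswered
83   -       B        Unanswered
84   -       D        Unanswered
85   -       C        Unanswered
86   -       C        Unanswered
87   -       A        Unanswered
88   B       D        Incorrect
89   -       B        Unanswered
90   -       A        Unanswered
91   -       B        Unanswered
92   -       C        Unanswered
93   -       B        Unanswered
94   A       C        Incorrect
95   -       A        Unanswered
96   -       C        Unanswered
97   -       D        Unanswered
98   A       C        Incorrect
99   -       A        Unanswered
100  -       C        Unanswered
101  D       B        Incorrect
102  A       D        Incorrect
103  -       A        Unanswered
104  -       C        Unanswered
105  -       A        Unanswered
106  -       C        Unanswered
107  -       B        Unanswered
108  -       D        Unanswered
109  -       B        Unanswered
110  -       C        Unanswered
111  -       A        Unanswered
112  -       C        Unanswered
113  -       B        Unanswered
114  -       D        Unanswered
115  -       D        Unanswered
116  -       C        Unanswered
117  -       A        Unanswered
118  -       D        Unanswered
119  D       C        Incorrect
120  -       B        Unanswered
121  -       D        Unanswered
122  -       C        Unanswered
123  -       C        Unanswered
124  -       C        Unanswered
125  A       D        Incorrect
126  -       D        Unanswered
127  -       B        Unanswered
128  -       D        Unanswered
129  -       A        Unanswered
130  -       D        Unanswered
131  -       D        Unanswered
132  D       A        Incorrect
133  -       B        Unanswered
134  A       C        Incorrect
135  -       B        Unanswered
136  -       C        Unanswered
137  -       A        Unanswered
138  -       D        Unanswered
139  -       D        Unanswered
140  -       B        Unanswered
141  A       A        Correct
142  -       A        Unanswered
143  -       B        Unanswered
144  -       B        Unanswered
145  -       D        Unanswered
146  -       C        Unanswered
147  -       B        Unanswered
148  -       A        Unanswered
149  -       A        Unanswered
150  -       D        Unanswered
151  -       A        Unanswered
152  -       A        Unanswered
153  -       B        Unanswered
154  D       B        Incorrect
155  -       B        Unanswered
156  -       C        Unanswered
157  A       A        Correct
158  -       C        Unanswered
159  -       C        Unanswered
160  -       A        Unanswered
161  -       A        Unanswered
162  -       -        Annulled
163  -       D        Unanswered
164  -       C        Unanswered
165  -       A        Unanswered
166  A       B        Incorrect
167  -       C        Unanswered
168  -       D        Unanswered
169  -       B        Unanswered
170  -       B        Unanswered
171  -       C        Unanswered
172  -       A        Unanswered
173  -       A        Unanswered
174  -       B        Unanswered
175  -       B        Unanswered
176  -       C        Unanswered
177  -       C        Unanswered
178  -       A        Unanswered
179  -       D        Unanswered
180  -       A        Unanswered
181  -       B        Unanswered
182  C       C        Correct
183  -       B        Unanswered
184  -       B        Unanswered
185  -       B        Unanswered
186  -       -        Annulled
187  -       C        Unanswered
188  -       D        Unanswered
189  D       D        Correct
190  -       A        Unanswered
191  -       B        Unanswered
192  -       A        Unanswered
193  -       C        Unanswered
194  -       A        Unanswered
195  -       A        Unanswered
196  -       A        Unanswered
197  -       B        Unanswered
198  -       C        Unanswered
199  -       D        Unanswered
200  -       C        Unanswered
201  D       B        Incorrect
202  -       A        Unanswered
203  -       D        Unanswered
204  -       C        Unanswered
205  -       B        Unanswered
206  -       D        Unanswered
207  -       A        Unanswered
208  -       C        Unanswered
209  -       C        Unanswered
210  -       B        Unanswered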
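
Each row of the answer table pairs a question number with the model's answer and the correct option. Rows that show a single letter appear to be questions the model left unanswered, a reading that matches the 7 correct / 22 incorrect totals reported at the top. The sketch below shows how a row's status could be derived under that assumption; the `status` helper and the tuple layout are hypothetical, and the sample rows are questions 1, 4, 8, and 15 from the table.

```python
from collections import Counter
from typing import Optional


def status(answer: Optional[str], correct: Optional[str]) -> str:
    """Classify one row; None stands for a value the table does not show."""
    if correct is None:
        # No correct option listed: the question was annulled.
        return "Annulled"
    if answer is None:
        # Assumption: a missing answer means the model did not respond.
        return "Unanswered"
    return "Correct" if answer == correct else "Incorrect"


# (question number, model answer, correct option)
sample = [
    (1, None, "B"),
    (4, "A", "B"),
    (8, "A", "A"),
    (15, None, None),
]

tally = Counter(status(answer, correct) for _, answer, correct in sample)
print(dict(tally))
```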