MedicalBenchmark
Provider: AlfredPros

CodeLLaMa 7B Instruct Solidity

Rank: #290 of 291 models (MIR 2024)

Net score: 2.33 pts
Accuracy: 5.5%
Correct / Incorrect: 11 / 26
Total Cost: $0.34
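The page does not say how the net score is derived. The figures are consistent with a negative-marking rule in which each incorrect answer costs one third of a point and unanswered questions cost nothing. The sketch below checks that arithmetic; the function name `net_score` and the one-third penalty are assumptions for illustration, not something the source confirms.

```python
from fractions import Fraction


def net_score(correct: int, incorrect: int, penalty: Fraction = Fraction(1, 3)) -> float:
    """Net score under an ASSUMED rule: +1 per correct answer, -penalty per incorrect one.

    Unanswered questions are assumed to contribute nothing.
    """
    return float(correct - incorrect * penalty)


# Figures taken from the summary above: 11 correct, 26 incorrect.
print(f"{net_score(11, 26):.2f} pts")  # prints "2.33 pts"
```

Under this assumed rule, a model that leaves most questions unanswered loses few points to the penalty but also gains few, which fits the low net score reported here.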

Overall Performance (vs. average)

| Metric | This model | Average |
| --- | --- | --- |
| Accuracy | 5.5% | 80.5% |
| Net score | 2.33 pts | 150.85 pts |
| Correct | 11 | 161 |
| Incorrect | 26 | 30 |
| Total cost | $0.34 | $3.32 |
| Average response time | 30.6s | 16.4s |
| Output tokens | 205K | 427K |
| Reasoning tokens | 0 | 310K |
| Average confidence | 24.0% | 95.4% |

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
| --- | --- | --- | --- | --- | --- |
| Allergology | 0 | 1 | 2 | 0.0% | 90.5% |
| Anesthesiology and Resuscitation | 0 | 0 | 4 | 0.0% | 87.1% |
| Cardiology | 0 | 2 | 19 | 0.0% | 79.7% |
| Dermatology | 0 | 1 | 13 | 0.0% | 80.2% |
| Endocrinology and Nutrition | 0 | 3 | 16 | 0.0% | 84.2% |
| ENT | 0 | 3 | 4 | 0.0% | 74.4% |
| Epidemiology | 1 | 1 | 6 | 12.5% | 89.3% |
| Gastroenterology | 0 | 5 | 17 | 0.0% | 70.5% |
| Genetics | 0 | 1 | 6 | 0.0% | 86.5% |
| Geriatrics | 1 | 0 | 9 | 10.0% | 86.9% |
| Gynecology and Obstetrics | 1 | 2 | 11 | 7.1% | 81.2% |
| Health Planning and Management | 0 | 0 | 2 | 0.0% | 73.2% |
| Hematology | 2 | 0 | 11 | 15.4% | 81.5% |
| Immunology | 0 | 1 | 7 | 0.0% | 89.1% |
| Infectious Diseases | 1 | 2 | 20 | 4.3% | 81.8% |
| Legal Medicine and Bioethics | 0 | 0 | 2 | 0.0% | 91.7% |
| Medical Oncology | 2 | 3 | 16 | 9.5% | 80.2% |
| Nephrology | 1 | 0 | 12 | 7.7% | 80.8% |
| Neurology | 3 | 3 | 16 | 13.6% | 83.7% |
| Ophthalmology | 0 | 1 | 4 | 0.0% | 80.0% |
| Palliative Care | 0 | 1 | 3 | 0.0% | 88.2% |
| Pediatrics | 1 | 0 | 16 | 5.9% | 82.0% |
| Pharmacology | 3 | 0 | 20 | 13.0% | 85.4% |
| Psychiatry | 0 | 1 | 9 | 0.0% | 89.5% |
| Pulmonology | 2 | 2 | 15 | 10.5% | 80.6% |
| Radiology-Emergency | 0 | 2 | 12 | 0.0% | 64.9% |
| Rheumatology | 0 | 1 | 13 | 0.0% | 81.4% |
| Statistics | 1 | 0 | 2 | 33.3% | 91.1% |
| Traumatology | 0 | 3 | 12 | 0.0% | 74.5% |
| Urology | 1 | 1 | 4 | 16.7% | 78.2% |
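The Accuracy figures in the subject breakdown appear to be the number of correct answers divided by all questions in the subject, with unanswered questions counted in the denominator. This is inferred from the numbers rather than stated on the page. A minimal sketch of that calculation, using a hypothetical helper `accuracy_pct` and a few rows copied from the breakdown:

```python
def accuracy_pct(correct: int, incorrect: int, unanswered: int) -> float:
    """Accuracy as a percentage, ASSUMING unanswered questions count against the model."""
    total = correct + incorrect + unanswered
    if total == 0:
        return 0.0
    return 100.0 * correct / total


# (correct, incorrect, unanswered) for a few subjects from the breakdown.
rows = {
    "Epidemiology": (1, 1, 6),
    "Geriatrics": (1, 0, 9),
    "Statistics": (1, 0, 2),
    "Urology": (1, 1, 4),
}

for subject, counts in rows.items():
    print(f"{subject}: {accuracy_pct(*counts):.1f}%")
```

Counting unanswered questions in the denominator explains why subjects with no incorrect answers, such as Geriatrics, still show low accuracy.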

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average |
| --- | --- | --- | --- | --- | --- |
| Anatomy | 0 | 3 | 3 | 0.0% | 79.8% |
| Biostatistics | 1 | 0 | 4 | 20.0% | 90.7% |
| Diagnosis | 3 | 13 | 57 | 4.1% | 79.2% |
| Epidemiology | 1 | 1 | 10 | 8.3% | 81.2% |
| Ethics | 0 | 0 | 1 | 0.0% | 94.5% |
| Interpretation | 3 | 5 | 29 | 8.1% | 69.6% |
| Pathophysiology | 2 | 4 | 27 | 6.1% | 85.4% |
| Pharmacology | 2 | 1 | 22 | 8.0% | 84.0% |
| Prevention | 0 | 1 | 11 | 0.0% | 89.8% |
| Prognosis | 1 | 0 | 6 | 14.3% | 83.9% |
| Risk | 1 | 0 | 12 | 7.7% | 83.6% |
| Tests | 2 | 3 | 16 | 9.5% | 73.9% |
| Treatment | 2 | 10 | 59 | 2.8% | 81.3% |
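
Answers

| # | Answer | Correct | Status |
| --- | --- | --- | --- |
| 1 | D | B | |
| 2 | A | D | |
| 3 | - | B | |
| 4 | - | C | |
| 5 | - | C | |
| 6 | - | B | |
| 7 | C | D | |
| 8 | - | C | |
| 9 | - | A | |
| 10 | D | D | |
| 11 | - | D | |
| 12 | - | A | |
| 13 | - | C | |
| 14 | - | A | |
| 15 | - | B | |
| 16 | A | A | |
| 17 | - | C | |
| 18 | D | A | |
| 19 | - | B | |
| 20 | C | C | |
| 21 | - | D | |
| 22 | - | B | |
| 23 | - | A | |
| 24 | - | A | |
| 25 | - | C | |
| 26 | - | B | |
| 27 | A | C | |
| 28 | - | A | |
| 29 | - | B | |
| 30 | A | C | |
| 31 | - | D | |
| 32 | - | A | |
| 33 | - | C | |
| 34 | - | B | |
| 35 | - | D | |
| 36 | - | D | |
| 37 | - | A | |
| 38 | - | A | |
| 39 | A | C | |
| 40 | - | B | |
| 41 | - | C | |
| 42 | - | D | |
| 43 | C | A | |
| 44 | - | D | |
| 45 | - | D | |
| 46 | B | B | |
| 47 | - | C | |
| 48 | - | C | |
| 49 | - | B | |
| 50 | - | C | |
| 51 | - | A | |
| 52 | D | D | |
| 53 | - | C | |
| 54 | - | B | |
| 55 | - | C | |
| 56 | - | D | |
| 57 | C | A | |
| 58 | C | A | |
| 59 | - | A | |
| 60 | - | A | |
| 61 | B | A | |
| 62 | - | D | |
| 63 | A | D | |
| 64 | - | - | Annulled |
| 65 | - | D | |
| 66 | - | C | |
| 67 | - | B | |
| 68 | - | - | Annulled |
| 69 | - | A | |
| 70 | - | B | |
| 71 | D | B | |
| 72 | - | D | |
| 73 | - | B | |
| 74 | - | C | |
| 75 | - | B | |
| 76 | - | A | |
| 77 | - | D | |
| 78 | - | C | |
| 79 | - | B | |
| 80 | - | A | |
| 81 | - | C | |
| 82 | - | C | |
| 83 | - | B | |
| 84 | - | C | |
| 85 | - | A | |
| 86 | A | A | |
| 87 | D | B | |
| 88 | - | D | |
| 89 | - | B | |
| 90 | - | A | |
| 91 | - | D | |
| 92 | - | A | |
| 93 | - | C | |
| 94 | - | B | |
| 95 | C | D | |
| 96 | - | B | |
| 97 | - | B | |
| 98 | - | B | |
| 99 | - | A | |
| 100 | - | B | |
| 101 | D | A | |
| 102 | - | D | |
| 103 | - | B | |
| 104 | - | D | |
| 105 | - | B | |
| 106 | C | C | |
| 107 | - | C | |
| 108 | - | B | |
| 109 | - | D | |
| 110 | C | D | |
| 111 | - | B | |
| 112 | - | C | |
| 113 | - | - | Annulled |
| 114 | - | D | |
| 115 | - | D | |
| 116 | C | A | |
| 117 | - | D | |
| 118 | - | D | |
| 119 | - | A | |
| 120 | - | C | |
| 121 | - | A | |
| 122 | - | B | |
| 123 | - | D | |
| 124 | D | D | |
| 125 | - | B | |
| 126 | A | D | |
| 127 | - | A | |
| 128 | - | B | |
| 129 | - | D | |
| 130 | - | C | |
| 131 | - | C | |
| 132 | - | D | |
| 133 | C | A | |
| 134 | - | C | |
| 135 | - | A | |
| 136 | B | D | |
| 137 | - | A | |
| 138 | - | C | |
| 139 | - | A | |
| 140 | - | C | |
| 141 | - | B | |
| 142 | - | C | |
| 143 | - | A | |
| 144 | D | D | |
| 145 | - | C | |
| 146 | C | C | |
| 147 | - | C | |
| 148 | - | A | |
| 149 | - | C | |
| 150 | - | D | |
| 151 | - | A | |
| 152 | - | A | |
| 153 | - | C | |
| 154 | - | B | |
| 155 | - | D | |
| 156 | - | C | |
| 157 | - | C | |
| 158 | - | D | |
| 159 | - | D | |
| 160 | - | B | |
| 161 | - | B | |
| 162 | - | B | |
| 163 | D | B | |
| 164 | - | B | |
| 165 | - | A | |
| 166 | - | C | |
| 167 | D | A | |
| 168 | - | B | |
| 169 | B | C | |
| 170 | - | A | |
| 171 | - | D | |
| 172 | - | B | |
| 173 | - | A | |
| 174 | - | B | |
| 175 | - | A | |
| 176 | - | C | |
| 177 | - | C | |
| 178 | - | B | |
| 179 | - | C | |
| 180 | - | - | Annulled |
| 181 | - | B | |
| 182 | - | D | |
| 183 | A | C | |
| 184 | - | A | |
| 185 | - | C | |
| 186 | - | D | |
| 187 | - | A | |
| 188 | - | C | |
| 189 | - | D | |
| 190 | - | D | |
| 191 | - | B | |
| 192 | - | B | |
| 193 | - | C | |
| 194 | - | C | |
| 195 | - | C | |
| 196 | - | B | |
| 197 | A | A | |
| 198 | - | B | |
| 199 | - | D | |
| 200 | - | A | |
| 201 | - | B | |
| 202 | - | D | |
| 203 | C | B | |
| 204 | - | D | |
| 205 | - | D | |
| 206 | - | - | Annulled |
| 207 | - | A | |
| 208 | - | A | |
| 209 | C | B | |
| 210 | - | D | |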