MedicalBenchmark
AlfredPros: CodeLLaMa 7B Instruct Solidity provider

CodeLLaMa 7B Instruct Solidity

Rank: #318 of 320 models, MIR 2024

Net score: 2.33 pts
Accuracy: 5.5%
Correct / Incorrect: 11 / 26
Total Cost: $0.35

Overall Performance (vs. average)

Metric                  This model   Average
Accuracy                5.5%         81.3%
Net score               2.33 pts     153.08 pts
Correct                 11           163
Incorrect               26           29
Total Cost              $0.35        $3.09
Average response time   31.1s        17.7s
Output Tokens           209K         414K
Reasoning Tokens        0            296K
Average confidence      24.0%        95.7%
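
The page does not state how the net score and accuracy are derived. The reported figures are consistent with a penalised rule in which each correct answer is worth one point and each incorrect answer costs a third of a point, with accuracy taken over 200 scored questions. Both the one-third penalty and the 200-question denominator are assumptions inferred from the numbers above, not something the page confirms. A minimal sketch under those assumptions (the function names are hypothetical):

```python
from fractions import Fraction


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> Fraction:
    """Net score under an ASSUMED rule: +1 per correct, -penalty per incorrect.

    Unanswered questions contribute nothing under this assumption.
    """
    return correct - incorrect * penalty


def accuracy(correct: int, scored_questions: int) -> float:
    """Share of scored questions answered correctly (unanswered count as misses)."""
    return correct / scored_questions


# Figures reported for this model: 11 correct, 26 incorrect.
score = net_score(11, 26)
print(f"Net score: {float(score):.2f} pts")   # 2.33 pts
# 200 scored questions is an assumption that reproduces the reported 5.5%.
print(f"Accuracy: {accuracy(11, 200):.1%}")   # 5.5%
```

Using `Fraction` keeps the one-third penalty exact until the final rounding step.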

Subject Breakdown

Subject                            Correct  Incorrect  Unanswered  Accuracy  Average
Allergology                        0        1          2           0.0%      90.8%
Anesthesiology and Resuscitation   0        0          4           0.0%      87.7%
Cardiology                         0        2          19          0.0%      80.4%
Dermatology                        0        1          13          0.0%      81.0%
Endocrinology and Nutrition        0        3          16          0.0%      85.1%
ENT                                0        3          4           0.0%      75.1%
Epidemiology                       1        1          6           12.5%     89.7%
Gastroenterology                   0        5          17          0.0%      71.5%
Genetics                           0        1          6           0.0%      87.1%
Geriatrics                         1        0          9           10.0%     87.7%
Gynecology and Obstetrics          1        2          11          7.1%      82.0%
Health Planning and Management     0        0          2           0.0%      75.1%
Hematology                         2        0          11          15.4%     82.4%
Immunology                         0        1          7           0.0%      89.7%
Infectious Diseases                1        2          20          4.3%      82.5%
Legal Medicine and Bioethics       0        0          2           0.0%      91.8%
Medical Oncology                   2        3          16          9.5%      80.9%
Nephrology                         1        0          12          7.7%      81.8%
Neurology                          3        3          16          13.6%     84.5%
Ophthalmology                      0        1          4           0.0%      81.3%
Palliative Care                    0        1          3           0.0%      88.6%
Pediatrics                         1        0          16          5.9%      82.9%
Pharmacology                       3        0          20          13.0%     85.8%
Psychiatry                         0        1          9           0.0%      90.0%
Pulmonology                        2        2          15          10.5%     81.6%
Radiology-Emergency                0        2          12          0.0%      66.0%
Rheumatology                       0        1          13          0.0%      82.4%
Statistics                         1        0          2           33.3%     91.6%
Traumatology                       0        3          12          0.0%      75.4%
Urology                            1        1          4           16.7%     79.0%
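
The per-subject accuracy figures appear to count unanswered questions in the denominator, that is, correct divided by correct plus incorrect plus unanswered. This is inferred from the rows above rather than stated by the page. A short sketch that reproduces a few of the reported percentages under that assumption (the function name is hypothetical):

```python
def category_accuracy(correct: int, incorrect: int, unanswered: int) -> float:
    """Accuracy in percent, ASSUMING unanswered questions count against the model."""
    total = correct + incorrect + unanswered
    return 100 * correct / total if total else 0.0


# (correct, incorrect, unanswered) copied from the subject breakdown.
rows = {
    "Epidemiology": (1, 1, 6),
    "Geriatrics": (1, 0, 9),
    "Statistics": (1, 0, 2),
    "Urology": (1, 1, 4),
}

for subject, counts in rows.items():
    print(f"{subject}: {category_accuracy(*counts):.1f}%")
# Epidemiology: 12.5%
# Geriatrics: 10.0%
# Statistics: 33.3%
# Urology: 16.7%
```

Under this reading, the low accuracy is driven mostly by the large number of unanswered questions rather than by wrong answers.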

Question Type Breakdown

Question type     Correct  Incorrect  Unanswered  Accuracy  Average
Anatomy           0        3          3           0.0%      81.1%
Biostatistics     1        0          4           20.0%     91.3%
Diagnosis         3        13         57          4.1%      80.0%
Epidemiology      1        1          10          8.3%      82.1%
Ethics            0        0          1           0.0%      94.0%
Interpretation    3        5          29          8.1%      70.5%
Pathophysiology   2        4          27          6.1%      86.1%
Pharmacology      2        1          22          8.0%      84.7%
Prevention        0        1          11          0.0%      90.3%
Prognosis         1        0          6           14.3%     84.6%
Risk              1        0          12          7.7%      84.5%
Tests             2        3          16          9.5%      75.0%
Treatment         2        10         59          2.8%      82.1%
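
Answers

#    Answer  Correct  Status
1    D       B
2    A       D
3    -       B
4    -       C
5    -       C
6    -       B
7    C       D
8    -       C
9    -       A
10   D       D
11   -       D
12   -       A
13   -       C
14   -       A
15   -       B
16   A       A
17   -       C
18   D       A
19   -       B
20   C       C
21   -       D
22   -       B
23   -       A
24   -       A
25   -       C
26   -       B
27   A       C
28   -       A
29   -       B
30   A       C
31   -       D
32   -       A
33   -       C
34   -       B
35   -       D
36   -       D
37   -       A
38   -       A
39   A       C
40   -       B
41   -       C
42   -       D
43   C       A
44   -       D
45   -       D
46   B       B
47   -       C
48   -       C
49   -       B
50   -       C
51   -       A
52   D       D
53   -       C
54   -       B
55   -       C
56   -       D
57   C       A
58   C       A
59   -       A
60   -       A
61   B       A
62   -       D
63   A       D
64   -       -        Annulled
65   -       D
66   -       C
67   -       B
68   -       -        Annulled
69   -       A
70   -       B
71   D       B
72   -       D
73   -       B
74   -       C
75   -       B
76   -       A
77   -       D
78   -       C
79   -       B
80   -       A
81   -       C
82   -       C
83   -       B
84   -       C
85   -       A
86   A       A
87   D       B
88   -       D
89   -       B
90   -       A
91   -       D
92   -       A
93   -       C
94   -       B
95   C       D
96   -       B
97   -       B
98   -       B
99   -       A
100  -       B
101  D       A
102  -       D
103  -       B
104  -       D
105  -       B
106  C       C
107  -       C
108  -       B
109  -       D
110  C       D
111  -       B
112  -       C
113  -       -        Annulled
114  -       D
115  -       D
116  C       A
117  -       D
118  -       D
119  -       A
120  -       C
121  -       A
122  -       B
123  -       D
124  D       D
125  -       B
126  A       D
127  -       A
128  -       B
129  -       D
130  -       C
131  -       C
132  -       D
133  C       A
134  -       C
135  -       A
136  B       D
137  -       A
138  -       C
139  -       A
140  -       C
141  -       B
142  -       C
143  -       A
144  D       D
145  -       C
146  C       C
147  -       C
148  -       A
149  -       C
150  -       D
151  -       A
152  -       A
153  -       C
154  -       B
155  -       D
156  -       C
157  -       C
158  -       D
159  -       D
160  -       B
161  -       B
162  -       B
163  D       B
164  -       B
165  -       A
166  -       C
167  D       A
168  -       B
169  B       C
170  -       A
171  -       D
172  -       B
173  -       A
174  -       B
175  -       A
176  -       C
177  -       C
178  -       B
179  -       C
180  -       -        Annulled
181  -       B
182  -       D
183  A       C
184  -       A
185  -       C
186  -       D
187  -       A
188  -       C
189  -       D
190  -       D
191  -       B
192  -       B
193  -       C
194  -       C
195  -       C
196  -       B
197  A       A
198  -       B
199  -       D
200  -       A
201  -       B
202  -       D
203  C       B
204  -       D
205  -       D
206  -       -        Annulled
207  -       A
208  -       A
209  C       B
210  -       D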