MedicalBenchmark
Provider: OpenAI

GPT-3.5 Turbo Instruct

#258 of 291 models (MIR 2024)

Net score: 71.00 pts
Accuracy: 51.0%
Correct / Incorrect: 102 / 93
Total Cost: $0.30

Overall Performance (vs. average)

Metric | GPT-3.5 Turbo Instruct | Average
Accuracy | 51.0% | 80.5%
Net score | 71.00 pts | 150.85 pts
Correct | 102 | 161
Incorrect | 93 | 30
Total Cost | $0.30 | $3.32
Average response time | 3.6s | 16.4s
Output Tokens | 78K | 427K
Reasoning Tokens | 0 | 310K
Average confidence | 97.8% | 95.4%
Subject Breakdown

Subject | Correct | Incorrect | Unanswered | Accuracy | Average accuracy
Allergology | 2 | 1 | 0 | 66.7% | 90.5%
Anesthesiology and Resuscitation | 4 | 0 | 0 | 100.0% | 87.1%
Cardiology | 11 | 10 | 0 | 52.4% | 79.7%
Dermatology | 9 | 4 | 1 | 64.3% | 80.2%
Endocrinology and Nutrition | 9 | 9 | 1 | 47.4% | 84.2%
ENT | 5 | 2 | 0 | 71.4% | 74.4%
Epidemiology | 5 | 2 | 1 | 62.5% | 89.3%
Gastroenterology | 9 | 12 | 1 | 40.9% | 70.5%
Genetics | 4 | 2 | 1 | 57.1% | 86.5%
Geriatrics | 4 | 6 | 0 | 40.0% | 86.9%
Gynecology and Obstetrics | 8 | 6 | 0 | 57.1% | 81.2%
Health Planning and Management | 1 | 1 | 0 | 50.0% | 73.2%
Hematology | 6 | 7 | 0 | 46.2% | 81.5%
Immunology | 4 | 3 | 1 | 50.0% | 89.1%
Infectious Diseases | 14 | 9 | 0 | 60.9% | 81.8%
Legal Medicine and Bioethics | 2 | 0 | 0 | 100.0% | 91.7%
Medical Oncology | 14 | 7 | 0 | 66.7% | 80.2%
Nephrology | 4 | 8 | 1 | 30.8% | 80.8%
Neurology | 10 | 12 | 0 | 45.5% | 83.7%
Ophthalmology | 2 | 3 | 0 | 40.0% | 80.0%
Palliative Care | 2 | 2 | 0 | 50.0% | 88.2%
Pediatrics | 10 | 7 | 0 | 58.8% | 82.0%
Pharmacology | 17 | 6 | 0 | 73.9% | 85.4%
Psychiatry | 8 | 2 | 0 | 80.0% | 89.5%
Pulmonology | 11 | 7 | 1 | 57.9% | 80.6%
Radiology-Emergency | 7 | 7 | 0 | 50.0% | 64.9%
Rheumatology | 5 | 8 | 1 | 35.7% | 81.4%
Statistics | 2 | 0 | 1 | 66.7% | 91.1%
Traumatology | 4 | 11 | 0 | 26.7% | 74.5%
Urology | 2 | 4 | 0 | 33.3% | 78.2%
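
The accuracy figures in this breakdown appear to count unanswered questions in the denominator. Dermatology, for example, has 9 correct out of 9 + 4 + 1 = 14 questions, which gives 64.3%. The sketch below reproduces that calculation for a few rows; the formula is inferred from the numbers rather than stated by the page, and `accuracy_pct` is a hypothetical helper name.

```python
def accuracy_pct(correct: int, incorrect: int, unanswered: int = 0) -> float:
    """Percentage of correct answers over all questions in a category.

    Assumes unanswered questions count toward the denominator, which is
    what the per-subject rows on the page suggest.
    """
    total = correct + incorrect + unanswered
    if total == 0:
        raise ValueError("category contains no questions")
    return round(100 * correct / total, 1)


# (correct, incorrect, unanswered) copied from three rows of the breakdown.
rows = {
    "Allergology": (2, 1, 0),
    "Dermatology": (9, 4, 1),
    "Nephrology": (4, 8, 1),
}

for subject, counts in rows.items():
    print(f"{subject}: {accuracy_pct(*counts):.1f}%")
# Allergology: 66.7%
# Dermatology: 64.3%
# Nephrology: 30.8%
```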

Question Type Breakdown

Question type | Correct | Incorrect | Unanswered | Accuracy | Average accuracy
Anatomy | 2 | 4 | 0 | 33.3% | 79.8%
Biostatistics | 3 | 1 | 1 | 60.0% | 90.7%
Diagnosis | 34 | 37 | 2 | 46.6% | 79.2%
Epidemiology | 7 | 5 | 0 | 58.3% | 81.2%
Ethics | 1 | 0 | 0 | 100.0% | 94.5%
Interpretation | 15 | 22 | 0 | 40.5% | 69.6%
Pathophysiology | 17 | 13 | 3 | 51.5% | 85.4%
Pharmacology | 16 | 9 | 0 | 64.0% | 84.0%
Prevention | 9 | 3 | 0 | 75.0% | 89.8%
Prognosis | 4 | 3 | 0 | 57.1% | 83.9%
Risk | 8 | 5 | 0 | 61.5% | 83.6%
Tests | 10 | 9 | 2 | 47.6% | 73.9%
Treatment | 35 | 36 | 0 | 49.3% | 81.3%

# Answer Correct Status
1 B B
2 C D
3 D B
4 C C
5 C C
6 D B
7 D D
8 C C
9 B A
10 D D
11 C D
12 B A
13 C C
14 B A
15 D B
16 D A
17 C C
18 B A
19 B B
20 C C
21 B D
22 D B
23 A A
24 C A
25 A C
26 B B
27 A C
28 D A
29 A B
30 D C
31 D D
32 B A
33 C C
34 A B
35 D D
36 B D
37 D A
38 A
39 C C
40 B B
41 C C
42 A D
43 D A
44 D D
45 D D
46 B
47 C C
48 C C
49 B B
50 B C
51 A A
52 B D
53 C C
54 B B
55 C C
56 B D
57 C A
58 D A
59 A A
60 C A
61 A A
62 D D
63 D D
64 B Annulled
65 D D
66 C C
67 D B
68 B Annulled
69 A A
70 B B
71 A B
72 D D
73 D B
74 C C
75 B B
76 D A
77 D D
78 D C
79 D B
80 A A
81 C C
82 C C
83 D B
84 C C
85 A A
86 A A
87 D B
88 D D
89 B B
90 A A
91 D D
92 C A
93 C C
94 D B
95 C D
96 B B
97 D B
98 D B
99 B A
100 B B
101 A A
102 B D
103 D B
104 C D
105 B B
106 B C
107 C C
108 D B
109 D D
110 A D
111 B B
112 C C
113 B Annulled
114 B D
115 D D
116 A A
117 D D
118 D D
119 C A
120 D C
121 A A
122 A B
123 A D
124 D D
125 B B
126 D D
127 C A
128 B
129 D D
130 C C
131 C C
132 D D
133 C A
134 C C
135 D A
136 D D
137 D A
138 A C
139 A A
140 C C
141 B
142 B C
143 A A
144 A D
145 A C
146 B C
147 C C
148 B A
149 B C
150 D D
151 A A
152 A A
153 C C
154 A B
155 C D
156 C C
157 D C
158 B D
159 D D
160 A B
161 B B
162 D B
163 B B
164 D B
165 C A
166 A C
167 C A
168 B B
169 C C
170 C A
171 C D
172 B B
173 B A
174 D B
175 A A
176 D C
177 B C
178 B B
179 C C
180 D Annulled
181 D B
182 D D
183 A C
184 C A
185 C C
186 D D
187 A A
188 C C
189 C D
190 D D
191 B B
192 D B
193 D C
194 A C
195 A C
196 B B
197 D A
198 B B
199 C D
200 A A
201 B B
202 D D
203 C B
204 D
205 D D
206 B Annulled
207 A A
208 D A
209 D B
210 A D