MedicalBenchmark
Olmo 3 7B Think (provider: AllenAI)

#282 of 320 models, MIR 2024

Net score: 84.00 pts
Accuracy: 56.0%
Correct / Incorrect: 112 / 84
Total Cost: $0.14

Overall Performance (vs. average)

| Metric | Olmo 3 7B Think | Average |
|---|---|---|
| Accuracy | 56.0% | 81.3% |
| Net score | 84.00 pts | 153.08 pts |
| Correct | 112 | 163 |
| Incorrect | 84 | 29 |
| Total Cost | $0.14 | $3.09 |
| Average response time | 16.3s | 17.7s |
| Output Tokens | 647K | 414K |
| Reasoning Tokens | 583K | 296K |
| Average confidence | 97.4% | 95.7% |
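
The headline figures are consistent with a marking scheme in which each wrong answer costs one third of a point and accuracy is taken over 200 scored questions. Both the penalty and the 200-question denominator are assumptions inferred from the numbers on this page, not something the page states, and the function names are illustrative. A minimal sketch:

```python
from fractions import Fraction


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> float:
    """Net score: one point per correct answer minus a penalty per wrong one.

    The 1/3 penalty is an assumption that reproduces the figure on this page.
    """
    return float(correct - incorrect * penalty)


def accuracy(correct: int, scored_questions: int = 200) -> float:
    """Accuracy in percent over the scored questions (200 is an assumption)."""
    return 100.0 * correct / scored_questions


print(f"{net_score(112, 84):.2f} pts")  # 84.00 pts
print(f"{accuracy(112):.1f}%")          # 56.0%
```

Using `Fraction` for the penalty avoids floating-point drift when the number of wrong answers is not a multiple of three.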

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Allergology | 3 | 0 | 0 | 100.0% | 90.8% |
| Anesthesiology and Resuscitation | 1 | 3 | 0 | 25.0% | 87.7% |
| Cardiology | 11 | 9 | 1 | 52.4% | 80.4% |
| Dermatology | 7 | 7 | 0 | 50.0% | 81.0% |
| Endocrinology and Nutrition | 13 | 6 | 0 | 68.4% | 85.1% |
| ENT | 5 | 2 | 0 | 71.4% | 75.1% |
| Epidemiology | 6 | 1 | 1 | 75.0% | 89.7% |
| Gastroenterology | 7 | 14 | 1 | 31.8% | 71.5% |
| Genetics | 5 | 2 | 0 | 71.4% | 87.1% |
| Geriatrics | 7 | 3 | 0 | 70.0% | 87.7% |
| Gynecology and Obstetrics | 5 | 7 | 2 | 35.7% | 82.0% |
| Health Planning and Management | 1 | 1 | 0 | 50.0% | 75.1% |
| Hematology | 8 | 5 | 0 | 61.5% | 82.4% |
| Immunology | 7 | 1 | 0 | 87.5% | 89.7% |
| Infectious Diseases | 12 | 11 | 0 | 52.2% | 82.5% |
| Legal Medicine and Bioethics | 2 | 0 | 0 | 100.0% | 91.8% |
| Medical Oncology | 10 | 10 | 1 | 47.6% | 80.9% |
| Nephrology | 6 | 7 | 0 | 46.2% | 81.8% |
| Neurology | 14 | 7 | 1 | 63.6% | 84.5% |
| Ophthalmology | 4 | 1 | 0 | 80.0% | 81.3% |
| Palliative Care | 2 | 2 | 0 | 50.0% | 88.6% |
| Pediatrics | 9 | 7 | 1 | 52.9% | 82.9% |
| Pharmacology | 13 | 10 | 0 | 56.5% | 85.8% |
| Psychiatry | 8 | 1 | 1 | 80.0% | 90.0% |
| Pulmonology | 12 | 7 | 0 | 63.2% | 81.6% |
| Radiology-Emergency | 8 | 6 | 0 | 57.1% | 66.0% |
| Rheumatology | 7 | 7 | 0 | 50.0% | 82.4% |
| Statistics | 2 | 1 | 0 | 66.7% | 91.6% |
| Traumatology | 8 | 7 | 0 | 53.3% | 75.4% |
| Urology | 4 | 2 | 0 | 66.7% | 79.0% |
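
The per-subject Accuracy figures appear to count unanswered questions in the denominator: for Cardiology, 11 / (11 + 9 + 1) gives 52.4%, which matches the value shown. A minimal sketch of that calculation, assuming the same formula holds for every entry (the function name is illustrative):

```python
def row_accuracy(correct: int, incorrect: int, unanswered: int) -> float:
    """Percent of questions answered correctly, counting unanswered ones
    in the denominator (an assumption inferred from the figures shown)."""
    total = correct + incorrect + unanswered
    if total == 0:
        return 0.0  # no questions in this category
    return round(100.0 * correct / total, 1)


print(row_accuracy(11, 9, 1))   # Cardiology: 52.4
print(row_accuracy(7, 14, 1))   # Gastroenterology: 31.8
```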

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average |
|---|---|---|---|---|---|
| Anatomy | 3 | 3 | 0 | 50.0% | 81.1% |
| Biostatistics | 4 | 1 | 0 | 80.0% | 91.3% |
| Diagnosis | 43 | 29 | 1 | 58.9% | 80.0% |
| Epidemiology | 9 | 3 | 0 | 75.0% | 82.1% |
| Ethics | 1 | 0 | 0 | 100.0% | 94.0% |
| Interpretation | 18 | 17 | 2 | 48.6% | 70.5% |
| Pathophysiology | 21 | 12 | 0 | 63.6% | 86.1% |
| Pharmacology | 12 | 13 | 0 | 48.0% | 84.7% |
| Prevention | 9 | 2 | 1 | 75.0% | 90.3% |
| Prognosis | 3 | 4 | 0 | 42.9% | 84.6% |
| Risk | 7 | 6 | 0 | 53.8% | 84.5% |
| Tests | 9 | 10 | 2 | 42.9% | 75.0% |
| Treatment | 40 | 31 | 0 | 56.3% | 82.1% |
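
| # | Answer | Correct | Status |
|---|---|---|---|
| 1 | A | B | |
| 2 | B | D | |
| 3 | D | B | |
| 4 | B | C | |
| 5 | C | C | |
| 6 | D | B | |
| 7 | D | D | |
| 8 | C | C | |
| 9 | C | A | |
| 10 | D | D | |
| 11 | C | D | |
| 12 | A | A | |
| 13 | | C | |
| 14 | C | A | |
| 15 | C | B | |
| 16 | D | A | |
| 17 | A | C | |
| 18 | A | A | |
| 19 | B | B | |
| 20 | C | C | |
| 21 | D | D | |
| 22 | A | B | |
| 23 | A | A | |
| 24 | D | A | |
| 25 | D | C | |
| 26 | B | B | |
| 27 | A | C | |
| 28 | C | A | |
| 29 | B | B | |
| 30 | D | C | |
| 31 | A | D | |
| 32 | C | A | |
| 33 | C | C | |
| 34 | B | B | |
| 35 | D | D | |
| 36 | C | D | |
| 37 | A | A | |
| 38 | A | A | |
| 39 | C | C | |
| 40 | B | B | |
| 41 | B | C | |
| 42 | D | D | |
| 43 | A | A | |
| 44 | D | D | |
| 45 | A | D | |
| 46 | B | B | |
| 47 | C | C | |
| 48 | C | C | |
| 49 | C | B | |
| 50 | B | C | |
| 51 | A | A | |
| 52 | D | D | |
| 53 | A | C | |
| 54 | C | B | |
| 55 | C | C | |
| 56 | D | D | |
| 57 | A | A | |
| 58 | A | A | |
| 59 | A | A | |
| 60 | A | A | |
| 61 | A | A | |
| 62 | C | D | |
| 63 | D | D | |
| 64 | B | | Annulled |
| 65 | C | D | |
| 66 | A | C | |
| 67 | C | B | |
| 68 | B | | Annulled |
| 69 | A | A | |
| 70 | B | B | |
| 71 | B | B | |
| 72 | B | D | |
| 73 | C | B | |
| 74 | B | C | |
| 75 | B | B | |
| 76 | A | A | |
| 77 | A | D | |
| 78 | C | C | |
| 79 | | B | |
| 80 | A | A | |
| 81 | D | C | |
| 82 | D | C | |
| 83 | B | B | |
| 84 | C | C | |
| 85 | A | A | |
| 86 | A | A | |
| 87 | B | B | |
| 88 | D | D | |
| 89 | B | B | |
| 90 | C | A | |
| 91 | B | D | |
| 92 | B | A | |
| 93 | B | C | |
| 94 | B | B | |
| 95 | C | D | |
| 96 | B | B | |
| 97 | A | B | |
| 98 | D | B | |
| 99 | A | A | |
| 100 | B | B | |
| 101 | D | A | |
| 102 | D | D | |
| 103 | B | B | |
| 104 | A | D | |
| 105 | B | B | |
| 106 | D | C | |
| 107 | B | C | |
| 108 | B | B | |
| 109 | D | D | |
| 110 | C | D | |
| 111 | B | B | |
| 112 | C | C | |
| 113 | D | | Annulled |
| 114 | D | D | |
| 115 | D | D | |
| 116 | D | A | |
| 117 | D | D | |
| 118 | A | D | |
| 119 | A | A | |
| 120 | B | C | |
| 121 | A | A | |
| 122 | A | B | |
| 123 | D | D | |
| 124 | D | D | |
| 125 | C | B | |
| 126 | D | D | |
| 127 | A | A | |
| 128 | D | B | |
| 129 | D | D | |
| 130 | D | C | |
| 131 | B | C | |
| 132 | C | D | |
| 133 | C | A | |
| 134 | B | C | |
| 135 | C | A | |
| 136 | A | D | |
| 137 | A | A | |
| 138 | A | C | |
| 139 | B | A | |
| 140 | B | C | |
| 141 | B | B | |
| 142 | C | C | |
| 143 | B | A | |
| 144 | D | D | |
| 145 | C | C | |
| 146 | B | C | |
| 147 | C | C | |
| 148 | A | A | |
| 149 | B | C | |
| 150 | D | D | |
| 151 | A | A | |
| 152 | | A | |
| 153 | D | C | |
| 154 | B | B | |
| 155 | C | D | |
| 156 | C | C | |
| 157 | C | C | |
| 158 | D | D | |
| 159 | D | D | |
| 160 | D | B | |
| 161 | B | B | |
| 162 | B | B | |
| 163 | B | B | |
| 164 | B | B | |
| 165 | A | A | |
| 166 | D | C | |
| 167 | C | A | |
| 168 | B | B | |
| 169 | C | C | |
| 170 | C | A | |
| 171 | D | D | |
| 172 | B | B | |
| 173 | B | A | |
| 174 | B | B | |
| 175 | A | A | |
| 176 | C | C | |
| 177 | C | C | |
| 178 | B | B | |
| 179 | B | C | |
| 180 | B | | Annulled |
| 181 | C | B | |
| 182 | D | D | |
| 183 | B | C | |
| 184 | C | A | |
| 185 | C | C | |
| 186 | D | D | |
| 187 | A | A | |
| 188 | A | C | |
| 189 | A | D | |
| 190 | D | D | |
| 191 | | B | |
| 192 | B | B | |
| 193 | C | C | |
| 194 | C | C | |
| 195 | C | C | |
| 196 | B | B | |
| 197 | C | A | |
| 198 | B | B | |
| 199 | D | D | |
| 200 | B | A | |
| 201 | B | B | |
| 202 | D | D | |
| 203 | D | B | |
| 204 | A | D | |
| 205 | A | D | |
| 206 | C | | Annulled |
| 207 | D | A | |
| 208 | A | A | |
| 209 | A | B | |
| 210 | A | D | |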
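
Each row of the answer table pairs the model's answer with the official key, so the totals can be recomputed by tallying the rows. The sketch below assumes that a row with a single letter is an unanswered question showing only the key, and that an annulled row shows only the model's answer and is excluded from scoring. That reading, the tuple layout, and the function name are all illustrative assumptions, not the benchmark's own code.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# (question number, model answer, official answer, annulled flag)
Row = Tuple[int, Optional[str], Optional[str], bool]


def tally(rows: Iterable[Row]) -> Counter:
    """Count correct, incorrect, unanswered and annulled rows."""
    counts: Counter = Counter()
    for _number, answer, correct, annulled in rows:
        if annulled:
            counts["annulled"] += 1      # excluded from scoring
        elif answer is None:
            counts["unanswered"] += 1    # no model answer recorded
        elif answer == correct:
            counts["correct"] += 1
        else:
            counts["incorrect"] += 1
    return counts


# A few rows taken from the table, encoded under the assumptions above.
sample = [
    (12, "A", "A", False),
    (13, None, "C", False),
    (14, "C", "A", False),
    (64, "B", None, True),
]
print(tally(sample))
```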