MedicalBenchmark

Goliath 120B

#284 of 319 models, MIR 2025

Net score: 71.00 pts
Accuracy: 49.5%
Correct / Incorrect: 99 / 84
Total Cost: $1.26
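
The net score is consistent with the negative marking used in the MIR exam, where a correct answer earns one point, a wrong answer costs a third of a point, and an unanswered question scores zero. The page does not state the formula, so the sketch below is an assumption checked only against the figures shown here; `net_score` is a hypothetical helper name.

```python
from fractions import Fraction


def net_score(correct: int, incorrect: int,
              penalty: Fraction = Fraction(1, 3)) -> float:
    """Net points under negative marking.

    Assumed rule (not stated on the page): +1 per correct answer,
    -penalty per incorrect answer, 0 for unanswered questions.
    Fraction keeps the one-third penalty exact until the final conversion.
    """
    return float(correct - incorrect * penalty)


# Counts reported for Goliath 120B on this page.
print(f"{net_score(99, 84):.2f} pts")  # 71.00 pts
```

With 99 correct and 84 incorrect answers this gives 99 - 84/3 = 71.00, matching the reported net score.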

Overall Performance (vs. average)

| Metric | Goliath 120B | Average |
| --- | --- | --- |
| Accuracy | 49.5% | 77.9% |
| Net score | 71.00 pts | 143.96 pts |
| Correct | 99 | 156 |
| Incorrect | 84 | 35 |
| Total Cost | $1.26 | $3.36 |
| Average response time | 24.3s | 19.0s |
| Output Tokens | 105K | 430K |
| Reasoning Tokens | 0 | 306K |
| Average confidence | 88.6% | 95.2% |

Subject Breakdown

| Subject | Correct | Incorrect | Unanswered | Accuracy | Average |
| --- | --- | --- | --- | --- | --- |
| Allergology | 1 | 2 | 1 | 25.0% | 87.9% |
| Anesthesiology and Resuscitation | 3 | 3 | 0 | 50.0% | 82.3% |
| Cardiology | 10 | 10 | 2 | 45.5% | 78.6% |
| Dermatology | 7 | 4 | 1 | 58.3% | 69.4% |
| Endocrinology and Nutrition | 10 | 4 | 2 | 62.5% | 83.5% |
| ENT | 4 | 3 | 1 | 50.0% | 74.8% |
| Epidemiology | 2 | 5 | 0 | 28.6% | 69.1% |
| Gastroenterology | 9 | 9 | 3 | 42.9% | 74.1% |
| Genetics | 2 | 4 | 0 | 33.3% | 69.5% |
| Geriatrics | 4 | 6 | 1 | 36.4% | 77.5% |
| Gynecology and Obstetrics | 15 | 3 | 1 | 78.9% | 86.7% |
| Health Planning and Management | 0 | 2 | 0 | 0.0% | 82.6% |
| Hematology | 4 | 4 | 3 | 36.4% | 82.7% |
| Immunology | 4 | 5 | 0 | 44.4% | 83.3% |
| Infectious Diseases | 13 | 11 | 3 | 48.1% | 74.9% |
| Legal Medicine and Bioethics | 2 | 3 | 0 | 40.0% | 68.4% |
| Medical Oncology | 15 | 6 | 4 | 60.0% | 87.2% |
| Nephrology | 8 | 5 | 1 | 57.1% | 84.8% |
| Neurology | 9 | 10 | 1 | 45.0% | 77.3% |
| Ophthalmology | 2 | 1 | 2 | 40.0% | 74.2% |
| Palliative Care | 3 | 1 | 0 | 75.0% | 78.6% |
| Pediatrics | 13 | 11 | 2 | 50.0% | 71.9% |
| Pharmacology | 11 | 6 | 0 | 64.7% | 74.1% |
| Psychiatry | 4 | 4 | 0 | 50.0% | 83.0% |
| Pulmonology | 9 | 4 | 1 | 64.3% | 80.4% |
| Radiology-Emergency | 4 | 9 | 1 | 28.6% | 69.4% |
| Rheumatology | 6 | 9 | 0 | 40.0% | 76.6% |
| Statistics | 1 | 2 | 0 | 33.3% | 76.6% |
| Traumatology | 9 | 7 | 2 | 50.0% | 79.3% |
| Urology | 5 | 2 | 0 | 71.4% | 80.7% |
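
Each subject's Accuracy figure appears to count unanswered questions in the denominator: Cardiology's 10 correct answers out of 10 + 10 + 2 = 22 questions give 45.5%. A short sketch under that assumption, using three rows from the breakdown above (`accuracy_pct` is a hypothetical helper name, not something the benchmark publishes):

```python
def accuracy_pct(correct: int, incorrect: int, unanswered: int) -> float:
    """Share of all questions in a category answered correctly, in percent.

    Assumption inferred from the figures on this page: unanswered
    questions count toward the total.
    """
    total = correct + incorrect + unanswered
    if total == 0:
        raise ValueError("category has no questions")
    return round(100 * correct / total, 1)


# (correct, incorrect, unanswered) copied from the subject breakdown.
subjects = {
    "Allergology": (1, 2, 1),
    "Cardiology": (10, 10, 2),
    "Gynecology and Obstetrics": (15, 3, 1),
}

for name, counts in subjects.items():
    print(f"{name}: {accuracy_pct(*counts)}%")
# Allergology: 25.0%
# Cardiology: 45.5%
# Gynecology and Obstetrics: 78.9%
```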

Question Type Breakdown

| Question type | Correct | Incorrect | Unanswered | Accuracy | Average |
| --- | --- | --- | --- | --- | --- |
| Anatomy | 4 | 2 | 1 | 57.1% | 78.6% |
| Biostatistics | 1 | 3 | 0 | 25.0% | 79.8% |
| Diagnosis | 47 | 36 | 5 | 53.4% | 79.9% |
| Epidemiology | 1 | 4 | 0 | 20.0% | 76.7% |
| Ethics | 1 | 2 | 0 | 33.3% | 74.1% |
| Interpretation | 13 | 20 | 9 | 31.0% | 70.7% |
| Legal | 1 | 3 | 0 | 25.0% | 64.6% |
| Pathophysiology | 11 | 14 | 2 | 40.7% | 76.1% |
| Pharmacology | 9 | 4 | 0 | 69.2% | 83.3% |
| Prevention | 9 | 2 | 1 | 75.0% | 75.6% |
| Prognosis | 3 | 3 | 1 | 42.9% | 80.8% |
| Risk | 2 | 2 | 1 | 40.0% | 85.2% |
| Tests | 14 | 11 | 2 | 51.9% | 77.9% |
| Treatment | 44 | 29 | 8 | 54.3% | 77.3% |
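
| # | Answer | Correct | Status |
| --- | --- | --- | --- |
| 1 | B | B | |
| 2 | | A | |
| 3 | C | C | |
| 4 | A | B | |
| 5 | A | A | |
| 6 | C | C | |
| 7 | B | C | |
| 8 | | A | |
| 9 | C | A | |
| 10 | | D | |
| 11 | D | D | |
| 12 | D | D | |
| 13 | B | B | |
| 14 | D | D | |
| 15 | B | | Annulled |
| 16 | C | B | |
| 17 | | B | |
| 18 | | A | |
| 19 | B | C | |
| 20 | B | A | |
| 21 | C | B | |
| 22 | B | D | |
| 23 | A | C | |
| 24 | D | D | |
| 25 | | C | |
| 26 | B | | Annulled |
| 27 | D | C | |
| 28 | D | | Annulled |
| 29 | C | D | |
| 30 | B | B | |
| 31 | C | D | |
| 32 | A | A | |
| 33 | D | D | |
| 34 | A | D | |
| 35 | B | B | |
| 36 | C | D | |
| 37 | D | C | |
| 38 | B | C | |
| 39 | C | D | |
| 40 | B | A | |
| 41 | D | D | |
| 42 | B | C | |
| 43 | B | B | |
| 44 | D | D | |
| 45 | B | D | |
| 46 | A | A | |
| 47 | D | A | |
| 48 | A | A | |
| 49 | D | D | |
| 50 | B | B | |
| 51 | D | C | |
| 52 | | B | |
| 53 | D | D | |
| 54 | D | B | |
| 55 | D | A | |
| 56 | B | | Annulled |
| 57 | C | C | |
| 58 | B | B | |
| 59 | D | D | |
| 60 | A | A | |
| 61 | B | A | |
| 62 | D | D | |
| 63 | B | B | |
| 64 | D | D | |
| 65 | B | A | |
| 66 | A | A | |
| 67 | A | B | |
| 68 | B | B | |
| 69 | A | B | |
| 70 | C | A | |
| 71 | D | D | |
| 72 | C | A | |
| 73 | C | D | |
| 74 | D | C | |
| 75 | A | A | |
| 76 | B | B | |
| 77 | B | B | |
| 78 | B | B | |
| 79 | D | C | |
| 80 | | C | |
| 81 | B | C | |
| 82 | A | D | |
| 83 | A | B | |
| 84 | D | D | |
| 85 | D | C | |
| 86 | A | C | |
| 87 | D | A | |
| 88 | D | D | |
| 89 | B | B | |
| 90 | D | A | |
| 91 | B | B | |
| 92 | D | C | |
| 93 | B | B | |
| 94 | C | C | |
| 95 | D | A | |
| 96 | C | C | |
| 97 | D | D | |
| 98 | D | C | |
| 99 | B | A | |
| 100 | B | C | |
| 101 | D | B | |
| 102 | A | D | |
| 103 | C | A | |
| 104 | | C | |
| 105 | C | A | |
| 106 | A | C | |
| 107 | B | B | |
| 108 | D | D | |
| 109 | B | B | |
| 110 | C | C | |
| 111 | A | A | |
| 112 | C | C | |
| 113 | B | B | |
| 114 | D | D | |
| 115 | | D | |
| 116 | | C | |
| 117 | D | A | |
| 118 | D | D | |
| 119 | C | C | |
| 120 | D | B | |
| 121 | A | D | |
| 122 | C | C | |
| 123 | C | C | |
| 124 | C | C | |
| 125 | B | D | |
| 126 | B | D | |
| 127 | B | B | |
| 128 | D | D | |
| 129 | A | A | |
| 130 | D | D | |
| 131 | B | D | |
| 132 | | A | |
| 133 | B | B | |
| 134 | C | C | |
| 135 | B | B | |
| 136 | C | C | |
| 137 | B | A | |
| 138 | A | D | |
| 139 | D | D | |
| 140 | B | B | |
| 141 | B | A | |
| 142 | | A | |
| 143 | D | B | |
| 144 | | B | |
| 145 | D | D | |
| 146 | C | C | |
| 147 | | B | |
| 148 | D | A | |
| 149 | C | A | |
| 150 | A | D | |
| 151 | A | A | |
| 152 | B | A | |
| 153 | B | B | |
| 154 | B | B | |
| 155 | | B | |
| 156 | C | C | |
| 157 | A | A | |
| 158 | D | C | |
| 159 | B | C | |
| 160 | A | A | |
| 161 | D | A | |
| 162 | | | Annulled |
| 163 | B | D | |
| 164 | C | C | |
| 165 | C | A | |
| 166 | C | B | |
| 167 | C | C | |
| 168 | D | D | |
| 169 | D | B | |
| 170 | B | B | |
| 171 | C | C | |
| 172 | B | A | |
| 173 | D | A | |
| 174 | D | B | |
| 175 | C | B | |
| 176 | C | C | |
| 177 | B | C | |
| 178 | B | A | |
| 179 | D | D | |
| 180 | A | A | |
| 181 | A | B | |
| 182 | C | C | |
| 183 | | B | |
| 184 | B | B | |
| 185 | D | B | |
| 186 | A | | Annulled |
| 187 | C | C | |
| 188 | D | D | |
| 189 | D | D | |
| 190 | B | A | |
| 191 | B | B | |
| 192 | A | A | |
| 193 | C | C | |
| 194 | D | A | |
| 195 | C | A | |
| 196 | A | A | |
| 197 | B | B | |
| 198 | C | C | |
| 199 | D | D | |
| 200 | C | C | |
| 201 | B | B | |
| 202 | B | A | |
| 203 | D | D | |
| 204 | C | C | |
| 205 | B | B | |
| 206 | C | D | |
| 207 | A | A | |
| 208 | A | C | |
| 209 | A | C | |
| 210 | | B | |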
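
The headline counts can be cross-checked against the answer list. The sketch below rests on two assumptions that the page does not spell out: a row with no model answer is unanswered, and annulled questions are left out of scoring. The `classify` helper and the tuple layout are illustrative only.

```python
from collections import Counter
from typing import Optional


def classify(answer: Optional[str], correct: Optional[str],
             annulled: bool = False) -> str:
    """Label one question (assumed rules, see the lead-in)."""
    if annulled:
        return "annulled"      # excluded from scoring
    if not answer:
        return "unanswered"    # the model gave no option
    return "correct" if answer == correct else "incorrect"


# (question, model answer, correct answer, annulled) for a few rows from
# the list: a single letter on a scored row is read as the correct option,
# and on an annulled row as the model's answer (both are assumptions).
sample = [
    (1, "B", "B", False),
    (2, None, "A", False),
    (3, "C", "C", False),
    (4, "A", "B", False),
    (5, "A", "A", False),
    (15, "B", None, True),
]

tally = Counter(classify(a, c, x) for _, a, c, x in sample)
print(dict(tally))
```

Running the same tally over every scored row would give the Correct and Incorrect totals that feed the net score.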