Welcome to MedBench: The Largest Medical Benchmark in Spanish

Introducing MedBench, a platform to evaluate language models in the medical field using questions from the MIR exam.

By the MedBench Team | January 23, 2024 | 2 min read
Tags: announcement, benchmark, MIR, medical AI

Introduction

We are pleased to present MedBench, the largest medical benchmark in Spanish: a platform for evaluating artificial intelligence models on real questions from Spain's MIR (Médico Interno Residente) exam.

Why MedBench?

Evaluating language models in the medical field presents unique challenges:

  • Critical precision: In medicine, errors can have serious consequences
  • Specialized knowledge: Deep understanding of multiple specialties is required
  • Clinical reasoning: Memorization is not enough; a model must be able to apply what it knows

Key Features

MIR Questions

We use official MIR exam questions, which guarantees:

  1. Clinical quality and relevance
  2. Coverage of all medical specialties
  3. Different difficulty levels
  4. Constant updates with new exam editions

Detailed Metrics

We evaluate each model across multiple dimensions:

  • Overall accuracy: Percentage of correct answers
  • Net score: Correct answers minus a penalty for incorrect answers
  • Specialty breakdown: Performance in each medical area
  • Confidence level: Model certainty in its responses
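
As a rough illustration of the first three metrics, here is a minimal Python sketch. It is not the MedBench implementation: the `Result` record, the `summarize` function, and the `WRONG_PENALTY` value of one third of a point per wrong answer are all assumptions made for this example, since this post does not state the exact penalty. Blank answers are assumed to earn no credit and no penalty. Confidence level is left out because the post does not say how it is measured.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Assumption: each wrong answer costs this fraction of a correct one.
# The post does not give the real value; change it to match the actual rules.
WRONG_PENALTY = 1 / 3


@dataclass
class Result:
    """One graded question (hypothetical record layout)."""
    specialty: str               # e.g. "Cardiology"
    correct_option: int          # option given by the answer key
    model_option: Optional[int]  # option chosen by the model, None if blank


def summarize(results, penalty=WRONG_PENALTY):
    """Return overall accuracy, net score, and accuracy per specialty."""
    n_correct = 0
    n_wrong = 0
    per_specialty = defaultdict(lambda: [0, 0])  # specialty -> [correct, total]

    for r in results:
        per_specialty[r.specialty][1] += 1
        if r.model_option is None:
            continue  # blank answer: no credit, no penalty (assumption)
        if r.model_option == r.correct_option:
            n_correct += 1
            per_specialty[r.specialty][0] += 1
        else:
            n_wrong += 1

    total = len(results)
    return {
        "accuracy": n_correct / total if total else 0.0,
        "net_score": n_correct - penalty * n_wrong,
        "by_specialty": {s: c / t for s, (c, t) in per_specialty.items()},
    }
```

For example, a model that answers two of four questions correctly, misses one, and leaves one blank gets an accuracy of 0.5 and, under the assumed penalty, a net score of 2 - 1/3. Keeping the penalty as a parameter makes it easy to compare rankings under different scoring rules.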

Next Steps

We are working on:

  • Expanding the question set
  • Adding more models to the ranking
  • Implementing comparative analyses
  • Developing tools for researchers

Join the Community

If you are a researcher, developer, or medical professional interested in AI applied to health, we invite you to join the community.

Thank you for your interest in MedBench!