ALMA
Verified Results
ALMA has been evaluated across three consecutive MIR exam editions with perfect results verified by MedicalBenchmark.
600/600
Correct answers
Out of all valid questions in MIR 2024, 2025 and 2026
100%
Total accuracy
Zero errors across three consecutive editions
3 years
Consecutive MIR exams
Sustained perfect performance in 2024, 2025 and 2026
~$10.50
Cost per exam
Average processing cost per full exam edition
~53s
Per question
Average response time including full reasoning
~32
Specialized experts
Medical domain agents in the Agentic RAG system
99.8%
Confidence level
Statistical reliability of the evaluation system
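The headline figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in this section; the even split of questions across editions and the one-at-a-time processing are assumptions made for illustration, not published details.

```python
# Figures quoted in the results above (cost and time are approximate).
CORRECT_ANSWERS = 600
VALID_QUESTIONS = 600
EDITIONS = 3
COST_PER_EXAM_USD = 10.50
SECONDS_PER_QUESTION = 53

accuracy = CORRECT_ANSWERS / VALID_QUESTIONS

# Assumption: valid questions are spread evenly across the three editions.
questions_per_exam = VALID_QUESTIONS / EDITIONS
cost_per_question = COST_PER_EXAM_USD / questions_per_exam

# Assumption: questions are answered one after another, not in parallel.
sequential_hours = questions_per_exam * SECONDS_PER_QUESTION / 3600

print(f"accuracy: {accuracy:.0%}")
print(f"cost per question: ~${cost_per_question:.4f}")
print(f"sequential time per exam: ~{sequential_hours:.1f} h")
```

Under those assumptions, one edition works out to roughly five cents per question and a little under three hours of sequential processing.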
Agentic RAG Architecture
ALMA uses an intelligent orchestrator that coordinates multiple specialized agents to answer medical questions with maximum accuracy. Unlike conventional RAG, the system iterates and validates before responding.
Iterative querying
The orchestrator runs multiple query rounds against the corpus, refining the search until it finds the most relevant evidence.
Specialized experts
Approximately 32 domain agents cover all MIR medical specialties, from cardiology to psychiatry.
Synthetic corpus
Knowledge base built from Editorial Medica Panamericana's reference bibliography, processed and optimized for RAG.
English reasoning
The system reasons internally in English to maximize base model performance and responds in the question's language.
Intelligent sub-delegation
Experts can delegate sub-queries to other specialists when a question crosses specialty boundaries, creating dynamic knowledge networks.
Multimodal support
Processing of clinical images (X-rays, ECGs, dermatological photographs) within each expert agent's specialized context.
The central orchestrator is Claude Sonnet 4.5 with extended reasoning, running on Amazon Bedrock in the Aragon region (Spain).
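The sub-delegation idea described above can be illustrated with a small sketch. Everything here is hypothetical: the `Expert` class, the keyword matching and the `peers` links are stand-ins for agents that, in a real system, would call a language model and the corpus. The point is only to show how one specialist can hand a question to another without looping forever.

```python
from dataclasses import dataclass, field


@dataclass
class Expert:
    """Hypothetical domain agent; a real one would call an LLM and the corpus."""
    specialty: str
    keywords: set[str]
    peers: dict[str, "Expert"] = field(default_factory=dict)

    def answer(self, question: str, visited: frozenset[str] = frozenset()) -> list[str]:
        """Return findings, delegating to peers whose keywords appear in the question."""
        visited = visited | {self.specialty}
        findings = [f"{self.specialty}: reviewed question"]
        words = set(question.lower().split())
        for name, peer in self.peers.items():
            # Delegate only to specialists not yet consulted, which avoids cycles.
            if name not in visited and peer.keywords & words:
                sub_findings = peer.answer(question, visited)
                findings.extend(sub_findings)
                visited = visited | {f.split(":")[0] for f in sub_findings}
        return findings


cardio = Expert("cardiology", {"chest", "infarction"})
psych = Expert("psychiatry", {"depression", "anxiety"})
cardio.peers["psychiatry"] = psych
psych.peers["cardiology"] = cardio

findings = cardio.answer("chest pain and depression after infarction")
print(findings)
```

Passing the set of already-consulted specialties down each call is one simple way to keep a network of mutually linked experts from delegating back and forth.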
Processing Flow
Multilingual Reasoning Pipeline
Current LLMs tend to have richer internal representations in English, so ALMA forces internal reasoning in English to maximize accuracy while always responding in the question's original language.
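One common way to get this behavior is through the system instruction. The sketch below is an assumption about how such a prompt could be assembled, not ALMA's actual prompt; `build_prompt` and its parameters are hypothetical names, and the answer language is passed in explicitly rather than detected.

```python
def build_prompt(question: str, answer_language: str) -> dict[str, str]:
    """Build a request that reasons in English but answers in another language."""
    system = (
        "Reason step by step in English. "
        f"Write the final answer in {answer_language}."
    )
    # The question itself is passed through untranslated.
    return {"system": system, "user": question}


prompt = build_prompt("¿Cuál es el diagnóstico más probable?", "Spanish")
print(prompt["system"])
```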
How It Works
ALMA's process for answering a medical question follows a structured five-step flow.
Question reception
The orchestrator receives the MIR question with its answer options and analyzes the clinical context.
Analysis and planning
Relevant medical specialties are identified and appropriate expert agents are selected.
Corpus querying
Selected agents query Panamericana's synthetic medical corpus to obtain clinical evidence.
Iteration and validation
The orchestrator evaluates collected evidence and, if insufficient, launches additional query rounds.
Synthesis and response
Evidence is synthesized into structured reasoning and the answer with the strongest clinical support is selected.
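The five steps above can be sketched as a small loop. This is a toy illustration under loud assumptions: the corpus is a two-entry dictionary with invented snippets, and plain keyword overlap stands in for both retrieval and the model's clinical judgment. The names `CORPUS`, `plan`, `query` and `answer` are hypothetical.

```python
from dataclasses import dataclass

# Toy corpus: specialty -> evidence snippets (invented for illustration).
CORPUS = {
    "cardiology": ["st elevation suggests myocardial infarction"],
    "pulmonology": ["pleuritic pain suggests pulmonary embolism"],
}


@dataclass
class Question:  # Step 1: the question arrives with its answer options.
    stem: str
    options: list[str]


def plan(question: Question) -> list[str]:
    """Step 2: pick specialties whose snippets share a word with the stem."""
    words = set(question.stem.lower().split())
    return [s for s, docs in CORPUS.items()
            if any(words & set(d.split()) for d in docs)]


def query(specialty: str, terms: set[str]) -> list[str]:
    """Step 3: return the specialty's snippets that match any search term."""
    return [d for d in CORPUS[specialty] if terms & set(d.split())]


def answer(question: Question, max_rounds: int = 3, min_evidence: int = 1) -> str:
    specialties = plan(question)
    terms = set(question.stem.lower().split())
    evidence: list[str] = []
    for _ in range(max_rounds):  # Step 4: iterate until evidence suffices.
        for specialty in specialties:
            for doc in query(specialty, terms):
                if doc not in evidence:
                    evidence.append(doc)
        if len(evidence) >= min_evidence:
            break
        # Not enough evidence: widen the search for the next round.
        specialties = list(CORPUS)
        terms |= {w for o in question.options for w in o.lower().split()}

    def support(option: str) -> int:
        option_words = set(option.lower().split())
        return sum(len(option_words & set(d.split())) for d in evidence)

    # Step 5: choose the option with the most supporting evidence.
    return max(question.options, key=support)


q = Question("Patient with st elevation on ECG",
             ["Pulmonary embolism", "Myocardial infarction", "Pericarditis"])
print(answer(q))
```

The loop only launches another round when the evidence collected so far falls below a threshold, which mirrors the iterate-then-validate behavior described in step four.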
Technical Innovations
Beyond the general architecture, ALMA incorporates key innovations that contribute to its exceptional performance.
Optimized synthetic corpus
Original medical documents pass through a pipeline that extracts relevant information, eliminates redundancy, restructures the content for LLM efficiency, and enriches it with cross-specialty relationships.
Incremental updates
A system based on Recursive Language Models (RLM) updates the corpus without rebuilding it, detecting obsolete fragments and integrating new information while maintaining coherence.
Memory tree with sub-delegation
The orchestrator maintains a context tree where each branch corresponds to an expert. Sub-queries inherit relevant context without duplicating tokens, optimizing cost and speed.
Agentic RAG vs Fine-tuning
Unlike fine-tuning, which statically modifies model weights, Agentic RAG dynamically queries updated information, enabling continuous improvement without retraining.
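The memory tree described above can be pictured as nodes that store only their own notes and look up everything else through their parent. The sketch below is a minimal illustration of that idea, assuming a simple parent-pointer design; `ContextNode`, `delegate` and `context` are hypothetical names, not ALMA's implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContextNode:
    """One branch of the tree; holds only the notes added at this level."""
    name: str
    notes: list[str] = field(default_factory=list)
    parent: Optional["ContextNode"] = field(default=None, repr=False)
    children: list["ContextNode"] = field(default_factory=list)

    def delegate(self, name: str) -> "ContextNode":
        """Open a child branch for a sub-query to another expert."""
        child = ContextNode(name=name, parent=self)
        self.children.append(child)
        return child

    def context(self) -> list[str]:
        """Resolve inherited context by walking up to the root; nothing is copied into the child."""
        inherited = self.parent.context() if self.parent else []
        return inherited + self.notes


root = ContextNode("orchestrator", ["Q: 54-year-old with chest pain"])
cardio = root.delegate("cardiology")
cardio.notes.append("ECG shows ST elevation")
psych = cardio.delegate("psychiatry")

print(psych.context())
```

Each note lives in exactly one node, so a deep sub-query sees its ancestors' context without the tree storing it more than once.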
Data Sovereignty
ALMA is designed to meet the highest privacy and data sovereignty standards in the European healthcare sector.
EU processing
All processing runs on Amazon Bedrock in the Aragon region (Spain), ensuring data never leaves the EU.
No provider access
Anthropic has no access to processed data. Amazon Bedrock guarantees complete provider isolation.
GDPR compliance
Designed to comply with the General Data Protection Regulation and European healthcare regulations.
AI Act ready
Architecture aligned with European AI Act requirements for high-risk systems.
ALMA is currently in production at CATSalut (Catalan Health Service), helping healthcare professionals in real clinical environments.
Explore ALMA's results
Check ALMA's detailed performance on each MIR edition, or contact us for more information.