Phase 2: Clinical Task Evaluation
Duration: October - November 2025
🚧 Information Coming Soon 🚧
Detailed information about Phase 2 evaluation tasks and procedures will be made available after the competition launch. This maintains the integrity of the competition by preventing participants from specifically optimizing their models for known evaluation tasks.
What We Can Share
Evaluation Framework
- Multi-institutional evaluation across leading cancer centers
- 23 clinically relevant downstream tasks
- Independent assessment ensuring fair comparison
- Standardized evaluation protocols across all submissions
Participating Institutions
The evaluation will be conducted by top-ranked cancer centers:
- Memorial Sloan Kettering Cancer Center (Lead Institution)
- Icahn School of Medicine at Mount Sinai
- State University of New York at Stony Brook & University of North Carolina
- MD Anderson Cancer Center
- Mayo Clinic
Together, these include 4 of the 8 top-ranked cancer centers in the United States and 3 of the 6 top-ranked cancer centers globally.
Task Categories
Phase 2 will evaluate foundation models across diverse clinical scenarios:
- Biomarker Prediction: Molecular marker detection from histology
- Cancer Subtyping: Classification of cancer subtypes
- Survival Prediction: Time-to-event analysis for patient prognosis
- Image Retrieval: Content-based search across large pathology archives
Evaluation Approach
- Complete Data Separation: No patient overlap between Phase 1 and Phase 2 datasets
- Standardized Protocols: Consistent tile sampling and assessment procedures
- Independent Evaluation: Each institution evaluates models on their tasks locally
- Fair Comparison: Attention-Based Multiple Instance Learning (ABMIL) used for standardized MIL aggregation wherever required (a minimal sketch follows this list)
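For illustration only, here is a minimal PyTorch sketch of ABMIL attention pooling (Ilse et al., 2018) over tile embeddings. The class name, embedding dimension, attention dimension, and number of classes are assumptions for the example, not the competition's official configuration:

```python
import torch
import torch.nn as nn

class ABMIL(nn.Module):
    """Minimal attention-based MIL pooling (Ilse et al., 2018).

    Aggregates a bag of tile-level embeddings into one slide-level
    embedding via learned attention weights, then classifies.
    All dimensions here are illustrative assumptions.
    """

    def __init__(self, embed_dim: int = 768, attn_dim: int = 128, n_classes: int = 2):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(embed_dim, attn_dim),
            nn.Tanh(),
            nn.Linear(attn_dim, 1),
        )
        self.classifier = nn.Linear(embed_dim, n_classes)

    def forward(self, tiles: torch.Tensor) -> torch.Tensor:
        # tiles: (n_tiles, embed_dim) -- one bag of tile embeddings per slide
        scores = self.attention(tiles)                   # (n_tiles, 1)
        weights = torch.softmax(scores, dim=0)           # attention over tiles
        slide_embedding = (weights * tiles).sum(dim=0)   # (embed_dim,)
        return self.classifier(slide_embedding)          # slide-level logits

# Example: a slide represented by 500 tiles with 768-d embeddings
logits = ABMIL()(torch.randn(500, 768))
```

The appeal of ABMIL for standardized evaluation is that the aggregator is small and identical across submissions, so differences in downstream performance are attributable to the foundation-model embeddings rather than the aggregation head.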
Gateway Tasks
If more than 20 foundation models are submitted, the top 20 performers on three designated gateway tasks will be prioritized for comprehensive evaluation to meet the NeurIPS 2025 timeline. All submitted models will eventually receive complete evaluation for post-competition publication.
Ranking Methodology
- Task-specific performance metrics (AUROC, Pearson correlation, concordance index)
- Borda count aggregation across all tasks (see the sketch after this list)
- Statistical significance assessment using bootstrap resampling
- A comprehensive leaderboard will be released at the end of Phase 2
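To make the Borda count aggregation concrete, here is a minimal NumPy sketch. The model names, task names, and scores are invented for illustration; tie handling and the bootstrap significance assessment are omitted:

```python
import numpy as np

# Hypothetical per-task scores for three models (higher is better);
# real tasks would use metrics such as AUROC, Pearson r, or concordance index.
models = ["model_A", "model_B", "model_C"]
scores = {
    "biomarker_auroc": [0.81, 0.78, 0.84],
    "subtype_auroc":   [0.90, 0.92, 0.88],
    "survival_cindex": [0.67, 0.70, 0.66],
}

n = len(models)
borda = np.zeros(n)
for task_scores in scores.values():
    # Rank models on this task: best score earns n-1 points, worst earns 0.
    ranks = np.argsort(np.argsort(task_scores))
    borda += ranks

# Higher total Borda score = better overall rank across tasks.
ranking = sorted(zip(models, borda), key=lambda x: -x[1])
print(ranking)
```

Because Borda count operates on per-task ranks rather than raw metric values, it lets tasks with different metrics (AUROC, correlation, concordance index) contribute equally to the final ordering.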
Data Privacy and Security
- Phase 2 evaluation datasets will never be released to participants
- Maintains complete data privacy and security
- Ensures no possibility of data leakage or unfair advantage
- Evaluation conducted entirely by organizing institutions
Timeline
| Date | Activity |
|---|---|
| October 15, 2025 | Foundation model submission deadline |
| October 16-30, 2025 | Model validation and setup |
| November 1-30, 2025 | Multi-institutional evaluation |
| December 2025 | Results announcement at NeurIPS |
Post-Competition Publication
All participating teams will:
- Receive co-authorship on the post-competition journal paper
- Have their foundation models publicly released with the publication
- Benefit from perpetual model attribution and citation
- Be part of the comprehensive benchmarking study
Questions about Phase 2?
Contact: slcpfm2025@gmail.com
Stay Updated: Detailed Phase 2 information will be posted here once the competition launches.