For each FAISS score range, how often did the model's top-5 candidates include the correct answer? A well-calibrated model shows higher match rates at higher scores.
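The calculation behind this view can be sketched as follows. This is a minimal illustration rather than the dashboard's actual implementation: the function name `match_rate_by_score_bin`, the `(score, hit)` record shape, and the bin edges are all assumptions, and it assumes the FAISS score is a similarity where higher means closer.

```python
from collections import defaultdict

def match_rate_by_score_bin(records, bin_edges):
    """Return {(low, high): match_rate} for each non-empty score bin.

    records   -- iterable of (score, hit) pairs; hit is True when the correct
                 answer was among the top-5 candidates (hypothetical shape)
    bin_edges -- ascending list of edges; each bin is half-open: [low, high)
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    bins = list(zip(bin_edges, bin_edges[1:]))
    for score, hit in records:
        for low, high in bins:
            if low <= score < high:
                totals[(low, high)] += 1
                hits[(low, high)] += int(hit)
                break  # scores outside every bin are ignored
    return {b: hits[b] / totals[b] for b in sorted(totals)}

# Illustrative data only, not real results.
example = [(0.95, True), (0.55, False), (0.52, True), (0.75, True), (0.72, False)]
rates = match_rate_by_score_bin(example, [0.5, 0.7, 0.9, 1.01])
```

If the match rates rise from the lowest bin to the highest, the scores are behaving as a useful confidence signal; a flat or reversed trend suggests the score should not be used as a threshold.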
Each time the model is retrained on new confirmed labels, a new version is created here. All tasks retain the version tag of the model that generated them.
| Version | Tasks | Match rate | Avg score | Period | Status |
|---|---|---|---|---|---|