2.4. Benchmarking Resources
Dueben et al. [2022] divide benchmark datasets into scientific benchmarks and competition benchmarks. The paper is a thorough resource for readers interested in creating new benchmarks for their own domain.
Here are some examples of benchmark datasets, which are also listed in the notebook:
ImageNet in computer vision [Deng et al., 2009]
WeatherBench in meteorology [Rasp et al., 2020]
ChestX-ray8 in medical imaging [Wang et al., 2017]
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: a large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, June 2009. URL: https://doi.org/10.1109/cvpr.2009.5206848, doi:10.1109/cvpr.2009.5206848.
Peter D. Dueben, Martin G. Schultz, Matthew Chantry, David John Gagne, David Matthew Hall, and Amy McGovern. Challenges and benchmark datasets for machine learning in the atmospheric sciences: definition, status, and outlook. Artificial Intelligence for the Earth Systems, July 2022. URL: https://doi.org/10.1175/aies-d-21-0002.1, doi:10.1175/aies-d-21-0002.1.
Stephan Rasp, Peter D. Dueben, Sebastian Scher, Jonathan A. Weyn, Soukayna Mouatadid, and Nils Thuerey. WeatherBench: a benchmark data set for data-driven weather forecasting. Journal of Advances in Modeling Earth Systems, November 2020. URL: https://doi.org/10.1029/2020ms002203, doi:10.1029/2020ms002203.
Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, and Ronald M. Summers. ChestX-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, July 2017. URL: https://doi.org/10.1109/cvpr.2017.369, doi:10.1109/cvpr.2017.369.