2.4. Benchmarking Resources

Dueben et al. [2022] distinguish two kinds of benchmark datasets: scientific benchmarks and competition benchmarks. The paper treats the topic thoroughly and is a good starting point for anyone who wants to create a new benchmark for their domain.

Examples of benchmark datasets, also given in the notebook, include ImageNet [Deng et al., 2009], WeatherBench [Rasp et al., 2020], and ChestX-ray8 [Wang et al., 2017].

2.4.1. Bibliography

[DDS+09]

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: a large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, June 2009. URL: https://doi.org/10.1109/cvpr.2009.5206848, doi:10.1109/cvpr.2009.5206848.

[DSC+22]

Peter D. Dueben, Martin G. Schultz, Matthew Chantry, David John Gagne, David Matthew Hall, and Amy McGovern. Challenges and benchmark datasets for machine learning in the atmospheric sciences: definition, status, and outlook. Artificial Intelligence for the Earth Systems, July 2022. URL: https://doi.org/10.1175/aies-d-21-0002.1, doi:10.1175/aies-d-21-0002.1.

[RDS+20]

Stephan Rasp, Peter D. Dueben, Sebastian Scher, Jonathan A. Weyn, Soukayna Mouatadid, and Nils Thuerey. WeatherBench: a benchmark data set for data-driven weather forecasting. Journal of Advances in Modeling Earth Systems, November 2020. URL: https://doi.org/10.1029/2020ms002203, doi:10.1029/2020ms002203.

[WPL+17]

Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, and Ronald M. Summers. ChestX-ray8: hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, July 2017. URL: https://doi.org/10.1109/cvpr.2017.369, doi:10.1109/cvpr.2017.369.