5.4. Interpretability Resources

Machine learning interpretability is also known as Explainable AI (XAI). For a deep dive into ML interpretability, see the book by Molnar [2022].

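A few tools are available, namely Alibi Explain (Klaise et al. [2021]), Captum (Kokhlikyan et al. [2020]), SHAP (Lundberg and Lee [2017]), scikit-learn (Pedregosa et al. [2011]), and the Deep Visualization Toolbox (Yosinski et al. [2015]).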

5.4.1. Bibliography


Janis Klaise, Arnaud Van Looveren, Giovanni Vacanti, and Alexandru Coca. Alibi Explain: algorithms for explaining machine learning models. Journal of Machine Learning Research, 22(181):1–7, 2021. URL: http://jmlr.org/papers/v22/21-0017.html.


Narine Kokhlikyan, Vivek Miglani, Miguel Martin, Edward Wang, Bilal Alsallakh, Jonathan Reynolds, Alexander Melnikov, Natalia Kliushkina, Carlos Araya, Siqi Yan, and Orion Reblitz-Richardson. Captum: a unified and generic model interpretability library for PyTorch. 2020. arXiv:2009.07896.


Scott Lundberg and Su-In Lee. A unified approach to interpreting model predictions. 2017. URL: https://arxiv.org/abs/1705.07874, doi:10.48550/ARXIV.1705.07874.


Christoph Molnar. Interpretable Machine Learning. Leanpub, 2nd edition, 2022. URL: https://christophm.github.io/interpretable-ml-book.


F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.


Jason Yosinski, Jeff Clune, Anh Nguyen, Thomas Fuchs, and Hod Lipson. Understanding neural networks through deep visualization. In Deep Learning Workshop, International Conference on Machine Learning (ICML). 2015.