Supplementary materials are available in addition to this article. They can be downloaded at
RJ-2024-024.zip
A. Adadi and M. Berrada. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access, 6: 52138–52160, 2018. URL https://doi.org/10.1109/access.2018.2870052.
D. W. Apley and J. Zhu. Visualizing the effects of predictor variables in black box supervised learning models.
Journal of the Royal Statistical Society Series B: Statistical Methodology, 82(4): 1059–1086, 2020. URL
https://doi.org/10.1111/rssb.12377.
V. Arel-Bundock.
Marginaleffects: Predictions, comparisons, slopes, marginal means, and hypothesis tests. 2023. URL
https://CRAN.R-project.org/package=marginaleffects. R package version 0.11.1.
S. Athey and G. W. Imbens. Machine learning methods that economists should know about. Annual Review of Economics, 11(1): 685–725, 2019. URL https://doi.org/10.1146/annurev-economics-080217-053433.
T. Bartus. Estimation of marginal effects using margeff. The Stata Journal, 5(3): 309–329, 2005.
M. Borkovec and N. Madin.
Ggparty: ’Ggplot’ visualizations for the ’partykit’ package. 2019. URL
https://CRAN.R-project.org/package=ggparty. R package version 1.0.0.
A.-L. Boulesteix, M. N. Wright, S. Hoffmann and I. R. König. Statistical learning approaches in the genetic epidemiology of complex diseases.
Human Genetics, 139(1): 73–84, 2020. URL
https://doi.org/10.1007/s00439-019-01996-9.
L. Breiman. Statistical modeling: The two cultures (with comments and a rejoinder by the author).
Statistical Science, 16(3): 199–231, 2001. URL
https://doi.org/10.1214/ss/1009213726.
M. Britton. VINE: Visualizing statistical interactions in black box models. 2019. URL
https://doi.org/10.48550/arXiv.1904.00561.
G. Casalicchio, C. Molnar and B. Bischl. Visualizing the feature importance for black box models. In Machine learning and knowledge discovery in databases, Eds M. Berlingerio, F. Bonchi, T. Gärtner, N. Hurley and G. Ifrim, pages 655–670, 2019. Cham: Springer International Publishing. URL https://doi.org/10.1007/978-3-030-10925-7_40.
W. Chang.
R6: Encapsulated classes with reference semantics. 2021. URL
https://CRAN.R-project.org/package=R6. R package version 2.5.1.
I. C. Covert, S. Lundberg and S.-I. Lee. Understanding global feature contributions with additive importance measures. In Proceedings of the 34th international conference on neural information processing systems, 2020. Red Hook, NY, USA: Curran Associates Inc.
P. D. Dueben and P. Bauer. Challenges and design choices for global weather and climate models based on machine learning.
Geoscientific Model Development, 11(10): 3999–4009, 2018. URL
https://doi.org/10.5194/gmd-11-3999-2018.
D. B. Dwyer, P. Falkai and N. Koutsouleris. Machine learning approaches for clinical psychology and psychiatry. Annual Review of Clinical Psychology, 14(1): 91–118, 2018. URL https://doi.org/10.1146/annurev-clinpsy-032816-045037.
H. Fanaee-T.
Bike Sharing Dataset. 2013. URL
https://doi.org/10.24432/C5W894.
J. H. Friedman. Greedy function approximation: A gradient boosting machine.
Ann. Statist., 29(5): 1189–1232, 2001. URL
https://doi.org/10.1214/aos/1013203451.
E. Gamma, R. Helm, R. Johnson and J. M. Vlissides. Design patterns: Elements of reusable object-oriented software. 1st ed. Addison-Wesley Professional, 1994.
A. Goldstein, A. Kapelner, J. Bleich and E. Pitkin. Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation.
Journal of Computational and Graphical Statistics, 24(1): 44–65, 2015. URL
https://doi.org/10.1080/10618600.2014.907095.
W. Greene. Econometric analysis. 8th ed. Pearson International, 2019.
J. Herbinger, B. Bischl and G. Casalicchio. REPID: Regional effect plots with implicit interaction detection. In Proceedings of the 25th international conference on artificial intelligence and statistics, Eds G. Camps-Valls, F. J. R. Ruiz and I. Valera, pages 10209–10233, 2022. PMLR.
G. Hooker. Diagnosing extrapolation: Tree-based density estimation. In Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, pages 569–574, 2004a. New York, NY, USA: Association for Computing Machinery.
G. Hooker. Discovering additive structure in black box functions. In Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, pages 575–580, 2004b. New York, NY, USA: ACM. URL http://doi.acm.org/10.1145/1014052.1014122.
G. Hooker. Generalized functional ANOVA diagnostics for high-dimensional functions of dependent variables. Journal of Computational and Graphical Statistics, 16(3): 709–732, 2007. URL https://doi.org/10.1198/106186007X237892.
G. Hooker, L. Mentch and S. Zhou. Unrestricted permutation forces extrapolation: Variable importance requires at least one more model, or there is no free variable importance.
Statistics and Computing, 31(6): 82, 2021. URL
https://doi.org/10.1007/s11222-021-10057-z.
T. Hothorn and A. Zeileis. Partykit: A modular toolkit for recursive partytioning in R. Journal of Machine Learning Research, 16(118): 3905–3909, 2015.
U. Kamath and J. Liu. Introduction to interpretability and explainability. In Explainable artificial intelligence: An introduction to interpretable machine learning, pages 1–26, 2021. Cham: Springer International Publishing. URL https://doi.org/10.1007/978-3-030-83356-5_1.
M. Kuhn and D. Vaughan.
Parsnip: A common API to modeling and analysis functions. 2023. URL
https://CRAN.R-project.org/package=parsnip. R package version 1.1.1.
T. J. Leeper. Margins: Marginal effects for model objects. 2018. URL https://CRAN.R-project.org/package=margins. R package version 0.3.23.
A. Liaw and M. Wiener. Classification and regression by randomForest.
R News, 2(3): 18–22, 2002. URL
https://CRAN.R-project.org/doc/Rnews/.
D. Lüdecke. Ggeffects: Tidy data frames of marginal effects from regression models.
Journal of Open Source Software, 3(26): 772, 2018. URL
https://doi.org/10.21105/joss.00772.
S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions. In Proceedings of the 31st international conference on neural information processing systems, pages 4768–4777, 2017. Red Hook, NY, USA: Curran Associates Inc.
C. J. McCabe, M. A. Halvorson, K. M. King, X. Cao and D. S. Kim. Interpreting interaction effects in generalized linear models of nonlinear probabilities and counts.
Multivariate Behavioral Research, 57(2-3): 243–263, 2022. URL
https://doi.org/10.1080/00273171.2020.1868966.
N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman and A. Galstyan. A survey on bias and fairness in machine learning. ACM Comput. Surv., 54(6), 2021. URL https://doi.org/10.1145/3457607.
T. D. Mize, L. Doan and J. S. Long. A general framework for comparing predictions and marginal effects across models.
Sociological Methodology, 49(1): 152–189, 2019. URL
https://doi.org/10.1177/0081175019852763.
C. Molnar. Interpretable machine learning: A guide for making black box models explainable. 2nd ed. 2022. URL https://christophm.github.io/interpretable-ml-book.
C. Molnar, B. Bischl and G. Casalicchio. Iml: An R package for interpretable machine learning. JOSS, 3(26): 786, 2018. URL https://doi.org/10.21105/joss.00786.
C. Molnar, G. König, B. Bischl and G. Casalicchio. Model-agnostic feature importance and effects with dependent features: A conditional subgroup approach.
Data Mining and Knowledge Discovery, 38(5): 2903–2941, 2024. URL
https://doi.org/10.1007/s10618-022-00901-9.
C. Molnar, G. König, J. Herbinger, T. Freiesleben, S. Dandl, C. A. Scholbeck, G. Casalicchio, M. Grosse-Wentrup and B. Bischl. General pitfalls of model-agnostic interpretation methods for machine learning models. In xxAI - beyond explainable AI. xxAI 2020. Lecture notes in computer science, vol 13200, Eds A. Holzinger, R. Goebel, R. Fong, T. Moon, K.-R. Müller and W. Samek, 2022. Cham: Springer. URL https://doi.org/10.1007/978-3-031-04083-2_4.
S. Mullainathan and J. Spiess. Machine learning: An applied econometric approach.
Journal of Economic Perspectives, 31(2): 87–106, 2017. URL
https://doi.org/10.1257/jep.31.2.87.
E. Onukwugha, J. Bergtold and R. Jain. A primer on marginal effects—part I: Theory and formulae. PharmacoEconomics, 33(1): 25–30, 2015. URL https://doi.org/10.1007/s40273-014-0210-6.
A. Rajkomar, J. Dean and I. Kohane. Machine learning in medicine.
New England Journal of Medicine, 380(14): 1347–1358, 2019. URL
https://doi.org/10.1056/NEJMra1814259.
M. T. Ribeiro, S. Singh and C. Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pages 1135–1144, 2016. New York, NY, USA: Association for Computing Machinery. URL https://doi.org/10.1145/2939672.2939778.
C. A. Scholbeck, G. Casalicchio, C. Molnar, B. Bischl and C. Heumann. Marginal effects for non-linear prediction functions.
Data Mining and Knowledge Discovery, 38(5): 2997–3042, 2024. URL
https://doi.org/10.1007/s10618-023-00993-x.
C. A. Scholbeck, C. Molnar, C. Heumann, B. Bischl and G. Casalicchio. Sampling, intervention, prediction, aggregation: A generalized framework for model-agnostic interpretations. In Machine learning and knowledge discovery in databases: International workshops of ECML PKDD 2019, Würzburg, Germany, September 16–20, 2019, proceedings, part I, Eds P. Cellier and K. Driessens, pages 205–216, 2020. Cham: Springer International Publishing. URL https://doi.org/10.1007/978-3-030-43823-4_18.
C. A. Scholbeck, J. Moosbauer, G. Casalicchio, H. Gupta, B. Bischl and C. Heumann. Position paper: Bridging the gap between machine learning and sensitivity analysis. 2023. URL
https://doi.org/10.48550/arXiv.2312.13234.
StataCorp. Stata: Release 18. College Station, TX: StataCorp LLC., 2023.
E. Štrumbelj and I. Kononenko. An efficient explanation of individual classifications using game theory. Journal of Machine Learning Research, 11(1): 1–18, 2010.
P.-N. Tan, A. Karpatne, M. Steinbach and V. Kumar.
Introduction to Data Mining: Global Edition. Pearson, 2019.
T. Therneau and B. Atkinson.
Rpart: Recursive partitioning and regression trees. 2019. URL
https://CRAN.R-project.org/package=rpart. R package version 4.1-15.
S. Wachter, B. Mittelstadt and C. Russell. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harvard Journal of Law and Technology, 31(2): 841–887, 2018.
R. Williams. Using the margins command to estimate and interpret adjusted predictions and marginal effects. Stata Journal, 12(2): 308–331, 2012.