Remembering Friedrich "Fritz" Leisch

Bettina Grün; Kurt Hornik; Torsten Hothorn; Theresa Scharl; Achim Zeileis

doi:10.32614/RJ-2024-001

1 Career

Friedrich Leisch (see Figure 1) was born in 1968 in Vienna (Austria) and died after a serious illness in 2024 in Vienna. Everyone called him Fritz.

Figure 1: Fritz Leisch at his inaugural lecture at BOKU in 2011. Source: BOKU.

Starting in 1987, Fritz studied Applied Mathematics at Technische Universität Wien (TU Wien), earning his master’s degree (Dipl.-Ing.) in 1993. Subsequently, he joined the Department of Statistics and Probability Theory at TU Wien as an assistant professor, a position he held, with short intermissions, until 2006. During this time he also defended his doctoral thesis in Applied Mathematics (Dr.techn.) in 1999 and earned his habilitation (venia docendi) in Statistics in 2005.

In 1995, he visited the Knowledge-Based Engineering Systems Group at the University of South Australia in Adelaide on a Kurt Gödel scholarship for postgraduate studies. From 1997 to 2004 he was a member of the SFB project “Adaptive Information Systems and Modeling in Economics and Management Science”, coordinated at Wirtschaftsuniversität Wien (WU Wien). From 2002 to 2003 he was assistant professor at the Department of Statistics and Decision Support Systems, Universität Wien.

In 2006 Fritz moved to Munich, Germany, to become a professor for computational statistics at the Department of Statistics, Ludwig-Maximilians-Universität München (LMU); see Figure 2. He returned to Vienna in 2011 to join BOKU University as head of the Institute of Statistics; see Figure 3.

Figure 2: Computational statistics group at LMU in 2007 (left to right): Sebastian Kaiser, Adrian Duffner, Manuel Eugster, Fritz Leisch. Source: Carolin Strobl.

Figure 3: Institute of Statistics at BOKU in 2022 (left to right, back to front): Johannes Laimighofer, Nur Banu Özcelik, Ursula Laa, Fritz Leisch, Bernhard Spangl, Gregor Laaha, Matthias Medl. Robert Wiedermann, Lena Ortega Menjivar, Theresa Scharl, Melati Avedis. Source: BOKU.

2 Key contributions

Fritz’ scientific contributions span an impressive range, from theoretical and methodological work (especially in the field of clustering and finite mixture models) through software (mostly related to the R programming language) to applied work and cooperations (notably in marketing, biotechnology, and genomics, among many others). In the following sections we try to highlight his key contributions and scientific legacy.

2.1 R Core & CRAN

During his stay in Australia, Fritz learned about the existence of R. Back in Austria, he and Kurt started to explore this potentially good news more systematically. They soon stopped further work on a statistics toolbox they had developed for Octave (Eaton et al. 2024) and switched to R for their applied work, finding lots of room for further improvement, and thus sending polite emails with patches and more suggestions to Ross Ihaka and Robert Gentleman. Clearly these were acceptable in quality but too high in quantity, and it did not take long before Ross and Robert gave Fritz and Kurt write access to the R sources (initially in CVS, later moved to SVN). In 1997, they both officially became very early members of the R Core Team.

One of the main challenges then was that the functionality provided by R was rather limited. Contributed extensions for S were available from the Carnegie Mellon University Statlib S Archive¹, and could typically be ported to R rather easily, but there was no mechanism for conveniently distributing or actually using these extensions. This changed fundamentally when, in 1997, Fritz and Kurt implemented the R package management system, using ideas from Debian’s APT (advanced package tool, https://wiki.debian.org/AptCLI) that they had successfully employed for managing their computer systems. They also set up the Comprehensive R Archive Network (CRAN, https://CRAN.R-project.org/, see also Hornik 2012) as a means for redistributing R and its contributed extensions, together with infrastructure for quality assurance of these extensions. These two contributions paved the way for the amazing growth and success of R through its wealth of high-quality contributed extensions. See https://stat.ethz.ch/pipermail/r-announce/1997/000001.html for the first announcement of CRAN, starting with 12 extension packages. Currently, there are more than 21,000. See Figure 4 for a screenshot² of the landing page of the CRAN master site at TU Wien, as last modified by Fritz on 1997-12-09.

Figure 4: Screenshot of the landing page of the CRAN master site at TU Wien on 1998-01-10, as last modified by Fritz on 1997-12-09. Source: Internet Archive.

The first SVN commit by Fritz is from 1997-10-02, the last from 2013-10-04. Overall, there are 651 commits by Fritz, mostly from the early years of R Core, and related to the R package management and CRAN mirror system, and the addition of the Sweave system (see Section 2.3 for more details).

2.2 DSC & useR! conferences

By establishing CRAN in Vienna at TU Wien, Fritz and Kurt laid the foundation for a special relationship between Vienna and R that they characterized as a story of “love and marriage” (Hornik and Leisch 2002). In the decade after the creation of CRAN, a number of seminal R-related meetings took place in Vienna, co-organized by Fritz as well as several of the co-authors of this paper.

The first workshop on “Distributed Statistical Computing” (DSC) took place from March 19-23, 1999, at TU Wien. The main motivations were bringing together the R Core Team for its first face-to-face meeting, discussing the roadmap for the release of R 1.0.0, as well as exploring potential synergies with other environments for statistical computing. There were around 30 participants and about 20 presentations, many of which were relatively short, leaving ample time for discussions (see Figure 5).

Figure 5: Discussions at DSC 1999 (top to bottom, left to right): Thomas Lumley, Fritz Leisch, Luke Tierney. Peter Dalgaard, Ross Ihaka, Paul Murrell. Brian Ripley, Martin Mächler, Robert Gentleman, Kurt Hornik. Source: Douglas Bates (DSC 1999 homepage).

Two more DSC workshops were organized at TU Wien in 2001 and 2003. While meetings focusing on R development issues (with the R Core Team and everyone else interested) were still an important part of these conferences, they also saw an increasing number of regular conference presentations on R packages and their different fields of application (e.g., establishing infrastructure for spatial data). In 2001 there were around 60 participants and about 30 presentations, most with corresponding papers in the online proceedings (Hornik and Leisch 2001). In 2003 this increased to more than 150 participants and about 60 presentations, again with the majority in the online proceedings (Hornik et al. 2003).

The high demand for a platform where R users from different fields could exchange ideas prompted the creation of a new conference series called useR!. The first two installments again took place in Vienna, in 2004 at TU Wien and in 2006 at WU Wien. Torsten Hothorn, David Meyer, and Achim Zeileis took the lead in the organization, with support and advice from Fritz and Kurt in the background. Important contributions from the R Core Team at the useR! conferences were keynote lectures highlighting important developments, e.g., a keynote given by Fritz at useR! 2004 on S4 classes and methods. Both conferences continued the success of the earlier DSC workshops, with the number of participants rising to more than 200 in 2004 and close to 350 in 2006. Similarly, the number of presentations grew to about 100 in 2004 and more than 150 in 2006.

In addition to the efforts initiated by Fritz and Kurt, another key factor to the success of these meetings was the city of Vienna with its culture, cafes, wine and beer pubs, etc. (see Hornik and Leisch 2002 and also Figure 6).

Figure 6: Conference dinner at useR! 2006 (left to right): Fritz Leisch, Torsten Hothorn, Tim Hesterberg. Source: Carolin Strobl (useR! 2006 homepage).

2.3 Sweave & reproducibility

With Sweave (Leisch 2002), Fritz pioneered what we now can understand as the technical foundation of reproducible research. Sweave was the main inspiration for knitr (Xie 2015) which in turn led to rmarkdown (Xie et al. 2018) and quarto (Scheidegger et al. 2024). All these systems are used today to generate countless scientific articles, package vignettes, webpages, books, blogs, and much more in a dynamic and reproducible way.

Of course, Fritz was not the first one going in this direction. The concept of “literate programming” had been introduced by Knuth (1984); it allows the source code for software and the corresponding documentation to be combined in the same file. The concepts of “tangling”, that is, extracting the code for compilation, and “weaving”, the process of generating a nice-looking document containing code next to prose and formulae, have their roots in the WEB and CWEB systems (Knuth and Levy 1993). As these packages were specific to code in Pascal (WEB) and C (CWEB), respectively, and to documentation in LaTeX, Ramsey (1994) introduced his noweb system as a literate programming tool that is agnostic to the programming language used and also supports HTML in addition to LaTeX and a few other backends for documentation. The noweb syntax for code chunks is:

<<code>>=
1 + 2
@

This will look familiar to users of Sweave. From this history, the naming decisions for the software and its file format can be understood: Sweave is the function that weaves code in S (or R - both languages still existed side by side at the time) with its output and documentation. And Rnw stands for files mixing R code with noweb syntax.
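The weave and tangle steps described above can be tried with a minimal sketch based on the `Sweave()` and `Stangle()` functions in R's utils package. The file name `demo.Rnw`, the chunk label `sum`, and the use of a temporary directory are illustrative assumptions and do not come from the original Sweave material.

```r
# Write a minimal Rnw file (LaTeX text plus one noweb-style R chunk)
# into a fresh temporary directory. File and chunk names are hypothetical.
workdir <- tempfile("sweave-demo")
dir.create(workdir)
writeLines(c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The sum is computed below.",
  "<<sum>>=",
  "1 + 2",
  "@",
  "\\end{document}"
), file.path(workdir, "demo.Rnw"))

# Sweave() and Stangle() write their output to the current working directory.
oldwd <- setwd(workdir)
utils::Sweave("demo.Rnw", quiet = TRUE)   # weave: demo.tex with code and its output
utils::Stangle("demo.Rnw", quiet = TRUE)  # tangle: demo.R with the code only
setwd(oldwd)

tex  <- readLines(file.path(workdir, "demo.tex"))
code <- readLines(file.path(workdir, "demo.R"))
```

The woven `demo.tex` still has to be compiled with LaTeX to obtain a PDF; the tangled `demo.R` can be sourced or explored directly.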

From the mid-1990s to the early 2000s, interests shifted from just “literate programming” to “literate data analysis” (Leisch 2002; Leisch and Rossini 2003) as a core ingredient for reproducible research (Buckheit and Donoho 1995). The seminal new idea was to have dynamic documents, so that outputs of code such as figures and tables could be updated automatically when the underlying data changed; this idea was pioneered by the late Günter Sawitzki in his Voyager system (Sawitzki 1996).

Fritz amalgamated all of this into Sweave, which for the first time made the power of dynamic reporting easily available in a widely used programming language for statistics, in combination with the standard text processing system LaTeX. This turned out to be a “killer feature” of R at the time and the basis for further work towards reproducible research (Hothorn and Leisch 2011; Stodden et al. 2014).

Sweave was also the basis for R package vignettes (Leisch 2003) as an addition to the previously available technical manual pages. The first R package vignette published on CRAN in May 2002 was in the strucchange package, providing methods for testing, monitoring, and dating structural changes. The vignette was the Sweave adaptation of an introduction to the package that had been co-authored by Fritz and published a couple of months earlier in the Journal of Statistical Software (Zeileis et al. 2002). See Figure 7 for how Fritz used it to illustrate the idea of package vignettes in Leisch (2003) and that the R code from vignettes can be easily extracted (also interactively), explored, and re-run.

Figure 7: Screenshot of the strucchange package vignette, shown in a PDF viewer (right), along with the vExplorer from Bioconductor for interactive code execution (top left) with output in the active R graphics window (bottom left). Source: Leisch (2003, Figure 2).

2.4 Clustering & mixture models

Fritz’ theoretical and methodological work focused in particular on clustering and finite mixture models. Centroid-based partitioning methods as well as finite mixture models allow their fitting algorithms to be embedded in a common estimation framework. In this framework, each of the steps is adapted in a modular way depending on the specific setup, e.g., the method for determining distances and centroids or the component distribution used. Fritz exploited this for the implementation of the packages flexclust (Leisch 2006) and flexmix (Leisch 2004; Grün and Leisch 2008), contributing to the clustering tools available for R (see the CRAN Task View Cluster). Both packages provide general infrastructure for (model-based) clustering and enable rapid prototyping and simple extension to new variants that take into account complicated data structures or challenging model specifications (see, for example, psychomix, Frick et al. 2012).
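The modular idea behind centroid-based partitioning can be illustrated with a small base R sketch in which the distance and the centroid computation are plug-in arguments. This is not the flexclust implementation or its API; the function names `kcentroids`, `dist_euclid`, and `dist_manhattan`, the deterministic starting centroids, and the toy data are all assumptions made for illustration.

```r
# Generic k-centroids loop: dist_fun and cent_fun are interchangeable modules.
kcentroids <- function(x, k, dist_fun, cent_fun, iter_max = 100) {
  centers <- x[seq_len(k), , drop = FALSE]  # simplistic start: first k rows
  cluster <- rep(0L, nrow(x))
  for (iter in seq_len(iter_max)) {
    d <- dist_fun(x, centers)               # n x k matrix of distances
    new_cluster <- max.col(-d, ties.method = "first")  # index of nearest centroid
    if (identical(new_cluster, cluster)) break          # assignments are stable
    cluster <- new_cluster
    for (j in seq_len(k)) {
      if (any(cluster == j))
        centers[j, ] <- cent_fun(x[cluster == j, , drop = FALSE])
    }
  }
  list(cluster = cluster, centers = centers)
}

# Squared Euclidean distance, to be paired with column means (k-means).
dist_euclid <- function(x, centers)
  vapply(seq_len(nrow(centers)),
         function(j) rowSums(sweep(x, 2, centers[j, ])^2),
         numeric(nrow(x)))

# Manhattan distance, to be paired with column medians (k-medians).
dist_manhattan <- function(x, centers)
  vapply(seq_len(nrow(centers)),
         function(j) rowSums(abs(sweep(x, 2, centers[j, ]))),
         numeric(nrow(x)))

# Toy data: two well-separated groups of three points each.
x <- rbind(c(0, 0), c(10, 10), c(0, 1), c(1, 0), c(10, 11), c(11, 10))

km   <- kcentroids(x, 2, dist_euclid, colMeans)
kmed <- kcentroids(x, 2, dist_manhattan, function(z) apply(z, 2, median))
```

Swapping the two function arguments changes the method without touching the fitting loop, which is the kind of modularity the paragraph above describes.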

2.5 Applied work

For many years, Fritz and Kurt actively participated in the Biological Psychiatry working group at Medizinische Universität Wien. The first paper co-authored by Fritz dates from 2000 (Bailer et al. 2000), the last from 2023 (Solmi et al. 2023). The joint research was mostly focused on linking genetic traits to psychiatric disorders and treatment success. This prompted many enhancements in the classical test infrastructure in base R - in surprising ways to some reviewers, who could not believe that Fisher’s test really worked for tables with more than two rows or columns. It also established a strong need for conveniently reporting the results of the statistical analyses to the medical doctors in the group that went beyond providing annotated transcripts, which Fritz eventually managed to satisfy by inventing the Sweave system (see Section 2.3).
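The remark about Fisher’s test can be checked directly in base R, where `fisher.test()` accepts contingency tables with more than two rows or columns. The counts and the labels below are made up purely for illustration and are not data from the working group.

```r
# Hypothetical 3 x 3 table of counts: genotype (rows) by treatment response (columns).
tab <- matrix(c(4, 1, 2,
                2, 5, 1,
                1, 2, 6),
              nrow = 3, byrow = TRUE,
              dimnames = list(genotype = c("AA", "AB", "BB"),
                              response = c("none", "partial", "full")))

# Fisher's exact test of independence for an r x c table.
res <- fisher.test(tab)
res$p.value
```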

Fritz also intensively collaborated with Sara Dolnicar to advance data analytic methods for data-driven market segmentation analysis. They received the Charles R. Goeldner Article of Excellence Award for their work on extracting stable Winter tourist segments in Austria with bagged clustering (Dolnicar and Leisch 2003). They focused on the evaluation of data structure and the selection of suitable segments based on segment stability as a key criterion (Dolnicar and Leisch 2010, 2017). Finally, this joint work resulted in Dolnicar et al. (2018) which provides practical guidance for users of market segmentation solutions and for data analysts with respect to the technical and statistical aspects of market segmentation analysis.

As head of the Institute of Statistics, Fritz was involved in various interdisciplinary research projects covering almost the whole range of core areas of research at BOKU. He was key researcher at the Austrian Centre of Industrial Biotechnology (acib) (Scharl et al. 2009; Melcher et al. 2017) and faculty member of the doctoral schools on agricultural genomics and bioprocess engineering. Among others he contributed to the fields of zoology (Cech et al. 2022), forestry, transportation and tourism (Taczanowska et al. 2023) as well as chemistry, genomics and wildlife biology (Steiner et al. 2014).

3 Academic service

In addition to the services for the various conferences and proceedings already described above, Fritz served the scientific community in various ways. In January 2001, he co-created R News which evolved into The R Journal eight years later. For the journal Computational Statistics he was an associate editor from 2005 to 2006 before he became editor-in-chief from 2007 to 2011 (see Symanzik et al. 2024 for more details). Other notable contributions include being editor for the Journal of Statistical Software, core member of the Bioconductor project for statistical software in bioinformatics, and first secretary general of the R Foundation for Statistical Computing when it was formed in 2002.

4 Teaching & mentoring

Fritz taught generations of students at bachelor, master, and PhD level and introduced hundreds of useRs to proper R development in his “Introduction to R Programming” short course. At TU Wien, LMU, and BOKU, he taught courses in applied statistics, statistical computing and computational statistics. He had the ability to explain even difficult content in a simple way and to inspire students with statistics and programming with R. He co-founded the “Munich R Courses” lecture series and was part of a group aiming to initiate a formal PhD program in statistics at LMU.

Fritz supervised Bettina Grün, Theresa Scharl, Sebastian Kaiser, Manuel Eugster, Christina Yassouridis, Rainer Dangl, Weksi Budiaji, Muhammad Atif and Simona Jokubauskaite as his PhD students. Based on his research, Fritz often discussed the state of and the need for reproducible research and taught his many students how to avoid the many small and innocent errors that have a tendency to pile up and invalidate reported statistical results, with potentially devastating consequences, as we all know.

5 Odds & ends

Fritz loved cooking, music, motorbike riding, playing cards with his friends, skiing and hiking. A late afternoon call to his office asking him to go along for a beer in Munich’s English Garden almost never went unanswered, positively. Back in Vienna at BOKU, colleagues got to know Fritz as a very structured, thoughtful, calm person who involved everyone, listened to everyone and always endeavored to balance interests and ensure fairness. He strengthened cooperation and cohesion with his leadership style. Fritz was a friendly, always modest person who was free of airs and graces or vanity, despite or perhaps because of his great scientific successes. The R Core Team and the R community at large miss a contributor, collaborator, teacher, colleague, and friend.

5.1 CRAN packages used

knitr, rmarkdown, quarto, strucchange, flexclust, flexmix

5.2 CRAN Task Views implied by cited packages

Cluster, Econometrics, Environmetrics, Finance, Psychometrics, ReproducibleResearch, TimeSeries

U. Bailer, F. Leisch, K. Meszaros, E. Lenzinger, U. Willinger, R. Strobl, C. Gebhardt, E. Gerhard, K. Fuchs, W. Sieghart, et al. Genome scan for susceptibility loci for schizophrenia. Neuropsychobiology, 42(4): 175–182, 2000. DOI 10.1159/000026690.

J. B. Buckheit and D. L. Donoho. WaveLab and reproducible research. In Wavelets in statistics, Eds A. Antoniadis and G. Oppenheim, pages 55–82, 1995. New York: Springer-Verlag. DOI 10.1007/978-1-4612-2544-7_5.

R. M. Cech, S. Jovanovic, S. Kegley, K. Hertoge, F. Leisch and J. G. Zaller. Reducing overall herbicide use may reduce risks to humans but increase toxic loads to honeybees, earthworms and birds. Environmental Sciences Europe, 34(1): 44, 2022. DOI 10.1186/s12302-022-00622-2.

S. Dolnicar, B. Grün and F. Leisch. Market segmentation analysis: Understanding it, doing it, and making it useful. Springer-Verlag, 2018. DOI 10.1007/978-981-10-8818-6.

S. Dolnicar and F. Leisch. Evaluation of structure and reproducibility of cluster solutions using the bootstrap. Marketing Letters, 21(1): 83–101, 2010. DOI 10.1007/s11002-009-9083-4.

S. Dolnicar and F. Leisch. Using segment level stability to select target segments in data-driven market segmentation studies. Marketing Letters, 28(3): 423–436, 2017. DOI 10.1007/s11002-017-9423-8.

S. Dolnicar and F. Leisch. Winter tourist segments in Austria: Identifying stable vacation styles using bagged clustering techniques. Journal of Travel Research, 41(3): 281–292, 2003. DOI 10.1177/0047287502239037.

J. W. Eaton, D. Bateman, S. Hauberg and R. Wehbring. GNU Octave version 9.2.0 manual: A high-level interactive language for numerical computations. 2024. URL https://www.gnu.org/software/octave/doc/v9.2.0/.

H. Frick, C. Strobl, F. Leisch and A. Zeileis. Flexible Rasch mixture models with package psychomix. Journal of Statistical Software, 48(7): 1–25, 2012. DOI 10.18637/jss.v048.i07.

B. Grün and F. Leisch. FlexMix version 2: Finite mixtures with concomitant variables and varying and constant parameters. Journal of Statistical Software, 28(4): 1–35, 2008. DOI 10.18637/jss.v028.i04.

K. Hornik. The Comprehensive R Archive Network. Wiley Interdisciplinary Reviews: Computational Statistics, 4(4): 394–398, 2012. DOI 10.1002/wics.1212.

K. Hornik and F. Leisch, eds. Proceedings of the 2nd International Workshop on Distributed Statistical Computing, Vienna, Austria. 2001. URL https://www.R-project.org/conferences/DSC-2001/Proceedings/. ISSN 1609-395X.

K. Hornik and F. Leisch. Vienna and R: Love, marriage and the future. In Festschrift 50 Jahre Österreichische Statistische Gesellschaft, Ed R. Dutter, pages 61–70, 2002. Österreichische Statistische Gesellschaft. ISSN 1026-597X.

K. Hornik, F. Leisch and A. Zeileis, eds. Proceedings of the 3rd International Workshop on Distributed Statistical Computing, Vienna, Austria. 2003. URL https://www.R-project.org/conferences/DSC-2003/Proceedings/. ISSN 1609-395X.

T. Hothorn and F. Leisch. Case studies in reproducibility. Briefings in Bioinformatics, 12(3): 288–300, 2011. DOI 10.1093/bib/bbq084.

D. E. Knuth. Literate programming. The Computer Journal, 27(2): 97–111, 1984. DOI 10.1093/comjnl/27.2.97.

D. E. Knuth and S. Levy. The CWEB system of structured documentation. Reading: Addison-Wesley, 1993.

F. Leisch. A toolbox for k-centroids cluster analysis. Computational Statistics and Data Analysis, 51(2): 526–544, 2006. DOI 10.1016/j.csda.2005.10.006.

F. Leisch. FlexMix: A general framework for finite mixture models and latent class regression in R. Journal of Statistical Software, 11(8): 1–18, 2004. DOI 10.18637/jss.v011.i08.

F. Leisch. Sweave, part II: Package vignettes. R News, 3(2): 21–24, 2003. URL https://CRAN.R-project.org/doc/Rnews/.

F. Leisch. Sweave: Dynamic generation of statistical reports using literate data analysis. In COMPSTAT 2002 – proceedings in computational statistics, Eds W. Härdle and B. Rönz, pages 575–580, 2002. Heidelberg: Physica Verlag. DOI 10.1007/978-3-642-57489-4_89.

F. Leisch and A. J. Rossini. Reproducible statistical research. Chance, 16(2): 46–50, 2003. DOI 10.1080/09332480.2003.10554848.

M. Melcher, T. Scharl, M. Luchner, G. Striedner and F. Leisch. Boosted structured additive regression for Escherichia coli fed-batch fermentation modeling. Biotechnology and Bioengineering, 114(2): 321–334, 2017. DOI 10.1002/bit.26073.

N. Ramsey. Literate programming simplified. IEEE Software, 11(5): 97–105, 1994. DOI 10.1109/52.311070.

G. Sawitzki. Extensible statistical software: On a voyage to Oberon. Journal of Computational and Graphical Statistics, 5(3): 263–283, 1996. DOI 10.1080/10618600.1996.10474711.

T. Scharl, I. Voglhuber and F. Leisch. Exploratory and inferential analysis of gene cluster neighborhood graphs. BMC Bioinformatics, 10(1): 288, 2009. DOI 10.1186/1471-2105-10-288.

C. Scheidegger, C. Teague, C. Dervieux, J. J. Allaire and Y. Xie. Quarto: An open-source scientific and technical publishing system. 2024. URL https://quarto.org/. Version 1.5.

M. Solmi, T. Thompson, A. Estradé, A. Agorastos, J. Radua, S. Cortese, E. Dragioti, F. Leisch, D. Vancampfort, L. C. Thygesen, et al. Validation of the Collaborative Outcomes study on Health and Functioning during Infection Times (COH-FIT) questionnaire for adults. Journal of Affective Disorders, 326: 249–261, 2023. DOI 10.1016/j.jad.2022.12.022.

W. Steiner, F. Leisch and K. Hackländer. A review on the temporal pattern of deer-vehicle accidents: Impact of seasonal, diurnal and lunar effects in cervids. Accident Analysis & Prevention, 66: 168–181, 2014. DOI 10.1016/j.aap.2014.01.020.

V. Stodden, F. Leisch and R. D. Peng. Implementing reproducible research. Boca Raton: Chapman & Hall/CRC, 2014.

J. Symanzik, Y. Mori and P. Vieu. A memorial for the late Professor Friedrich Leisch. Computational Statistics, 39, 2024. Forthcoming.

K. Taczanowska, B. Latosinska, C. Brandenburg, F. Leisch, C. Czachs and A. Muhar. Lobbying in social media as a new source of survey bias. Journal of Outdoor Recreation and Tourism, 44(A): 100689, 2023. DOI 10.1016/j.jort.2023.100689.

Y. Xie. Dynamic documents with R and knitr. 2nd ed. Boca Raton: Chapman & Hall/CRC, 2015. DOI 10.1201/9781315382487.

Y. Xie, J. J. Allaire and G. Grolemund. R Markdown: The definitive guide. Boca Raton: Chapman & Hall/CRC, 2018. DOI 10.1201/9781138359444.

A. Zeileis, F. Leisch, K. Hornik and C. Kleiber. strucchange: An R package for testing for structural change in linear regression models. Journal of Statistical Software, 7(2): 1–38, 2002. DOI 10.18637/jss.v007.i02.

¹ Unfortunately, the Statlib S Archive is currently not available anymore. A snapshot, including many of the actual source code files, is available on the Internet Archive at https://web.archive.org/web/20000815063825/http://lib.stat.cmu.edu/S/.
² This is from the earliest capture, from 1998-01-10, available on the Internet Archive at https://web.archive.org/web/19980110082558/http://www.ci.tuwien.ac.at/R/contents.html.
