The ‘Conference Report: Why R? 2018’ article from the 2018-2 issue.
Michał Burdukiewicz (Warsaw University of Technology, Why R? Foundation),
Marta Karas (Johns Hopkins Bloomberg School of Public Health),
Leon Eyrich Jessen (Technical University of Denmark),
Marcin Kosiński (Gradient Metrics, Why R? Foundation),
Bernd Bischl (Ludwig Maximilian University of Munich),
Stefan (Brandenburg University of Technology Cottbus–Senftenberg)
2018-12-01
1 Why R? 2018 conference
The primary purpose of the Why R? 2018 conference was to give R
programming language enthusiasts from both academia and industry an
opportunity to meet and discuss their experiences in R software
development and analysis applications. The event was held 2-5 August,
2018 in the city of Wroclaw, a strong academic and business center of
Poland. A total of approximately 250 people from 6 countries attended
the main conference event. Additionally, approximately 540 R users
attended the pre-meetings in eleven cities across Europe
(Figure 2).
The Why R? 2018 conference continued the first edition of Why R?,
which took place Sep 27-29, 2017 at the Warsaw University of
Technology in Warsaw (Poland). Given the success of the first event,
this year's conference extended its program concept and scope;
importantly, Why R? 2018 was held as an international event.
2 Conference program
The format of the conference was aimed at exposing participants to
recent developments in the R language, as well as a wide range of
application examples. It consisted of workshops, invited talks,
field-specific series of talks, lightning talks, special interest groups,
and a full-day programming hackathon.
The conference program had a strong focus on machine learning techniques
and applications. The mlr R package (Bischl et al. 2016), an interface
to a large number of classification and regression methods, was
emphasized in a number of presentations and was used during the
workshops and the hackathon provided by the mlr team. The scope of the
conference program included statistical methodology, data visualization,
R code performance, building products based on data analyses, and R's
role in academia and industry.
The event offered extensive networking opportunities. A cocktail party
was held at the conference venue on the second day of the conference. In
addition, the venue's convenient location close to the old town market
square facilitated many informal gatherings on each conference day.
3 Why R? Pre-meetings
The novel idea of pre-meetings proved successful in popularizing the
Why R? conference in the international community of R users. Eleven
pre-meetings took place in the Czech Republic, Denmark, Germany, Poland,
and Sweden in the run-up to the Why R? main event. Each pre-meeting was
either part of another conference, a one-day workshop and discussion
event, or a meeting of a local R user group.
As R provides a versatile framework for reproducible research in
different scientific domains
(Gentleman and Temple Lang 2007; Gandrud 2013; Leeper 2014; Liu and Pounds 2014; Rödiger et al. 2015),
we considered the Why R? pre-meetings a great opportunity to convey
and popularize R as an analytics tool among professionals from
different fields. One example is the pre-meeting held at International
Biotechnology Innovation Days (IBID), an open-access conference held
23-25 May, 2018 at the Brandenburg University of Technology
Cottbus-Senftenberg (Senftenberg, Germany), where R came into close
contact with scientists from other domains. IBID brought together
specialists and experts in the fields of bioanalytics, biomedical and
translational research, autoimmune diagnostics, digitalization, and
engineering; hence it provided an excellent platform to promote R and
the Why R? 2018 conference.
4 Workshops
The Why R? 2018 conference offered a wide portfolio of workshops:
Maps in R by Piotr Sobczyk (OLX Group). Piotr showed how to
create spatial data visualizations efficiently in R. He gave
plenty of tips to follow, pitfalls to avoid, and a number of useful
hacks. Starting from a basic plot function, he covered the usage of
ggplot2 as well as R packages that use interactive JavaScript
libraries to prepare data reports.
iDash - Make your R slides awesome with xaringan by Mikołaj
Olszewski (iDash) and Mikołaj Bogucki (iDash). The workshop
introduced the xaringan package (Xie et al. 2018), an
alternative approach to preparing a slide deck. The xaringan
package allows customizing each slide entirely and previewing slides
dynamically in RStudio; moreover, the export of the slide deck
(natively in HTML) to a pixel-perfect PDF is fairly easy. As
xaringan also uses RMarkdown, it allows for reproducible results.
Jumping Rivers - Shiny Basics and Advanced Shiny by Roman
Popat (Jumping Rivers). Roman conducted two workshops. In the first
(Shiny Basics), he gave an introduction to creating interactive
visualizations of data using Shiny. Here, participants learned how to
use rmarkdown and htmlwidgets; input and output bindings to interact
with R data structures; and input widgets and render functions to
create complete page layouts using shiny and shiny dashboard. The
Advanced Shiny workshop explored how to add functionality to Shiny apps
using JavaScript packages and code. In particular, it showed how one
might deal with routines in a Shiny application that take a long
time to run and how to provide a good experience for simultaneous
users of an app. Finally, the instructor showed how to create a
standalone web server API to the R code and how to integrate the use
of it into a Shiny application using the plumber package
(Trestle Technology, LLC et al. 2018).
Constructing scales from survey questions by Tomasz Żółtak
(Educational Research Institute in Warsaw, Poland). Tomasz showed
how to create scales based on sets of categorical variables using
Categorical Exploratory/Confirmatory Factor Analysis (CEFA / CCFA)
and IRT models. He used models with bi-factor rotation to deal with
different forms of asking questions and corrected for differences in
the style of answering questions asked using a Likert scale. In
addition, he showed how to correct self-assessment
knowledge/skill indicators using fake items.
Introduction to Deep Learning with Keras in R by Michał Maj
(Appsilon Data Science). The workshop covered many important aspects
of deep learning with Keras in R, including sequential model
building, data ingestion, and the use and fine-tuning of pre-trained
models. The keras R package (Allaire et al. 2018) was explored.
5 Invited talks
The invited talks drew on domain knowledge from statistics,
computer science, natural sciences, and economics. The invited
speakers were:
Tomasz Niedzielski (University of Wroclaw): Forecasting streamflow
using the HydroProg system developed in R,
Daria Szmurło (McKinsey & Company): The age of automation – What
does it mean for data scientists?,
Agnieszka Suchwałko (Wroclaw University of Technology): Project
evolution – from university to commerce,
Bernd Bischl (Ludwig-Maximilians-University of Munich): Machine
learning in R,
Artur Suchwałko (QuantUp): A business view on predictive modeling:
goals, assumptions, implementation,
Maciej Eder (Institute of Polish Language): New advances in text
mining: exploring word embeddings,
Thomas Petzoldt (Dresden University of Technology): Simulation of
dynamic models in R,
Leon Eyrich Jessen (Technical University of Denmark): Deep Learning
with R using TensorFlow.
6 Special Interest Groups
Three Special Interest Groups were organized to facilitate
topic-specific discussion between conference participants.
Diversity in Data Science, moderated by R-Ladies Warsaw, aimed
to discuss boosting the diversity of the R community and to inspire
members of affinity groups to pursue careers in data science.
Career planning in data science, moderated by Artur
Suchwałko (QuantUp) and Marcin Kosiński (Why R? Foundation), gave
participants a chance to learn from experienced R enthusiasts about
their career paths.
Teaching of data science, moderated by Leon Eyrich Jessen
(Technical University of Denmark) and Stefan (Brandenburg University
of Technology Cottbus-Senftenberg), gathered data science experts from
academia and industry to share their experiences and discuss
challenges and solutions in teaching different concepts of data
science.
7 Conference organizers
The quality of the scientific program of the conference was the
achievement of Marcin Kosiński, Alicja Gosiewska, Aleksandra Grudziąż,
Malte Grosser, Andrej-Nikolai Spiess, Przemysław Gagat, Joanna Szyda,
Paweł Mackiewicz, Bartosz Sękiewicz, Przemysław Biecek, Piotr Sobczyk,
Marta Karaś, Marcin Krzystanek, Marcin Łukaszewicz, Agnieszka
Borsuk-De Moor, Jarosław Chilimoniuk, Michał Maj, and Michał Kurtys. The
organization was in the hands of Michał Burdukiewicz (chair).
The organizers want to acknowledge R user groups from Berlin,
Copenhagen, Cracow, Hamburg, Munich, Poznan, Prague, Stockholm, TriCity,
Wroclaw, and Warsaw.
8 Acknowledgements
We would like to thank all the sponsors, the University of
Wrocław, Wrocław Center of Biotechnology Consortium, the local
organizers of the pre-meetings, the mlr team, and the student helpers.
9 Additional information
Why R? 2018 website: http://whyr.pl/2018
Corporate sponsors: McKinsey & Company, Wrocław Center for
Biotechnology, KRUK S.A., iDash s.c., R Consortium, WLOG Solutions,
Jumping Rivers Ltd., RStudio, Inc., Analyx GmbH, and Pearson IOKI.
References
J. J. Allaire, F. Chollet, RStudio, Google, Y. Tang, D. Falbel, W. V. D. Bijl, and M. Studer. keras: R Interface to 'Keras', Apr. 2018. URL https://CRAN.R-project.org/package=keras.
P. Biecek and M. Kosinski. archivist: An R package for managing, recording and restoring data analysis results. Journal of Statistical Software, 82(11), 2017. doi: 10.18637/jss.v082.i11.
B. Bischl, M. Lang, L. Kotthoff, J. Schiffner, J. Richter, E. Studerus, G. Casalicchio, and Z. M. Jones. mlr: Machine learning in R. Journal of Machine Learning Research, 17(170), 2016. URL http://jmlr.org/papers/v17/15-066.html.
R. Bivand, T. Keitt, B. Rowlingson, E. Pebesma, M. Sumner, R. Hijmans, E. Rouault, F. Warmerdam, J. Ooms, and C. Rundel. rgdal: Bindings for the 'Geospatial' Data Abstraction Library, June 2018a. URL https://CRAN.R-project.org/package=rgdal.
R. Bivand, C. Rundel, E. Pebesma, R. Stuetz, K. O. Hufthammer, P. Giraudoux, M. Davis, and S. Santilli. rgeos: Interface to Geometry Engine - Open Source ('GEOS'), June 2018b. URL https://CRAN.R-project.org/package=rgeos.
C. Gandrud. Reproducible Research with R and RStudio. Chapman and Hall/CRC, July 2013.
R. Gentleman and D. Temple Lang. Statistical Analyses and Reproducible Research. Journal of Computational and Graphical Statistics, 16(1), Mar. 2007. ISSN 1061-8600, 1537-2715. doi: 10.1198/106186007X178663. URL http://www.tandfonline.com/doi/abs/10.1198/106186007X178663.
R. J. Hijmans, J. van Etten, J. Cheng, M. Mattiuzzi, M. Sumner, J. A. Greenberg, O. P. Lamigueiro, A. Bevan, E. B. Racine, A. Shortridge, and A. Ghosh. raster: Geographic Data Analysis and Modeling, Nov. 2017. URL https://CRAN.R-project.org/package=raster.
Z. Liu and S. Pounds. An R package that automatically collects and archives details for reproducible computing. BMC Bioinformatics, 15(1): 138, May 2014. ISSN 1471-2105. doi: 10.1186/1471-2105-15-138. URL http://www.biomedcentral.com/1471-2105/15/138/abstract.
E. Pebesma, R. Bivand, E. Racine, M. Sumner, I. Cook, T. Keitt, R. Lovelace, H. Wickham, J. Ooms, and K. Müller. sf: Simple Features for R, May 2018. URL https://CRAN.R-project.org/package=sf.
S. Rödiger, M. Burdukiewicz, K. A. Blagodatskikh, and P. Schierack. R as an Environment for the Reproducible Analysis of DNA Amplification Experiments. The R Journal, 7(2), 2015. URL http://journal.r-project.org/archive/2015-1/RJ-2015-1.pdf.
J.-R. Roussel, D. Auty, F. De Boissieu, and A. Sánchez Meador. lidR: Airborne LiDAR Data Manipulation and Visualization for Forestry Applications, June 2018. URL https://CRAN.R-project.org/package=lidR.
A. Sitko and P. Biecek. The Merging Path Plot: adaptive fusing of k-groups with likelihood-based model selection, 2017. URL https://arxiv.org/abs/1709.04412.
M. Staniak and P. Biecek. Explanations of model predictions with live and breakDown packages. ArXiv e-prints, Apr. 2018. URL https://arxiv.org/abs/1804.01955.
Trestle Technology, LLC, J. Allen, F. van Dunné, S. Vandewoude, and SmartBear Software (swagger-ui). plumber: An API Generator for R, June 2018. URL https://CRAN.R-project.org/package=plumber.
Y. Xie, C. T. Ekstrøm, D. Lang, G. Aden-Buie, O. P. Bang, P. Schratz, and S. Lopp. xaringan: Presentation Ninja, Feb. 2018. URL https://CRAN.R-project.org/package=xaringan.
Reuse
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
Citation
For attribution, please cite this work as
Burdukiewicz, et al., "Conference Report: Why R? 2018", The R Journal, 2018
BibTeX citation
@article{RJ-2018-2-whyR,
author = {Burdukiewicz, Michał and Karas, Marta and Jessen, Leon Eyrich and Kosiński, Marcin and Bischl, Bernd and Stefan, },
title = {Conference Report: Why R? 2018},
journal = {The R Journal},
year = {2018},
note = {https://rjournal.github.io/},
volume = {10},
issue = {2},
issn = {2073-4859},
pages = {572-578}
}