- Slides on remake (there is also a link to remake-tutorial on GitHub).
- A general presentation on the tidyverse and a range of related topics.
Why are you only telling us about this now??
Talk 1 – Reproducible computational workflows with "remake"
Reproducible computational workflows are an important part of modern science: they enable you, your future self, or another person to run your code and reproduce the original results. Setting up a reproducible workflow creates a tension between three conflicting goals. Often, the only way to make sure that the code is reproducible is to run it from start to end. For code that runs longer than about 10 seconds (the average attention span of a human), rerunning everything becomes impractical when working interactively. To avoid long run times, the code can be run piecemeal, which in turn complicates running it from start to end. (We're taking for granted that the entire process must be scripted in the first place.)
The "remake" package by Rich FitzJohn is a solution to these problems. It allows defining a workflow as a set of targets (R objects or files, e.g. knitr reports) with dependencies (other targets or files). A target is computed by evaluating an R function. The system figures out automatically which targets need to be rebuilt based on recent changes to your code or your data (thus avoiding full recomputation), while maintaining the ability to develop code interactively. (Users familiar with "make" or other build systems will recognize the concept, but "remake" is much better suited for R projects than traditional "make".) A data analysis based on remake will always be fully reproducible, including those times when you realize, two days before delivery, that your data cleansing code has a crucial error.
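The rebuild logic described above can be illustrated with a small conceptual sketch. It is written in Python and is not remake's API or implementation: the names `Workflow`, `target`, `make`, and `build_log` are hypothetical, and "has the code changed" is approximated by fingerprinting a function's bytecode, names, and constants together with the values of its dependencies.

```python
import hashlib
import pickle


class Workflow:
    """Toy dependency-aware cache (hypothetical; not remake's actual API)."""

    def __init__(self):
        self.rules = {}      # target name -> (function, dependency names)
        self.cache = {}      # target name -> (fingerprint, value)
        self.build_log = []  # targets that were actually recomputed, in order

    def target(self, name, func, depends=()):
        """Declare that `name` is computed by calling `func` on its dependencies."""
        self.rules[name] = (func, tuple(depends))

    def make(self, name):
        """Return the value of `name`, recomputing it only if code or inputs changed."""
        func, depends = self.rules[name]
        inputs = [self.make(dep) for dep in depends]
        code = func.__code__
        # Rough stand-in for "code or data changed": hash the function's
        # bytecode, referenced names, and constants plus the input values.
        fingerprint = hashlib.sha256(
            pickle.dumps((code.co_code, code.co_names, repr(code.co_consts), inputs))
        ).hexdigest()
        cached = self.cache.get(name)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]  # up to date: skip recomputation
        value = func(*inputs)
        self.cache[name] = (fingerprint, value)
        self.build_log.append(name)
        return value


def load_raw():
    return [1.0, 2.0, -999.0, 6.0]  # -999.0 marks a missing value


def clean(raw):
    cleaned = []
    for x in raw:
        if x != -999.0:
            cleaned.append(x)
    return cleaned


def mean(cleaned):
    return sum(cleaned) / len(cleaned)


def largest(cleaned):
    return max(cleaned)


wf = Workflow()
wf.target("raw", load_raw)
wf.target("cleaned", clean, depends=["raw"])
wf.target("summary", mean, depends=["cleaned"])

wf.make("summary")  # first run: builds raw, cleaned, summary
wf.make("summary")  # nothing changed: all three come from the cache
first_log = list(wf.build_log)

# Swap the summary function: only "summary" is rebuilt, upstream targets are reused.
wf.target("summary", largest, depends=["cleaned"])
result = wf.make("summary")
```

In this sketch the cache lives in memory only. A real build system would also persist it between sessions and track files on disk, which the sketch does not attempt.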
Slides: https://krlmlr.github.io/remak
Tutorial: https://github.com/krlmlr/rema
Talk 2 – The "tidyverse" and "DBI": A peek under the hood
Over the past twenty years, R has evolved into a very stable and mature system for statistical computing and graphics. Thanks to its packaging system and to CRAN, which now hosts over 10,000 packages, most practical data analyses can be implemented with little effort solely in R: from importing data, loading from a database, web scraping, or web API access, through cleaning, transforming, exploring, and modeling, to communicating results with static or dynamic documents, web sites, or interactive dashboards. The tidyverse [1] is a coherent set of packages that aims to cover many of these tasks. This informal presentation showcases a selection of tidyverse (and other) packages for data manipulation and database access, and sketches some of their features and limitations.
Link to remake: https://github.com/richfitz/re
Speaker's Short Bio
Kirill has a computer science background with some exposure to applied statistical modeling, and enjoys contributing to the R ecosystem as a self-employed software engineer, data science consultant, and trainer. He improves, maintains, and reviews several R packages (dplyr+tibble, DBI+RSQLite+RMariaDB, styler, …), and applies these tools in practical settings to understand where they can be improved. His teaching portfolio includes a two- to three-day tidyverse course.
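Disclaimers
As usual, we have no insurance or anything similar. You bear full responsibility for everything that concerns you, including harm to yourselves, to others, to property, and so on. So please be careful.
Is this at seven in the morning or in the evening?
Hi Mor, seven in the evening!
Best regards,
Tal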