From ba444b8ff1224e6d3a65e507e2369b68c7a13356 Mon Sep 17 00:00:00 2001 From: lcolladotor Date: Wed, 28 Aug 2024 23:33:19 -0400 Subject: [PATCH] Update literate programming lecture, move Sweave, knitr, and distill to a former class material section --- .../index/execute-results/html.json | 5 +- posts/05-literate-programming/index.qmd | 221 ++++++++++-------- 2 files changed, 130 insertions(+), 96 deletions(-) diff --git a/_freeze/posts/05-literate-programming/index/execute-results/html.json b/_freeze/posts/05-literate-programming/index/execute-results/html.json index b6e5fb0..d466456 100644 --- a/_freeze/posts/05-literate-programming/index/execute-results/html.json +++ b/_freeze/posts/05-literate-programming/index/execute-results/html.json @@ -1,7 +1,8 @@ { - "hash": "ec7dc8dff335325b2e6798037f0540d4", + "hash": "2df83a1720d90681035fc7bec8155a2f", "result": { - "markdown": "---\ntitle: \"05 - Literate Statistical Programming\"\nauthor:\n - name: Leonardo Collado Torres\n url: http://lcolladotor.github.io/\n affiliations:\n - id: libd\n name: Lieber Institute for Brain Development\n url: https://libd.org/\n - id: jhsph\n name: Johns Hopkins Bloomberg School of Public Health Department of Biostatistics\n url: https://publichealth.jhu.edu/departments/biostatistics\ndescription: \"Introduction to literate statistical programming tools including R Markdown\"\nimage: https://github.com/allisonhorst/stats-illustrations/raw/main/rstats-artwork/rmarkdown_rockstar.png\ncategories: [module 1, week 1, R Markdown, programming]\nbibliography: my-refs.bib\n---\n\n\n*This lecture, as the rest of the course, is adapted from the version [Stephanie C. Hicks](https://www.stephaniehicks.com/) designed and maintained in 2021 and 2022. 
Check the recent changes to this file through the [GitHub history](https://github.com/lcolladotor/jhustatcomputing/commits/main/posts/05-literate-programming/index.qmd).*\n\n\n\n# Pre-lecture materials\n\n### Read ahead\n\n::: callout-note\n### Read ahead\n\n**Before class, you can prepare by reading the following materials:**\n\n1. \n2. \n:::\n\n### Acknowledgements\n\nMaterial for this lecture was borrowed and adopted from\n\n- \n- \n\n# Learning objectives\n\n::: callout-note\n# Learning objectives\n\n**At the end of this lesson you will:**\n\n- Be able to define literate programming\n- Recognize differences between available tools to for literate programming\n- Know how to efficiently work within RStudio for efficient literate programming\n- Create a R Markdown document\n:::\n\n# Introduction\n\nOne basic idea to make writing reproducible reports easier is what's known as *literate statistical programming* (or sometimes called [literate statistical practice](http://www.r-project.org/conferences/DSC-2001/Proceedings/Rossini.pdf)). This comes from the idea of [literate programming](https://en.wikipedia.org/wiki/Literate_programming) in the area of writing computer programs.\n\nThe idea is to **think of a report or a publication as a stream of text and code**.\n\n- The text is readable by people and the code is readable by computers.\n\n- The analysis is described in a series of text and code chunks.\n\n- Each kind of code chunk will do something like load some data or compute some results.\n\n- Each text chunk will relay something in a human readable language.\n\nThere might also be **presentation code** that formats tables and figures and there's article text that explains what's going on around all this code. 
This stream of text and code is a literate statistical program or a literate statistical analysis.\n\n### Weaving and Tangling\n\nLiterate programs by themselves are a bit difficult to work with, but they can be processed in two important ways.\n\nLiterate programs can be **weaved** to produce human readable documents like PDFs or HTML web pages, and they can **tangled** to produce machine-readable \"documents\", or in other words, machine readable code.\n\nThe basic idea behind literate programming in order to generate the different kinds of output you might need, **you only need a single source document**---you can weave and tangle to get the rest.\n\nIn order to use a system like this you need a documentational language, that's human readable, and you need a programming language that's machine readable (or can be compiled/interpreted into something that's machine readable).\n\n### Sweave\n\nOne of the original literate programming systems in R that was designed to do this was called Sweave. Sweave enables users to combine R code with a documentation program called $LaTeX$.\n\n**Sweave files ends a `.Rnw`** and have R code weaved through the document:\n\n``` \n<>=\ndata(airquality)\nplot(airquality$Ozone ~ airquality$Wind)\n@\n```\n\nOnce you have created your `.Rnw` file, Sweave will process the file, executing the R chunks and replacing them with output as appropriate before creating the PDF document.\n\nIt was originally developed by Fritz Leisch, who is a core member of R, and the code base is still maintained by R Core. 
The Sweave system comes with any installation of R.\n\nThere are many limitations to the original Sweave system.\n\n- One of the limitations is that it is **focused primarily on** $LaTeX$, which is not a documentation language that many people are familiar with.\n- Therefore, it **can be difficult to learn this type of markup language** if you're not already in a field that uses it regularly.\n- Sweave also **lacks a lot of features that people find useful** like caching, and multiple plots per page and mixing programming languages.\n\nInstead, folks have **moved towards using something called knitr**, which offers everything Sweave does, plus it extends it further.\n\n- With Sweave, additional tools are required for advanced operations, whereas knitr supports more internally. We'll discuss knitr below.\n\n### rmarkdown\n\nAnother choice for literate programming is to build documents based on [Markdown](https://en.wikipedia.org/wiki/Markdown) language. A markdown file is a plain text file that is typically given the extension `.md.`. The [`rmarkdown`](https://CRAN.R-project.org/package=rmarkdown) R package takes a R Markdown file (`.Rmd`) and weaves together R code chunks like this:\n\n```` \n```{r plot1, height=4, width=5, eval=FALSE, echo=TRUE}\ndata(airquality)\nplot(airquality$Ozone ~ airquality$Wind)\n```\n````\n\n::: callout-tip\nThe best resource for learning about R Markdown this by Yihui Xie, J. J. 
Allaire, and Garrett Grolemund:\n\n- \n\nThe R Markdown Cookbook by Yihui Xie, Christophe Dervieux, and Emily Riederer is really good too:\n\n- \n\nThe authors of the 2nd book describe the motivation for the 2nd book as:\n\n> \"However, we have received comments from our readers and publisher that it would be beneficial to provide more practical and relatively short examples to show the interesting and useful usage of R Markdown, because it can be daunting to find out how to achieve a certain task from the aforementioned reference book (put another way, that book is too dry to read). As a result, this cookbook was born.\"\n:::\n\nBecause this is lecture is built in a `.qmd` file (which is very similar to a `.Rmd` file), let's demonstrate how this work. I am going to change `eval=FALSE` to `eval=TRUE`.\n\n\n::: {.cell height='4' width='5'}\n\n```{.r .cell-code}\ndata(airquality)\nplot(airquality$Ozone ~ airquality$Wind)\n```\n\n::: {.cell-output-display}\n![](index_files/figure-html/plot2-1.png){width=672}\n:::\n:::\n\n\n::: callout-tip\n### Questions\n\n1. Why do we not see the back ticks \\`\\`\\` anymore in the code chunk above that made the plot?\n2. What do you think we should do if we want to have the code executed, but we want to hide the code that made it?\n:::\n\nBefore we leave this section, I find that there is quite a bit of terminology to understand the magic behind `rmarkdown` that can be confusing, so let's break it down:\n\n- [Pandoc](https://pandoc.org). Pandoc is a command line tool with no GUI that converts documents (e.g. from number of different markup formats to many other formats, such as .doc, .pdf etc). It is completely independent from R (but does come bundled with RStudio).\n- [Markdown](https://en.wikipedia.org/wiki/Markdown) (**markup language**). Markdown is a lightweight [markup language](https://en.wikipedia.org/wiki/Markup_language) with plain text formatting syntax designed so that it can be converted to HTML and many other formats. 
A markdown file is a plain text file that is typically given the extension `.md.` It is completely independent from R.\n- [`markdown`](https://CRAN.R-project.org/package=markdown) (**R package**). `markdown` is an R package which converts `.md` files into HTML. It is no longer recommended for use has been surpassed by [`rmarkdown`](https://CRAN.R-project.org/package=rmarkdown) (discussed below).\n- R Markdown (**markup language**). R Markdown is an extension of the markdown syntax. R Markdown files are plain text files that typically have the file extension `.Rmd`.\n- [`rmarkdown`](https://CRAN.R-project.org/package=rmarkdown) (**R package**). The R package `rmarkdown` is a library that uses pandoc to process and convert `.Rmd` files into a number of different formats. This core function is `rmarkdown::render()`. **Note**: this package only deals with the markdown language. If the input file is e.g. `.Rhtml` or `.Rnw`, then you need to use `knitr` prior to calling pandoc (see below).\n- [quarto](https://quarto.org/): Similar to `rmarkdown` but can be used outside R. It is not an R package! 
Input files have the `.qmd` file extension.\n\n::: callout-tip\nCheck out the R Markdown Quick Tour for more:\n\n- \n:::\n\n![Artwork by Allison Horst on RMarkdown](https://github.com/allisonhorst/stats-illustrations/raw/main/rstats-artwork/rmarkdown_rockstar.png){width=\"80%\"}\n\n### knitr\n\nOne of the alternative that has come up in recent times is something called `knitr`.\n\n- The `knitr` package for R takes a lot of these ideas of literate programming and updates and improves upon them.\n- `knitr` still uses R as its programming language, but it allows you to mix other programming languages in.\n- You can also use a variety of documentation languages now, such as LaTeX, markdown and HTML.\n- `knitr` was developed by Yihui Xie while he was a graduate student at Iowa State and it has become a very popular package for writing literate statistical programs.\n\nKnitr takes a plain text document with embedded code, executes the code and 'knits' the results back into the document.\n\nFor for example, it converts\n\n- An R Markdown (`.Rmd)` file into a standard markdown file (`.md`)\n- An `.Rnw` (Sweave) file into to `.tex` format.\n- An `.Rhtml` file into to `.html`.\n\nThe core function is `knitr::knit()` and by default this will look at the input document and try and guess what type it is e.g. `Rnw`, `Rmd` etc.\n\nThis core function performs three roles:\n\n- A **source parser**, which looks at the input document and detects which parts are code that the user wants to be evaluated.\n- A **code evaluator**, which evaluates this code\n- An **output renderer**, which writes the results of evaluation back to the document in a format which is interpretable by the raw output type. 
For instance, if the input file is an `.Rmd`, the output render marks up the output of code evaluation in `.md` format.\n\n\n::: {.cell layout-align=\"center\" preview='true'}\n::: {.cell-output-display}\n![Converting a Rmd file to many outputs using knitr and pandoc](https://d33wubrfki0l68.cloudfront.net/61d189fd9cdf955058415d3e1b28dd60e1bd7c9b/9791d/images/rmarkdownflow.png){fig-align='center' width=60%}\n:::\n:::\n\n\n\\[[Source](https://rmarkdown.rstudio.com/authoring_quick_tour.html)\\]\n\nAs seen in the figure above, from there pandoc is used to convert e.g. a `.md` file into many other types of file formats into a `.html`, etc.\n\nSo in summary:\n\n> \"R Markdown stands on the shoulders of knitr and Pandoc. The former executes the computer code embedded in Markdown, and converts R Markdown to Markdown. The latter renders Markdown to the output format you want (such as PDF, HTML, Word, and so on).\"\n\n\\[[Source](https://bookdown.org/yihui/rmarkdown/)\\]\n\n# Create and Knit Your First R Markdown Document\n\n\n\nWhen creating your first R Markdown document, in RStudio you can\n\n1. Go to File \\> New File \\> R Markdown...\n\n2. Feel free to edit the Title\n\n3. Make sure to select \"Default Output Format\" to be HTML\n\n4. Click \"OK\". RStudio creates the R Markdown document and places some boilerplate text in there just so you can see how things are setup.\n\n5. Click the \"Knit\" button (or go to File \\> Knit Document) to make sure you can create the HTML output\n\nIf you successfully knit your first R Markdown document, then congratulations!\n\n\n::: {.cell}\n::: {.cell-output-display}\n![Mission accomplished!](https://media.giphy.com/media/L4ZZNbDpOCfiX8uYSd/giphy.gif){width=60%}\n:::\n:::\n\n\n# Websites and Books in R Markdown\n\nNow that you are on the road to using R Markdown documents, it is important to know about other wonderful things you do with these documents. 
For example, let's say you have multiple `.Rmd` documents that you want to put together into a website, blog, book, etc.\n\nThere are primarily two ways to build multiple `.Rmd` documents together:\n\n1. [**blogdown**](https://bookdown.org/yihui/blogdown/) for building websites\n2. [**bookdown**](https://bookdown.org/yihui/bookdown/) for authoring books\n\nIn this section, we briefly introduce both packages, but it's worth mentioning that the [**rmarkdown** package also has a built-in site generator](https://bookdown.org/yihui/rmarkdown/rmarkdown-site.html) to build websites.\n\n### blogdown\n\n\n::: {.cell}\n::: {.cell-output-display}\n![blogdown logo](https://bookdown.org/yihui/blogdown/images/logo.png){width=30%}\n:::\n:::\n\n\n\\[[Source](https://bookdown.org/yihui/bookdown/images/logo.png)\\]\n\nThe `blogdown` R package is built on top of R Markdown, supports multi-page HTML output to write a blog post or a general page in an Rmd document, or a plain Markdown document.\n\n- These source documents (e.g. `.Rmd` or `.md`) are built into a static website (i.e. a bunch of static HTML files, images and CSS files).\n- Using this folder of files, it is very easy to publish it to any web server as a website.\n- Also, it is easy to maintain because it is only a single folder.\n\n::: callout-tip\nFor example, my personal website was built in blogdown:\n\n- \n\nOther really great examples can be found here:\n\n- \n:::\n\nOther advantages include the content likely being reproducible, easier to maintain, and easy to convert pages to e.g. PDF or other formats in the future if you do not want to convert to HTML files.\n\nBecause it is based on the Markdown syntax, it is easy to write technical documents, including math equations, insert figures or tables with captions, cross-reference with figure or table numbers, add citations, and present theorems or proofs.\n\nHere's a video you can watch of someone making a blogdown website.\n\n

\n\n\n\n

\n\n\\[[Source](https://www.youtube.com/watch?v=AADnslLpzJ4) on YouTube\\]\n\n### bookdown\n\n\n::: {.cell}\n::: {.cell-output-display}\n![book logo](https://bookdown.org/yihui/bookdown/images/logo.png){width=30%}\n:::\n:::\n\n\n\\[[Source](https://bookdown.org/yihui/bookdown/images/logo.png)\\]\n\nSimilar to `blogdown`, the `bookdown` R package is built on top of R Markdown, but also offers features like multi-page HTML output, numbering and cross-referencing figures/tables/sections/equations, inserting parts/appendices, and imported the GitBook style () to create elegant and appealing HTML book pages. Share\n\n::: callout-tip\nFor example, the previous version of this course was built in bookdown:\n\n- \n\nMy team documentation website is also built with `bookdown`:\n\n- \n\nAnother example is the [Tidyverse Skills for Data Science](https://jhudatascience.org/tidyversecourse/) book that the JHU Data Science Lab wrote. The github repo that contains all the `.Rmd` files can be found [here](https://github.com/jhudsl/tidyversecourse).\n\n- \n- \n:::\n\n**Note**: Even though the word \"book\" is in \"bookdown\", this package is not only for books. 
It really can be anything that consists of multiple `.Rmd` documents meant to be read in a linear sequence such as course dissertation/thesis, handouts, study notes, a software manual, a thesis, or even a diary.\n\n- \n\n### distill\n\nThere is another great way to build blogs or websites using the [distill for R Markdown](https://rstudio.github.io/distill/).\n\n- \n\nDistill for R Markdown combines the technical authoring features of the [Distill web framework](https://github.com/distillpub/template) (optimized for scientific and technical communication) with [R Markdown](https://rmarkdown.rstudio.com), enabling a fully reproducible workflow based on literate programming [@knuth1984].\n\nDistill articles include:\n\n- Reader-friendly typography that adapts well to mobile devices.\n- Features essential to technical writing like $LaTeX$ math, citations, and footnotes.\n- Flexible figure layout options (e.g. displaying figures at a larger width than the article text).\n- Attractively rendered tables with optional support for pagination.\n- Support for a wide variety of diagramming tools for illustrating concepts. The ability to incorporate JavaScript and D3-based interactive visualizations.\n- A variety of ways to publish articles, including support for publishing sets of articles as a Distill website or as a Distill blog.\n\nThe course website from 2021 was built in Distill for R Markdown:\n\n- Website: \n- Github: \n\nSome other cool things about distill is the use of footnotes and asides.\n\nFor example [^1]. The number of the footnote will be automatically generated.\n\n[^1]: This will become a hover-able footnote\n\nYou can also optionally include notes in the gutter of the article (immediately to the right of the article text). To do this use the aside tag.\n\n\n\nYou can also include figures in the gutter. 
Just enclose the code chunk which generates the figure in an aside tag\n\n# Tips and tricks in R Markdown in RStudio\n\nHere are shortcuts and tips on efficiently using RStudio to improve how you write code.\n\n### Run code\n\nIf you want to run a code chunk:\n\n``` \ncommand + Enter on Mac\nCtrl + Enter on Windows\n```\n\n### Insert a comment in R and R Markdown\n\nTo insert a comment:\n\n``` \ncommand + Shift + C on Mac\nCtrl + Shift + C on Windows\n```\n\nThis shortcut can be used both for:\n\n- R code when you want to comment your code. It will add a `#` at the beginning of the line\n- for text in R Markdown. It will add `` around the text\n\nNote that if you want to comment more than one line, select all the lines you want to comment then use the shortcut. If you want to uncomment a comment, apply the same shortcut.\n\n### Knit a R Markdown document\n\nYou can knit R Markdown documents by using this shortcut:\n\n``` \ncommand + Shift + K on Mac\nCtrl + Shift + K on Windows\n```\n\n### Code snippets\n\nCode snippets is usually a few characters long and is used as a shortcut to insert a common piece of code. You simply type a few characters then press `Tab` and it will complete your code with a larger code. `Tab` is then used again to navigate through the code where customization is required. For instance, if you type `fun` then press `Tab`, it will auto-complete the code with the required code to create a function:\n\n``` \nname <- function(variables) {\n \n}\n```\n\nPressing `Tab` again will jump through the placeholders for you to edit it. So you can first edit the name of the function, then the variables and finally the code inside the function (try by yourself!).\n\nThere are many code snippets by default in RStudio. 
Here are the code snippets you might want to use:\n\n- `lib` to call `library()`\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(package)\n```\n:::\n\n\n- `mat` to create a matrix\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmatrix(data, nrow = rows, ncol = cols)\n```\n:::\n\n\n- `if`, `el`, and `ei` to create conditional expressions such as `if() {}`, `else {}` and `else if () {}`\n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (condition) {\n ## Case 1\n} else if (condition) {\n ## Case 2\n} else if (condition) {\n ## Case 3\n}\n```\n:::\n\n\n- `fun` to create a function\n\n\n::: {.cell}\n\n```{.r .cell-code}\nname <- function(variables) {\n\n}\n```\n:::\n\n\n- `for` to create for loops\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (variable in vector) {\n\n}\n```\n:::\n\n\n- `ts` to insert a comment with the current date and time (useful if you have very long code and share it with others so they see when it has been edited)\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Tue Jan 21 20:20:14 2020 ------------------------------\n```\n:::\n\n\nYou can see all default code snippets and add yours by clicking on Tools \\> Global Options... \\> Code (left sidebar) \\> Edit Snippets...\n\n### Ordered list in R Markdown\n\nIn R Markdown, when creating an ordered list such as this one:\n\n1. Item 1\n2. Item 2\n3. Item 3\n\nInstead of bothering with the numbers and typing\n\n``` \n1. Item 1\n2. Item 2\n3. Item 3\n```\n\nyou can simply type\n\n``` \n1. Item 1\n1. Item 2\n1. Item 3\n```\n\nfor the exact same result (try it yourself or check the code of this article!). This way you do not need to bother which number is next when creating a new item.\n\nTo go even further, any numeric will actually render the same result as long as the first item is the number you want to start from. For example, you could type:\n\n``` \n1. Item 1\n7. Item 2\n3. Item 3\n```\n\nwhich renders\n\n1. Item 1\n2. Item 2\n3. 
Item 3\n\nHowever, I suggest always using the number you want to start from for all items because if you move one item at the top, the list will start with this new number. For instance, if we move `7. Item 2` from the previous list at the top, the list becomes:\n\n``` \n7. Item 2\n1. Item 1\n3. Item 3\n```\n\nwhich incorrectly renders\n\n7. Item 2\n8. Item 1\n9. Item 3\n\n### New code chunk in R Markdown\n\nWhen editing R Markdown documents, you will need to insert a new R code chunk many times. The following shortcuts will make your life easier:\n\n``` \ncommand + option + I on Mac (or command + alt + I depending on your keyboard)\nCtrl + ALT + I on Windows\n```\n\n### Reformat code\n\nA clear and readable code is always easier and faster to read (and look more professional when sharing it to collaborators). To automatically apply the most common coding guidelines such as white spaces, indents, etc., use:\n\n``` \ncmd + Shift + A on Mac\nCtrl + Shift + A on Windows\n```\n\nSo for example the following code which does not respect the guidelines (and which is not easy to read):\n\n``` \n1+1\n for(i in 1:10){if(!i%%2){next}\nprint(i)\n }\n```\n\nbecomes much more neat and readable:\n\n``` \n1 + 1\nfor (i in 1:10) {\n if (!i %% 2) {\n next\n }\n print(i)\n}\n```\n\nI also like to use the [`styler`: Non-invasive pretty printing of R code](https://github.com/r-lib/styler) package to automatically style my R code. 
I also use to style my code in a way that it is compatible with Bioconductor's code standards.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## Install styler for automatically styling scripts\ninstall.packages(\"styler\")\n\n## Install biocthis\nif (!require(\"BiocManager\", quietly = TRUE)) {\n install.packages(\"BiocManager\")\n}\n\nBiocManager::install(\"biocthis\")\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\n## Example code for styling all files that end with .qmd\ncat(readLines(here::here(\"scripts\", \"auto_style.R\")))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nstyler::style_dir( here::here(), transformers = biocthis::bioc_style(), filetype = \"qmd\" )\n```\n:::\n:::\n\n\nYou might want to edit your `` ~/.Rprofile with `usethis::edit_r_profile()` `` and add this line to it:\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## For the styler addin\n# Affects the output of: styler:::get_addins_style_transformer_name()\n# https://github.com/r-lib/styler/blob/acfb42acc2e558e7b57ef133f1470df78b5093fd/R/addins.R#L183\noptions(\"styler.addins_style_transformer\" = \"biocthis::bioc_style()\")\n```\n:::\n\n\nFor more information I have on my `~/.Rprofile` file check .\n\n### RStudio addins\n\nRStudio addins are extensions which provide a simple mechanism for executing advanced R functions from within RStudio. In simpler words, when executing an addin (by clicking a button in the Addins menu), the corresponding code is executed without you having to write the code. 
RStudio addins have the advantage that they allow you to execute complex and advanced code much more easily than if you would have to write it yourself.\n\n::: callout-tip\n**For more information about RStudio addins, check out**:\n\n- \n- \n:::\n\n### Others\n\nSimilar to many other programs, you can also use:\n\n- `command + Shift + N` on Mac and `Ctrl + Shift + N` on Windows to open a new R Script\n- `command + S` on Mac and `Ctrl + S` on Windows to save your current script or R Markdown document\n\nCheck out Tools --\\> Keyboard Shortcuts Help to see a long list of these shortcuts.\n\n# Post-lecture materials\n\n### Final Questions\n\nHere are some post-lecture questions to help you think about the material discussed.\n\n::: questions\n### Questions\n\n1. What is literate programming?\n\n2. What was the first literate statistical programming tool to weave together a statistical language (R) with a markup language (LaTeX)?\n\n3. What is `knitr` and how is different than other literate statistical programming tools?\n\n4. 
Where can you find a list of other commands that help make your code writing more efficient in RStudio?\n:::\n\n### Additional Resources\n\n::: callout-tip\n- [RMarkdown Tips and Tricks](https://indrajeetpatil.github.io/RmarkdownTips/) by Indrajeet Patil\n- \n- \n:::\n\n# R session information\n\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(width = 120)\nsessioninfo::session_info()\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────\n setting value\n version R version 4.3.1 (2023-06-16)\n os macOS Ventura 13.5\n system aarch64, darwin20\n ui X11\n language (EN)\n collate en_US.UTF-8\n ctype en_US.UTF-8\n tz America/New_York\n date 2023-09-12\n pandoc 3.1.5 @ /opt/homebrew/bin/ (via rmarkdown)\n\n─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────\n package * version date (UTC) lib source\n cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.0)\n colorout 1.2-2 2023-05-06 [1] Github (jalvesaq/colorout@79931fd)\n digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.0)\n evaluate 0.21 2023-05-05 [1] CRAN (R 4.3.0)\n fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.0)\n here 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)\n htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.0)\n htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.0)\n jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.3.0)\n knitr 1.43 2023-05-25 [1] CRAN (R 4.3.0)\n rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.0)\n rmarkdown 2.24 2023-08-14 [1] CRAN (R 4.3.1)\n rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.3.0)\n rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0)\n sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)\n xfun 0.40 2023-08-09 [1] CRAN (R 4.3.0)\n yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)\n\n [1] 
/Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library\n\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n```\n:::\n:::\n", + "engine": "knitr", + "markdown": "---\ntitle: \"05 - Literate Statistical Programming\"\nauthor:\n - name: Leonardo Collado Torres\n url: http://lcolladotor.github.io/\n affiliations:\n - id: libd\n name: Lieber Institute for Brain Development\n url: https://libd.org/\n - id: jhsph\n name: Johns Hopkins Bloomberg School of Public Health Department of Biostatistics\n url: https://publichealth.jhu.edu/departments/biostatistics\ndescription: \"Introduction to literate statistical programming tools including R Markdown\"\nimage: https://github.com/allisonhorst/stats-illustrations/raw/main/rstats-artwork/rmarkdown_rockstar.png\ncategories: [module 1, week 1, R Markdown, programming]\nbibliography: my-refs.bib\n---\n\n\n*This lecture, as the rest of the course, is adapted from the version [Stephanie C. Hicks](https://www.stephaniehicks.com/) designed and maintained in 2021 and 2022. Check the recent changes to this file through the [GitHub history](https://github.com/lcolladotor/jhustatcomputing/commits/main/posts/05-literate-programming/index.qmd).*\n\n\n\n# Pre-lecture materials\n\n### Read ahead\n\n::: callout-note\n### Read ahead\n\n**Before class, you can prepare by reading the following materials:**\n\n1. \n2. 
\n:::\n\n### Acknowledgements\n\nMaterial for this lecture was borrowed and adapted from\n\n- \n- \n\n# Learning objectives\n\n::: callout-note\n# Learning objectives\n\n**At the end of this lesson you will:**\n\n- Be able to define literate programming\n- Recognize differences between available tools for literate programming\n- Know how to work efficiently within RStudio for literate programming\n- Create an R Markdown document\n:::\n\n# Introduction\n\nOne basic idea to make writing reproducible reports easier is what's known as *literate statistical programming* (sometimes called [literate statistical practice](http://www.r-project.org/conferences/DSC-2001/Proceedings/Rossini.pdf)). This comes from the idea of [literate programming](https://en.wikipedia.org/wiki/Literate_programming) in the area of writing computer programs.\n\nThe idea is to **think of a report or a publication as a stream of text and code**.\n\n- The text is readable by people and the code is readable by computers.\n\n- The analysis is described in a series of text and code chunks.\n\n- Each code chunk will do something like load some data or compute some results.\n\n- Each text chunk will relay something in a human readable language.\n\nThere might also be **presentation code** that formats tables and figures and there's article text that explains what's going on around all this code. 
This stream of text and code is a literate statistical program or a literate statistical analysis.\n\n### Weaving and Tangling\n\nLiterate programs by themselves are a bit difficult to work with, but they can be processed in two important ways.\n\nLiterate programs can be **weaved** to produce human readable documents like PDFs or HTML web pages, and they can be **tangled** to produce machine-readable \"documents\", or in other words, machine readable code.\n\nThe basic idea behind literate programming is that, in order to generate the different kinds of output you might need, **you only need a single source document**---you can weave and tangle to get the rest.\n\nIn order to use a system like this you need a documentation language that's human readable, and you need a programming language that's machine readable (or can be compiled/interpreted into something that's machine readable).\n\n### rmarkdown\n\nToday's main choice for literate programming is to build documents based on the [Markdown](https://en.wikipedia.org/wiki/Markdown) language. A markdown file is a plain text file that is typically given the extension `.md`. The [`rmarkdown`](https://CRAN.R-project.org/package=rmarkdown) R package takes an R Markdown file (`.Rmd`) and weaves together R code chunks like this:\n\n```` \n```{r plot1, height=4, width=5, eval=FALSE, echo=TRUE}\ndata(airquality)\nplot(airquality$Ozone ~ airquality$Wind)\n```\n````\n\n::: callout-tip\nThe best resource for learning about R Markdown is this book by Yihui Xie, J. J. 
Allaire, and Garrett Grolemund:\n\n- \n\nThe R Markdown Cookbook by Yihui Xie, Christophe Dervieux, and Emily Riederer is really good too:\n\n- \n\nThe authors of the 2nd book describe their motivation for it as:\n\n> \"However, we have received comments from our readers and publisher that it would be beneficial to provide more practical and relatively short examples to show the interesting and useful usage of R Markdown, because it can be daunting to find out how to achieve a certain task from the aforementioned reference book (put another way, that book is too dry to read). As a result, this cookbook was born.\"\n:::\n\nBecause this lecture is built in a `.qmd` file (which is very similar to an `.Rmd` file), let's demonstrate how this works. I am going to change `eval=FALSE` to `eval=TRUE`.\n\n\n::: {.cell height='4' width='5'}\n\n```{.r .cell-code}\ndata(airquality)\nplot(airquality$Ozone ~ airquality$Wind)\n```\n\n::: {.cell-output-display}\n![](index_files/figure-html/plot2-1.png){width=672}\n:::\n:::\n\n\n::: callout-tip\n### Questions\n\n1. Why do we not see the back ticks \\`\\`\\` anymore in the code chunk above that made the plot?\n2. What do you think we should do if we want to have the code executed, but we want to hide the code that made it?\n:::\n\nBefore we leave this section, I find that there is quite a bit of terminology to understand the magic behind `rmarkdown` that can be confusing, so let's break it down:\n\n- [Pandoc](https://pandoc.org). Pandoc is a command line tool with no GUI that converts documents (e.g. from a number of different markup formats to many other formats, such as .doc, .pdf etc). It is completely independent from R (but does come bundled with RStudio).\n- [Markdown](https://en.wikipedia.org/wiki/Markdown) (**markup language**). Markdown is a lightweight [markup language](https://en.wikipedia.org/wiki/Markup_language) with plain text formatting syntax designed so that it can be converted to HTML and many other formats. 
A markdown file is a plain text file that is typically given the extension `.md`. It is completely independent from R.\n- [`markdown`](https://CRAN.R-project.org/package=markdown) (**R package**). `markdown` is an R package which converts `.md` files into HTML. It is no longer recommended for use and has been surpassed by [`rmarkdown`](https://CRAN.R-project.org/package=rmarkdown) (discussed below).\n- R Markdown (**markup language**). R Markdown is an extension of the markdown syntax. R Markdown files are plain text files that typically have the file extension `.Rmd`.\n- [`rmarkdown`](https://CRAN.R-project.org/package=rmarkdown) (**R package**). The R package `rmarkdown` is a library that uses pandoc to process and convert `.Rmd` files into a number of different formats. Its core function is `rmarkdown::render()`. **Note**: this package only deals with the markdown language. If the input file is e.g. `.Rhtml` or `.Rnw`, then you need to use `knitr` prior to calling pandoc (see below).\n- [quarto](https://quarto.org/): Similar to `rmarkdown` but can be used outside R. It is not an R package! Input files have the `.qmd` file extension.\n\n::: callout-tip\nCheck out the R Markdown Quick Tour for more:\n\n- \n:::\n\n![Artwork by Allison Horst on RMarkdown](https://github.com/allisonhorst/stats-illustrations/raw/main/rstats-artwork/rmarkdown_rockstar.png){width=\"80%\"}\n\n# Create and Knit Your First R Markdown Document\n\n\n\nWhen creating your first R Markdown document, in RStudio you can\n\n1. Go to File \\> New File \\> R Markdown...\n\n2. Feel free to edit the Title\n\n3. Make sure to select \"Default Output Format\" to be HTML\n\n4. Click \"OK\". RStudio creates the R Markdown document and places some boilerplate text in there just so you can see how things are set up.\n\n5. 
Click the \"Knit\" button (or go to File \\> Knit Document) to make sure you can create the HTML output\n\nIf you successfully knit your first R Markdown document, then congratulations!\n\n\n::: {.cell}\n::: {.cell-output-display}\n![Mission accomplished!](https://media.giphy.com/media/L4ZZNbDpOCfiX8uYSd/giphy.gif){width=60%}\n:::\n:::\n\n\n# Websites and Books in R Markdown\n\nNow that you are on the road to using R Markdown documents, it is important to know about other wonderful things you can do with these documents. For example, let's say you have multiple `.Rmd` documents that you want to put together into a website, blog, book, etc.\n\nThere are primarily two ways to build multiple `.Rmd` documents together:\n\n1. [**blogdown**](https://bookdown.org/yihui/blogdown/) for building websites\n2. [**bookdown**](https://bookdown.org/yihui/bookdown/) for authoring books\n\nIn this section, we briefly introduce both packages, but it's worth mentioning that the [**rmarkdown** package also has a built-in site generator](https://bookdown.org/yihui/rmarkdown/rmarkdown-site.html) to build websites.\n\n### blogdown\n\n\n::: {.cell}\n::: {.cell-output-display}\n![blogdown logo](https://bookdown.org/yihui/blogdown/images/logo.png){width=30%}\n:::\n:::\n\n\n\\[[Source](https://bookdown.org/yihui/blogdown/images/logo.png)\\]\n\nThe `blogdown` R package is built on top of R Markdown and supports multi-page HTML output, so you can write a blog post or a general page in an Rmd document or a plain Markdown document.\n\n- These source documents (e.g. `.Rmd` or `.md`) are built into a static website (i.e. 
a bunch of static HTML files, images and CSS files).\n- Using this folder of files, it is very easy to publish it to any web server as a website.\n- Also, it is easy to maintain because it is only a single folder.\n\n::: callout-tip\nFor example, my personal website was built in blogdown:\n\n- with source code at \n\nOther really great examples can be found here:\n\n- \n:::\n\nOther advantages include the content likely being reproducible, easier to maintain, and easy to convert pages to e.g. PDF or other formats in the future if you do not want to convert to HTML files.\n\nBecause it is based on the Markdown syntax, it is easy to write technical documents, including math equations, insert figures or tables with captions, cross-reference with figure or table numbers, add citations, and present theorems or proofs.\n\nHere's a video you can watch of someone making a blogdown website.\n\n

\n\n\n\n

\n\n\\[[Source](https://www.youtube.com/watch?v=AADnslLpzJ4) on YouTube\\]\n\n### bookdown\n\n\n::: {.cell}\n::: {.cell-output-display}\n![book logo](https://bookdown.org/yihui/bookdown/images/logo.png){width=30%}\n:::\n:::\n\n\n\\[[Source](https://bookdown.org/yihui/bookdown/images/logo.png)\\]\n\nSimilar to `blogdown`, the `bookdown` R package is built on top of R Markdown, but also offers features like multi-page HTML output, numbering and cross-referencing figures/tables/sections/equations, inserting parts/appendices, and an imported GitBook style () to create elegant and appealing HTML book pages.\n\n::: callout-tip\nFor example, the previous version of this course was built in bookdown:\n\n- \n\nMy team documentation website is also built with `bookdown`:\n\n- \n\nAnother example is the [Tidyverse Skills for Data Science](https://jhudatascience.org/tidyversecourse/) book that the JHU Data Science Lab wrote. The github repo that contains all the `.Rmd` files can be found [here](https://github.com/jhudsl/tidyversecourse).\n\n- \n- \n:::\n\n**Note**: Even though the word \"book\" is in \"bookdown\", this package is not only for books. It really can be anything that consists of multiple `.Rmd` documents meant to be read in a linear sequence such as course handouts, study notes, a software manual, a thesis, or even a diary.\n\n- \n\n### quarto\n\nquarto is newer and is described in further detail at . It's what was used for making this course website. As an alternative to using `postcards` for [Project 0](../../projects/project-0/), you could use `quarto` instead following the instructions from . As examples check:\n\n- and \n\n- and \n\n
\n\n

\n\nI’m super excited to share with you my personal blog. Here I’ll be posting about the math and statistics behind frequently used bioinformatic tools to analyze omics data, hopefully contributing to increasing our understanding around them.

Go check 👉🏼 https://t.co/W5PsRBCCYu pic.twitter.com/sRgcKFxsZy\n\n

\n\n— Daianna (@daianna_glez) July 21, 2024\n\n
\n\n\n```{=html}\n\n```\n\n- and \n\n# Tips and tricks in R Markdown in RStudio\n\nHere are shortcuts and tips on efficiently using RStudio to improve how you write code.\n\n### Run code\n\nIf you want to run a code chunk:\n\n``` \ncommand + Enter on Mac\nCtrl + Enter on Windows\n```\n\n### Insert a comment in R and R Markdown\n\nTo insert a comment:\n\n``` \ncommand + Shift + C on Mac\nCtrl + Shift + C on Windows\n```\n\nThis shortcut can be used both for:\n\n- R code when you want to comment your code. It will add a `#` at the beginning of the line\n- for text in R Markdown. It will add `<!--` and `-->` around the text\n\nNote that if you want to comment more than one line, select all the lines you want to comment then use the shortcut. If you want to uncomment a comment, apply the same shortcut.\n\n### Knit a R Markdown document\n\nYou can knit R Markdown documents by using this shortcut:\n\n``` \ncommand + Shift + K on Mac\nCtrl + Shift + K on Windows\n```\n\n### Code snippets\n\nA code snippet is usually a few characters long and is used as a shortcut to insert a common piece of code. You simply type a few characters then press `Tab` and it will expand them into a larger piece of code. `Tab` is then used again to navigate through the code where customization is required. For instance, if you type `fun` then press `Tab`, it will auto-complete with the code required to create a function:\n\n``` \nname <- function(variables) {\n \n}\n```\n\nPressing `Tab` again will jump through the placeholders for you to edit them. So you can first edit the name of the function, then the variables and finally the code inside the function (try by yourself!).\n\nThere are many code snippets by default in RStudio. 
Here are the code snippets you might want to use:\n\n- `lib` to call `library()`\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(package)\n```\n:::\n\n\n- `mat` to create a matrix\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmatrix(data, nrow = rows, ncol = cols)\n```\n:::\n\n\n- `if`, `el`, and `ei` to create conditional expressions such as `if() {}`, `else {}` and `else if () {}`\n\n\n::: {.cell}\n\n```{.r .cell-code}\nif (condition) {\n ## Case 1\n} else if (condition) {\n ## Case 2\n} else if (condition) {\n ## Case 3\n}\n```\n:::\n\n\n- `fun` to create a function\n\n\n::: {.cell}\n\n```{.r .cell-code}\nname <- function(variables) {\n\n}\n```\n:::\n\n\n- `for` to create for loops\n\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (variable in vector) {\n\n}\n```\n:::\n\n\n- `ts` to insert a comment with the current date and time (useful if you have very long code and share it with others so they see when it has been edited)\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Tue Jan 21 20:20:14 2020 ------------------------------\n```\n:::\n\n\nYou can see all default code snippets and add yours by clicking on Tools \\> Global Options... \\> Code (left sidebar) \\> Edit Snippets...\n\n### Ordered list in R Markdown\n\nIn R Markdown, when creating an ordered list such as this one:\n\n1. Item 1\n2. Item 2\n3. Item 3\n\nInstead of bothering with the numbers and typing\n\n``` \n1. Item 1\n2. Item 2\n3. Item 3\n```\n\nyou can simply type\n\n``` \n1. Item 1\n1. Item 2\n1. Item 3\n```\n\nfor the exact same result (try it yourself or check the code of this article!). This way you do not need to bother which number is next when creating a new item.\n\nTo go even further, any numeric will actually render the same result as long as the first item is the number you want to start from. For example, you could type:\n\n``` \n1. Item 1\n7. Item 2\n3. Item 3\n```\n\nwhich renders\n\n1. Item 1\n2. Item 2\n3. 
Item 3\n\nHowever, I suggest always using the number you want to start from for all items because if you move an item to the top, the list will start with this new number. For instance, if we move `7. Item 2` from the previous list to the top, the list becomes:\n\n``` \n7. Item 2\n1. Item 1\n3. Item 3\n```\n\nwhich incorrectly renders\n\n7. Item 2\n8. Item 1\n9. Item 3\n\n### New code chunk in R Markdown\n\nWhen editing R Markdown documents, you will need to insert a new R code chunk many times. The following shortcuts will make your life easier:\n\n``` \ncommand + option + I on Mac (or command + alt + I depending on your keyboard)\nCtrl + ALT + I on Windows\n```\n\n### Reformat code\n\nClear and readable code is always easier and faster to read (and looks more professional when sharing it with collaborators). To automatically apply the most common coding guidelines such as white spaces, indents, etc., use:\n\n``` \ncmd + Shift + A on Mac\nCtrl + Shift + A on Windows\n```\n\nSo for example the following code which does not respect the guidelines (and which is not easy to read):\n\n``` \n1+1\n for(i in 1:10){if(!i%%2){next}\nprint(i)\n }\n```\n\nbecomes much neater and more readable:\n\n``` \n1 + 1\nfor (i in 1:10) {\n if (!i %% 2) {\n next\n }\n print(i)\n}\n```\n\nI also like to use the [`styler`: Non-invasive pretty printing of R code](https://github.com/r-lib/styler) package to automatically style my R code. 
I also use to style my code in a way that it is compatible with Bioconductor's code standards.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## Install styler for automatically styling scripts\ninstall.packages(\"styler\")\n\n## Install biocthis\nif (!require(\"BiocManager\", quietly = TRUE)) {\n install.packages(\"BiocManager\")\n}\n\nBiocManager::install(\"biocthis\")\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\n## Example code for styling all files that end with .qmd\ncat(readLines(here::here(\"scripts\", \"auto_style.R\")))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nstyler::style_dir( here::here(), transformers = biocthis::bioc_style(), filetype = \"qmd\" )\n```\n\n\n:::\n:::\n\n\nYou might want to edit your `~/.Rprofile` with `usethis::edit_r_profile()` and add this line to it:\n\n\n::: {.cell}\n\n```{.r .cell-code}\n## For the styler addin\n# Affects the output of: styler:::get_addins_style_transformer_name()\n# https://github.com/r-lib/styler/blob/acfb42acc2e558e7b57ef133f1470df78b5093fd/R/addins.R#L183\noptions(\"styler.addins_style_transformer\" = \"biocthis::bioc_style()\")\n```\n:::\n\n\nFor more information I have on my `~/.Rprofile` file check .\n\n### RStudio addins\n\nRStudio addins are extensions which provide a simple mechanism for executing advanced R functions from within RStudio. In simpler words, when executing an addin (by clicking a button in the Addins menu), the corresponding code is executed without you having to write the code. 
RStudio addins have the advantage that they allow you to execute complex and advanced code much more easily than if you had to write it yourself.\n\n::: callout-tip\n**For more information about RStudio addins, check out**:\n\n- \n- \n:::\n\n### Others\n\nSimilar to many other programs, you can also use:\n\n- `command + Shift + N` on Mac and `Ctrl + Shift + N` on Windows to open a new R Script\n- `command + S` on Mac and `Ctrl + S` on Windows to save your current script or R Markdown document\n\nCheck out Tools --\\> Keyboard Shortcuts Help to see a long list of these shortcuts.\n\n# Post-lecture materials\n\n### Final Questions\n\nHere are some post-lecture questions to help you think about the material discussed.\n\n::: questions\n### Questions\n\n1. What is literate programming?\n\n2. Where can you find a list of other commands that help make your code writing more efficient in RStudio?\n\n3. What type of output files can you make with `rmarkdown`?\n\n4. If you make a website with R (`html` files), where would you host them for free?\n\n5. What was the first literate statistical programming tool to weave together a statistical language (R) with a markup language (LaTeX)? (see former class material)\n\n6. What is `knitr` and how is it different from other literate statistical programming tools? 
(see former class material)\n:::\n\n### Additional Resources\n\n::: callout-tip\n- [RMarkdown Tips and Tricks](https://indrajeetpatil.github.io/RmarkdownTips/) by Indrajeet Patil\n- \n- \n:::\n\n# R session information\n\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(width = 120)\nsessioninfo::session_info()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────\n setting value\n version R version 4.4.1 (2024-06-14)\n os macOS Sonoma 14.5\n system aarch64, darwin20\n ui X11\n language (EN)\n collate en_US.UTF-8\n ctype en_US.UTF-8\n tz America/New_York\n date 2024-08-28\n pandoc 3.2 @ /opt/homebrew/bin/ (via rmarkdown)\n\n─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────\n package * version date (UTC) lib source\n cli 3.6.3 2024-06-21 [1] CRAN (R 4.4.0)\n colorout * 1.3-0.2 2024-05-03 [1] Github (jalvesaq/colorout@c6113a2)\n digest 0.6.36 2024-06-23 [1] CRAN (R 4.4.0)\n evaluate 0.24.0 2024-06-10 [1] CRAN (R 4.4.0)\n fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.4.0)\n here 1.0.1 2020-12-13 [1] CRAN (R 4.4.0)\n htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)\n htmlwidgets 1.6.4 2023-12-06 [1] CRAN (R 4.4.0)\n jsonlite 1.8.8 2023-12-04 [1] CRAN (R 4.4.0)\n knitr 1.48 2024-07-07 [1] CRAN (R 4.4.0)\n rlang 1.1.4 2024-06-04 [1] CRAN (R 4.4.0)\n rmarkdown 2.27 2024-05-17 [1] CRAN (R 4.4.0)\n rprojroot 2.0.4 2023-11-05 [1] CRAN (R 4.4.0)\n rstudioapi 0.16.0 2024-03-24 [1] CRAN (R 4.4.0)\n sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.4.0)\n xfun 0.46 2024-07-18 [1] CRAN (R 4.4.0)\n yaml 2.3.10 2024-07-26 [1] CRAN (R 4.4.0)\n\n [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library\n\n──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────\n```\n\n\n:::\n:::\n\n\n# Former class material\n\nThis section contains older 
class material that can be useful if you want to learn more about other valid R alternatives to the material we covered in class.\n\n### Sweave\n\nOne of the original literate programming systems in R that was designed to do this was called Sweave. Sweave enables users to combine R code with a documentation program called $LaTeX$.\n\n**Sweave files end in `.Rnw`** and have R code weaved through the document:\n\n``` \n<>=\ndata(airquality)\nplot(airquality$Ozone ~ airquality$Wind)\n@\n```\n\nOnce you have created your `.Rnw` file, Sweave will process the file, executing the R chunks and replacing them with output as appropriate before creating the PDF document.\n\nIt was originally developed by Fritz Leisch, who is a core member of R, and the code base is still maintained by R Core. The Sweave system comes with any installation of R.\n\nThere are many limitations to the original Sweave system.\n\n- One of the limitations is that it is **focused primarily on** $LaTeX$, which is not a documentation language that many people are familiar with.\n- Therefore, it **can be difficult to learn this type of markup language** if you're not already in a field that uses it regularly.\n- Sweave also **lacks a lot of features that people find useful** like caching, multiple plots per page, and mixing programming languages.\n\nInstead, folks have **moved towards using something called knitr**, which offers everything Sweave does and extends it further.\n\n- With Sweave, additional tools are required for advanced operations, whereas knitr supports more internally. 
We'll discuss knitr below.\n\n### knitr\n\nOne of the alternatives that has come up in recent times (it was [first released in 2012](https://en.wikipedia.org/wiki/Knitr); nowadays most people use `rmarkdown` directly) is something called `knitr`.\n\n- The `knitr` package for R takes a lot of these ideas of literate programming and updates and improves upon them.\n- `knitr` still uses R as its programming language, but it allows you to mix other programming languages in.\n- You can also use a variety of documentation languages now, such as LaTeX, markdown and HTML.\n- `knitr` was developed by Yihui Xie while he was a graduate student at Iowa State and it has become a very popular package for writing literate statistical programs.\n\nKnitr takes a plain text document with embedded code, executes the code and 'knits' the results back into the document.\n\nFor example, it converts\n\n- An R Markdown (`.Rmd`) file into a standard markdown file (`.md`)\n- An `.Rnw` (Sweave) file into `.tex` format.\n- An `.Rhtml` file into `.html`.\n\nThe core function is `knitr::knit()` and by default this will look at the input document and try to guess what type it is e.g. `Rnw`, `Rmd` etc.\n\nThis core function performs three roles:\n\n- A **source parser**, which looks at the input document and detects which parts are code that the user wants to be evaluated.\n- A **code evaluator**, which evaluates this code\n- An **output renderer**, which writes the results of evaluation back to the document in a format which is interpretable by the raw output type. 
For instance, if the input file is an `.Rmd`, the output renderer marks up the output of code evaluation in `.md` format.\n\n\n::: {.cell layout-align=\"center\" preview='true'}\n::: {.cell-output-display}\n![Converting a Rmd file to many outputs using knitr and pandoc](https://d33wubrfki0l68.cloudfront.net/61d189fd9cdf955058415d3e1b28dd60e1bd7c9b/9791d/images/rmarkdownflow.png){fig-align='center' width=60%}\n:::\n:::\n\n\n\\[[Source](https://rmarkdown.rstudio.com/authoring_quick_tour.html)\\]\n\nAs seen in the figure above, from there pandoc is used to convert e.g. a `.md` file into many other file formats, such as `.html`.\n\nSo in summary:\n\n> \"R Markdown stands on the shoulders of knitr and Pandoc. The former executes the computer code embedded in Markdown, and converts R Markdown to Markdown. The latter renders Markdown to the output format you want (such as PDF, HTML, Word, and so on).\"\n\n\\[[Source](https://bookdown.org/yihui/rmarkdown/)\\]\n\n### distill\n\nThere is another great way to build blogs or websites using the [distill for R Markdown](https://rstudio.github.io/distill/).\n\n- \n\nDistill for R Markdown combines the technical authoring features of the [Distill web framework](https://github.com/distillpub/template) (optimized for scientific and technical communication) with [R Markdown](https://rmarkdown.rstudio.com), enabling a fully reproducible workflow based on literate programming [@knuth1984].\n\nDistill articles include:\n\n- Reader-friendly typography that adapts well to mobile devices.\n- Features essential to technical writing like $LaTeX$ math, citations, and footnotes.\n- Flexible figure layout options (e.g. displaying figures at a larger width than the article text).\n- Attractively rendered tables with optional support for pagination.\n- Support for a wide variety of diagramming tools for illustrating concepts. 
The ability to incorporate JavaScript and D3-based interactive visualizations.\n- A variety of ways to publish articles, including support for publishing sets of articles as a Distill website or as a Distill blog.\n\nThe course website from 2021 was built in Distill for R Markdown:\n\n- Website: \n- Github: \n\nSome other cool things about distill is the use of footnotes and asides.\n\nFor example [^1]. The number of the footnote will be automatically generated.\n\n[^1]: This will become a hover-able footnote\n\nYou can also optionally include notes in the gutter of the article (immediately to the right of the article text). To do this use the aside tag.\n\n\n\nYou can also include figures in the gutter. Just enclose the code chunk which generates the figure in an aside tag\n", "supporting": [ "index_files" ], diff --git a/posts/05-literate-programming/index.qmd b/posts/05-literate-programming/index.qmd index 2aa0204..0690aaa 100644 --- a/posts/05-literate-programming/index.qmd +++ b/posts/05-literate-programming/index.qmd @@ -79,36 +79,9 @@ The basic idea behind literate programming in order to generate the different ki In order to use a system like this you need a documentational language, that's human readable, and you need a programming language that's machine readable (or can be compiled/interpreted into something that's machine readable). -### Sweave - -One of the original literate programming systems in R that was designed to do this was called Sweave. Sweave enables users to combine R code with a documentation program called $LaTeX$. - -**Sweave files ends a `.Rnw`** and have R code weaved through the document: - -``` -<>= -data(airquality) -plot(airquality$Ozone ~ airquality$Wind) -@ -``` - -Once you have created your `.Rnw` file, Sweave will process the file, executing the R chunks and replacing them with output as appropriate before creating the PDF document. 
- -It was originally developed by Fritz Leisch, who is a core member of R, and the code base is still maintained by R Core. The Sweave system comes with any installation of R. - -There are many limitations to the original Sweave system. - -- One of the limitations is that it is **focused primarily on** $LaTeX$, which is not a documentation language that many people are familiar with. -- Therefore, it **can be difficult to learn this type of markup language** if you're not already in a field that uses it regularly. -- Sweave also **lacks a lot of features that people find useful** like caching, and multiple plots per page and mixing programming languages. - -Instead, folks have **moved towards using something called knitr**, which offers everything Sweave does, plus it extends it further. - -- With Sweave, additional tools are required for advanced operations, whereas knitr supports more internally. We'll discuss knitr below. - ### rmarkdown -Another choice for literate programming is to build documents based on [Markdown](https://en.wikipedia.org/wiki/Markdown) language. A markdown file is a plain text file that is typically given the extension `.md.`. The [`rmarkdown`](https://CRAN.R-project.org/package=rmarkdown) R package takes a R Markdown file (`.Rmd`) and weaves together R code chunks like this: +Today's main choice for literate programming is to build documents based on [Markdown](https://en.wikipedia.org/wiki/Markdown) language. A markdown file is a plain text file that is typically given the extension `.md.`. 
The [`rmarkdown`](https://CRAN.R-project.org/package=rmarkdown) R package takes a R Markdown file (`.Rmd`) and weaves together R code chunks like this: ```` ```{r plot1, height=4, width=5, eval=FALSE, echo=TRUE}`r ''` @@ -162,45 +135,6 @@ Check out the R Markdown Quick Tour for more: ![Artwork by Allison Horst on RMarkdown](https://github.com/allisonhorst/stats-illustrations/raw/main/rstats-artwork/rmarkdown_rockstar.png){width="80%"} -### knitr - -One of the alternative that has come up in recent times is something called `knitr`. - -- The `knitr` package for R takes a lot of these ideas of literate programming and updates and improves upon them. -- `knitr` still uses R as its programming language, but it allows you to mix other programming languages in. -- You can also use a variety of documentation languages now, such as LaTeX, markdown and HTML. -- `knitr` was developed by Yihui Xie while he was a graduate student at Iowa State and it has become a very popular package for writing literate statistical programs. - -Knitr takes a plain text document with embedded code, executes the code and 'knits' the results back into the document. - -For for example, it converts - -- An R Markdown (`.Rmd)` file into a standard markdown file (`.md`) -- An `.Rnw` (Sweave) file into to `.tex` format. -- An `.Rhtml` file into to `.html`. - -The core function is `knitr::knit()` and by default this will look at the input document and try and guess what type it is e.g. `Rnw`, `Rmd` etc. - -This core function performs three roles: - -- A **source parser**, which looks at the input document and detects which parts are code that the user wants to be evaluated. -- A **code evaluator**, which evaluates this code -- An **output renderer**, which writes the results of evaluation back to the document in a format which is interpretable by the raw output type. For instance, if the input file is an `.Rmd`, the output render marks up the output of code evaluation in `.md` format. 
- -```{r rmarkdown-wizards, echo = FALSE, fig.cap = "Converting a Rmd file to many outputs using knitr and pandoc", out.width = '60%', fig.align='center', preview=TRUE} -knitr::include_graphics("https://d33wubrfki0l68.cloudfront.net/61d189fd9cdf955058415d3e1b28dd60e1bd7c9b/9791d/images/rmarkdownflow.png") -``` - -\[[Source](https://rmarkdown.rstudio.com/authoring_quick_tour.html)\] - -As seen in the figure above, from there pandoc is used to convert e.g. a `.md` file into many other types of file formats into a `.html`, etc. - -So in summary: - -> "R Markdown stands on the shoulders of knitr and Pandoc. The former executes the computer code embedded in Markdown, and converts R Markdown to Markdown. The latter renders Markdown to the output format you want (such as PDF, HTML, Word, and so on)." - -\[[Source](https://bookdown.org/yihui/rmarkdown/)\] - # Create and Knit Your First R Markdown Document