R: Data description

Data Wrangling and Data Representation in R

Matteo Ploner

Creating the R Markdown output

R Markdown

  • What is R Markdown?
    • It combines the simple style of Markdown and the powerful computational environment of R
      • From the RStudio website: “R Markdown provides an authoring framework for data science. You can use a single R Markdown file to both save and execute code; generate high quality reports that can be shared with an audience.”
    • A short video tutorial here
    • A comprehensive introduction to R Markdown can be found in R Markdown: The Definitive Guide

R Markdown: elements

  • An R Markdown document is made of 3 main components
    1. Markdown textual elements
      • Contain the text that comments on or describes the output of the R code chunks
    2. R code chunks
      • Contain the R code that generates the desired output (table, graph, results…)
    3. YAML header
      • Contains information about the document and its formatting styles
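
As a minimal sketch of how the three components fit together, the base R snippet below writes a small .Rmd file to a temporary folder. The file name, title, and chunk content are illustrative assumptions, not part of the lecture material.

```r
# Three backticks open and close a code chunk; built here with strrep()
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",                                 # YAML header: document properties
  "title: \"A minimal report\"",
  "output: html_document",
  "---",
  "",
  "The chunk below summarises the built-in cars data.",  # Markdown text
  "",
  paste0(fence, "{r}"),                  # start of the R code chunk
  "summary(cars)",
  fence                                  # end of the R code chunk
)

# Write the document to a temporary .Rmd file
rmd_file <- file.path(tempdir(), "minimal_report.Rmd")
writeLines(rmd_lines, rmd_file)
```

The resulting file can be opened in RStudio and knit, or rendered with rmarkdown::render(rmd_file) if the rmarkdown package is installed.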

What is Markdown?

  • Markdown is a lightweight markup language with plain text formatting syntax
Plain text
*italics* and _italics_
**bold** and __bold__
superscript^2^
~~strikethrough~~
[link](https://www.rstudio.com)
inline equation: $A = \pi*r^{2}$

What is Markdown? (ii)

* unordered list
* item 2
   - sub-item 1
   - sub-item 2

1. ordered list
2. item 2
   - sub-item 1
   - sub-item 2

Table Header  | Second Header
------------- | -------------
Table Cell    | Cell 2
Cell 3        | Cell 4

What is a code chunk?

  • You can insert R code into a code chunk
  • The chunk has this generic format
``` {r}
Your code here
```
  • Several features of the code chunk can be controlled (see here for a detailed list)
    • eval
      • =FALSE ⇒ the chunk is not evaluated
    • echo
      • =FALSE ⇒ the source code is not printed in the output, only the result of the code
    • include
      • =FALSE ⇒ neither the code nor its results appear in the output, but the chunk is still evaluated
    • warning and message
      • =FALSE ⇒ warnings and messages are not printed in the output
    • error
      • =FALSE ⇒ knitting stops at the first error; =TRUE ⇒ the error message is printed in the output and knitting continues
  • One can also control figure dimensions (in inches) with fig.width and fig.height
``` {r, eval=TRUE, echo=FALSE, include=TRUE, fig.width=9, fig.height=6}
Your code here
```
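
Rather than repeating the same options in every chunk header, defaults can be set once in a setup chunk at the top of the document. The sketch below assumes the knitr package is installed; the chosen values are only illustrative.

```r
# Usually placed in a first chunk with include=FALSE, so it stays out of the report
library(knitr)

opts_chunk$set(
  echo = FALSE,      # hide the source code, show only results
  warning = FALSE,   # do not print warnings
  message = FALSE,   # do not print messages
  fig.width = 9,     # figure width in inches
  fig.height = 6     # figure height in inches
)

# Inspect the current default of a single option
opts_chunk$get("echo")
```

Options written in an individual chunk header override these defaults for that chunk only.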

What is the YAML header?

  • The YAML header contains the properties of the document
    • Title, author, date
    • The output format
      • HTML, PDF, Word …
    • Reference to external sources
      • .bib for bibliography
      • .css for style elements
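
To see how these properties are stored, the sketch below writes a small header to a temporary file and reads it back with rmarkdown::yaml_front_matter(). It assumes the rmarkdown package is installed; the title, author, date, and the file names refs.bib and style.css are hypothetical.

```r
library(rmarkdown)

# An illustrative document: YAML header between the two "---" lines
header_lines <- c(
  "---",
  "title: \"My report\"",
  "author: \"Jane Doe\"",
  "date: \"2024-01-01\"",
  "output:",
  "  html_document:",
  "    css: style.css",        # external style sheet (hypothetical file)
  "bibliography: refs.bib",    # external bibliography (hypothetical file)
  "---",
  "",
  "Body text of the report."
)

rmd_file <- file.path(tempdir(), "header_demo.Rmd")
writeLines(header_lines, rmd_file)

# Parse the YAML header into a named list
meta <- yaml_front_matter(rmd_file)
meta$title
names(meta$output)
```

Reading the header back as a list is only a way to inspect it; when knitting, the same fields are picked up automatically.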

How to create an R Markdown report in RStudio (i)

How to create an R Markdown report in RStudio (ii)

The R Markdown file (.Rmd)

R Markdown presentation

  • You can easily prepare slides with R Markdown in RStudio
    • The title of each slide is marked with ##, which also acts as the slide separator
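
As a rough sketch, the snippet below writes a two-slide presentation source to a temporary file. The ioslides_presentation output format is an assumption (the slides do not state which format is used), and the titles and content are illustrative.

```r
# Three backticks open and close a code chunk; built here with strrep()
fence <- strrep("`", 3)

slides <- c(
  "---",
  "title: \"My slides\"",
  "output: ioslides_presentation",   # assumed presentation format
  "---",
  "",
  "## First slide",                  # "##" starts a new slide and sets its title
  "",
  "- A bullet point",
  "",
  "## Second slide",
  "",
  paste0(fence, "{r, echo=FALSE}"),  # chunk showing only the plot, not the code
  "plot(cars)",
  fence
)

slides_file <- file.path(tempdir(), "my_slides.Rmd")
writeLines(slides, slides_file)

# Count the slides: each line starting with "## " opens a new one
n_slides <- sum(grepl("^## ", slides))
n_slides
```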

Appendix

Code template

  • Here you can find the R Markdown code to recreate the analysis of the PGG (see first part of the lecture)

  • R Markdown code:

Download DEMO_report.Rmd
  • PGG Data
Download data_PGG.csv

Assignment

  • Create an R Markdown file named “my_first_report.Rmd”
  • The output should have the title “My First Markdown Report” and list your name as author
  • The output should display the density distribution of 100 random draws from a normal distribution with mean=0 and sd=1

  • The output should display a table with the mean and sd of the draws

Mean | SD
---- | ----
0.09 | 0.98

  • The plot and table should be accompanied by text commenting on the results
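
One possible starting point for the R code inside the report's chunks is sketched below, using base R only. The seed value and the plot title are arbitrary assumptions; with a different seed the mean and sd will differ from the example table.

```r
set.seed(123)  # arbitrary seed, so that the draws are reproducible

# 100 random draws from a normal distribution with mean 0 and sd 1
x <- rnorm(100, mean = 0, sd = 1)

# Density distribution of the draws
plot(density(x), main = "Density of 100 normal draws")

# Table with mean and standard deviation, rounded to 2 digits
stats <- data.frame(Mean = round(mean(x), 2), SD = round(sd(x), 2))
stats
```

Inside an R Markdown chunk, passing stats to knitr::kable() (assuming knitr is installed) would print it as a formatted table rather than as console output.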