Although it is not required to use just one of these, it is best to choose one and use it consistently. However, it is useful to be able to read Base R code, as it may be used in tutorials, is necessary for certain packages, and can simplify code in some circumstances.
- Base R uses $ and [] to refer to parts of an object (see foundations).
- The tidyverse chains verbs with a pipe, as in mydata |> filter(rows) |> select(columns). The |> pipe is now built into R, but the tidyverse originally used %>% (from magrittr).

If you really can't decide, check out this comparison for examples of each.
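As a minimal sketch of the Base R notation and the native pipe, the example below uses only functions that ship with R. The data frame mydata and its columns (id, group, score) are invented for illustration and are not part of the guide. The native pipe needs R 4.1 or later.

```r
# Hypothetical example data, invented for illustration
mydata <- data.frame(
  id    = 1:5,
  group = c("a", "a", "b", "b", "b"),
  score = c(10, 20, 30, 40, 50)
)

# `$` extracts a single column as a vector
scores <- mydata$score

# `[rows, columns]` keeps the rows where the condition is TRUE
# and the columns that are named
high <- mydata[mydata$score > 25, c("id", "score")]

# The native pipe `|>` passes the left-hand side as the first
# argument of the next call, so these two lines are equivalent
n_high_nested <- nrow(subset(mydata, score > 25))
n_high_piped  <- mydata |> subset(score > 25) |> nrow()
```

The piped version reads left to right in the order the steps happen, which is the main reason the tidyverse style relies on it.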
Scheme | Data Class | Template | Advantages |
---|---|---|---|
Base R | data.frame | mydata[rows,columns] | universal, historical |
Data Table | data.table | mydata[rows,columns,by] | fastest for Big Data |
Tidyverse | tibble | mydata |> filter(rows) |> select(columns) | easiest to use |
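
To make the three templates in the table concrete, here is a rough side-by-side sketch that runs the same filter-and-select task in each scheme. It assumes the dplyr and data.table packages are installed. The data frame mydata and its columns are hypothetical, made up for this example.

```r
library(dplyr)
library(data.table)

# Hypothetical example data, invented for illustration
mydata <- data.frame(
  id    = 1:5,
  group = c("a", "a", "b", "b", "b"),
  score = c(10, 20, 30, 40, 50)
)

# Base R: mydata[rows, columns]
base_result <- mydata[mydata$score > 25, c("id", "score")]

# data.table: mydata[rows, columns, by]
# Column names can be used directly inside the brackets,
# and .() is data.table shorthand for list()
mydt <- as.data.table(mydata)
dt_result   <- mydt[score > 25, .(id, score)]
dt_by_group <- mydt[, .(mean_score = mean(score)), by = group]

# Tidyverse: mydata |> filter(rows) |> select(columns)
tidy_result <- as_tibble(mydata) |>
  filter(score > 25) |>
  select(id, score)
```

All three results hold the same three rows. The data.table form is the only one of the three templates with a built-in slot for grouping (by), which the tidyverse handles with a separate group_by() step.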
What is tidy data? See the classic article Tidy Data (pdf) by Hadley Wickham. Note that it uses an older version of tidyr.
- filter(): Keep/Remove Rows/Observations
- select(): Keep/Remove Variables
- mutate(): Create or change variables
- group_by() |> summarize(): Summarize/Collapse across rows
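
The verbs listed above can be sketched in a few lines. This example assumes the dplyr package is installed. The tibble mydata and its columns (id, group, score) are hypothetical and exist only to illustrate each verb.

```r
library(dplyr)

# Hypothetical example data, invented for illustration
mydata <- tibble(
  id    = 1:5,
  group = c("a", "a", "b", "b", "b"),
  score = c(10, 20, 30, 40, 50)
)

# filter(): keep the rows that meet a condition
kept_rows <- mydata |> filter(score > 25)

# select(): keep variables by name, or remove them with a minus sign
kept_cols <- mydata |> select(id, score)
dropped   <- mydata |> select(-id)

# mutate(): create a new variable (or overwrite an existing one)
with_pct <- mydata |> mutate(pct = score / 100)

# group_by() |> summarize(): collapse to one row per group
by_group <- mydata |>
  group_by(group) |>
  summarize(mean_score = mean(score), n = n())
```

Each verb takes a data frame as its first argument and returns a new one, so the steps can be chained with the pipe in whatever order the analysis needs.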
Copyright © George Mason University