Common R Packages
|
||
Package | Main use | Sample functions |
data.table | high-speed data wrangling |
fread() – reads data from a flat file such as .csv or .tsv.
dcast() and melt() – reshape between long and wide formats.
merge() – joins two data tables.
|
dplyr | data manipulation |
mutate() – adds new variables using functions on existing variables.
select() – picks variables (columns) based on their names.
filter() – picks rows based on their values.
summarize() – reduces multiple values down to a single summary.
arrange() – changes the ordering of rows.
|
forcats | handling categorical variables |
fct_reorder() – reorders a factor by another variable.
fct_relevel() – changes the order of factor levels manually.
|
lubridate | tools to work with date-time data |
now() – returns the current time in a given time zone (the system time zone by default).
|
magrittr | useful operators for simpler, more readable code |
%>% – pipes the left-hand side value forward into the expression on the right-hand side of the operator.
%<>% – pipes a data frame into an expression and assigns the result back in place.
|
purrr | tools for working with vectors and functions |
map() – allows you to replace many ‘for’ loops with code
that is both more succinct and easier to read.
|
readr | read data files (csv, tsv, etc.) in tidy format |
read_csv() – reads .csv files and loads them as tibbles.
|
readxl | read Excel files in tidy format |
read_excel() – reads a .xls or .xlsx file.
excel_sheets() – returns a vector of sheet names.
|
stringr | working with strings |
str_replace() – replaces matching text in a string with new text.
str_extract() – extracts matching text from a string.
str_split() – splits strings into multiple strings.
|
tibble | data classification and handling |
tibble() – constructs a data frame with special behaviors,
such as enhanced printing.
|
tidyr | data cleaning (creating tidy data) |
pivot_longer() and pivot_wider() – convert between long and wide formats.
drop_na() – removes rows with missing values.
|
|
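As a rough illustration of how several of the functions in this table fit together, the sketch below reshapes and summarizes a small made-up data set. The `yields` tibble and its column names are hypothetical, and `group_by()` (a dplyr function not listed in the table) is used alongside `summarize()`.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical data: one row per plot, one yield column per year.
yields <- tibble::tibble(
  plot       = c("A-1", "A-2", "B-1", "B-2"),
  yield_2022 = c(4.1, NA, 5.0, 4.6),
  yield_2023 = c(4.4, 3.9, 5.2, NA)
)

# Wide to long, drop missing measurements, then derive new columns.
long <- yields %>%
  pivot_longer(starts_with("yield_"), names_to = "year", values_to = "yield") %>%
  drop_na(yield) %>%
  mutate(
    year  = as.integer(str_replace(year, "yield_", "")),  # "yield_2022" -> 2022
    block = str_extract(plot, "^[A-Z]")                   # "A-1" -> "A"
  )

# Mean 2023 yield per block, highest first.
block_means <- long %>%
  filter(year == 2023) %>%
  select(block, yield) %>%
  group_by(block) %>%
  summarize(mean_yield = mean(yield)) %>%
  arrange(desc(mean_yield))

print(block_means)
```

Each step passes its result to the next through `%>%`, so the pipeline reads top to bottom in the order the operations are applied.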
||
Package | Main use | Sample functions |
ggplot2 | drawing figures |
ggplot() – initializes a plot in ggplot2, a system for declaratively creating graphics based on “The Grammar of Graphics.”
|
gridExtra | working with graphical objects (grobs) on a grid |
arrangeGrob() – arranges multiple grobs on a page.
|
kableExtra | builds on the knitr package to construct complex and customizable tables |
kable() – creates tables in LaTeX, HTML, Markdown, and reStructuredText.
|
xtable | formatting tables into LaTeX and HTML |
xtable() – converts an R object into an xtable object that can be printed as LaTeX or HTML.
|
|
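A minimal sketch of combining ggplot2 with gridExtra might look like the following. It uses R's built-in `mtcars` data set purely for illustration; the variable names `p_scatter`, `p_box`, and `combined` are arbitrary.

```r
library(ggplot2)
library(gridExtra)

# Two plots built from the built-in mtcars data set (chosen only for illustration).
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lb)", y = "Miles per gallon")

p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  labs(x = "Cylinders", y = "Miles per gallon")

# arrangeGrob() builds the side-by-side layout without drawing it.
combined <- arrangeGrob(p_scatter, p_box, ncol = 2)

# Draw the layout on the current graphics device.
grid::grid.draw(combined)
```

Because `arrangeGrob()` only returns the layout object, it can be stored, passed to `ggsave()`, or drawn later, which is useful when figures are assembled in one place and rendered in another.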
||
Package | Main use | Sample functions |
car | expands statistical toolset for regression and analysis of variance models |
Anova() – calculates Type II or Type III sum-of-squares tables.
vif() – calculates variance inflation factors to assess multicollinearity.
|
caret | tools for predictive modeling |
train() – fits predictive models over different tuning parameter values.
|
emmeans, multcomp | tools for multiple comparison testing |
emmeans() – calculates estimated marginal means for specified factors or factor combinations in a linear model and, optionally, comparisons or contrasts among them.
glht() – tests general linear hypotheses and multiple comparisons for parametric models, including generalized linear models, linear mixed effects models, and survival models.
|
Hmisc | useful functions for data analysis and statistics |
Cs() – creates character strings from unquoted names.
describe() – gives a concise statistical description of a vector, matrix, data frame, or formula.
|
lme4, nlme | linear and non-linear mixed effect modeling |
lmer() – fits linear mixed models.
glmer() – fits generalized linear mixed models.
nlme() – fits non-linear mixed models.
|
rstatix | pipe-friendly framework for performing basic statistical tests |
adjust_pvalue() – adds an adjusted p-value column to a data frame.
add_significance() – adds p-value significance symbols to a data frame.
|
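To show how a few of these statistical functions might be used on one fitted model, here is a short sketch with car and emmeans. The model, the choice of the built-in `mtcars` data, and the object names are assumptions made for illustration only.

```r
library(car)
library(emmeans)

# Treat the number of cylinders as a factor (illustrative choice).
cars <- transform(mtcars, cyl = factor(cyl))

# Ordinary linear model with one factor and one covariate.
fit <- lm(mpg ~ cyl + wt, data = cars)

# Type II tests for each term.
anova_tab <- Anova(fit, type = 2)

# Because cyl has more than one degree of freedom, vif() returns
# generalized variance inflation factors as a matrix.
vif_tab <- vif(fit)

# Estimated marginal means for each cylinder level, then pairwise comparisons.
emm <- emmeans(fit, ~ cyl)
emm_pairs <- pairs(emm)

print(anova_tab)
print(vif_tab)
print(emm)
print(emm_pairs)
```

Fitting the model once and passing it to each function keeps the ANOVA table, the collinearity check, and the marginal means consistent with one another.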