Summarizes each variable passed to `...`. Each variable is handled differently based on its level of measurement:

- For nominal variables, returns n and proportion for each level.
- For binary variables, returns n and proportion `TRUE`.
- For continuous variables, returns mean and standard deviation by default. Specify alternative summary statistics using `.cont_fx`.
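As a rough illustration of what these three kinds of summaries contain, the following base R sketch computes them by hand. It is not how `summary_report()` is implemented, and the data frame `df` and its columns are made up for the example.

```r
# Hypothetical example data; none of these names come from the package.
df <- data.frame(
  group   = c("a", "b", "a", "c"),
  treated = c(TRUE, FALSE, TRUE, TRUE),
  score   = c(1.5, 2.5, 3.5, 4.5)
)

# Nominal: n and proportion for each level
nom_n <- table(df$group)
nom_prop <- prop.table(nom_n)

# Binary: n and proportion TRUE
bin_n <- sum(df$treated)
bin_prop <- mean(df$treated)

# Continuous: mean and standard deviation (the documented defaults)
cont_mean <- mean(df$score)
cont_sd <- sd(df$score)
```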
By default, `summary_report()` will guess the measurement level for each variable. This can be overridden for all variables using the `.default` argument, or for select variables using the `nom()`, `bin()`, or `cont()` measurement wrappers. See details.
Arguments
- .data
a data frame or data frame extension.
- ...
<tidy-select> one or more variable names and/or tidyselect expressions. Elements may be wrapped in `nom()`, `bin()`, or `cont()` to force summarizing as nominal, binary, or continuous, respectively; see details.
- .default
how to determine measurement level for variables if not specified by a measurement wrapper. `"auto"` will guess measurement level for each variable, while `"nom"`, `"bin"`, and `"cont"` will treat all unwrapped variables as nominal, binary, or continuous, respectively.
- .drop
if `FALSE`, frequencies for nominal variables will include counts for empty groups (i.e., for levels of factors that don't exist in the data).
- .cont_fx
a list containing the two functions with which continuous variables will be summarized.
- .missing_label
label for missing values in nominal variables.
- na.rm
if `TRUE`, `NA` values in each variable will be dropped prior to computation.
- na.rm.nom, na.rm.bin, na.rm.cont
control `NA` handling specifically for nominal, binary, or continuous variables. Overrides `na.rm` for that variable type.
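The distinction that `.drop` controls (whether unobserved factor levels are counted) can be illustrated with base R alone. This is only a sketch of the concept using `table()` and `droplevels()`; it does not call `summary_report()`, and the factor `x` is a made-up example.

```r
# Hypothetical factor with a declared level ("c") that never occurs.
x <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))

# Keeping empty levels: "c" is reported with a count of 0
counts_all <- table(x)

# Dropping empty levels: only observed levels remain
counts_observed <- table(droplevels(x))
```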
Value
A tibble with four columns:
- `Variable`: Variable name.
- `Value`:
  - For nominal variables, a row for each unique value (including unobserved factor levels if `.drop = FALSE`).
  - For binary variables, either `TRUE` or `1` (for logical or numeric variables, respectively).
  - For continuous variables, the names of the summary statistics specified in `.cont_fx`.
- `V1`:
  - For nominal and binary variables, the number of observations with the value in `Value`.
  - For continuous variables, the value of the first summary statistic.
- `V2`:
  - For nominal and binary variables, the proportion of observations with the value in `Value`.
  - For continuous variables, the value of the second summary statistic.
Determining measurement level
The measurement level for each variable is determined as follows:
- Variables wrapped in `nom()`, `bin()`, or `cont()` will be treated as nominal, binary, or continuous, respectively.
- Variables without a measurement wrapper will be treated as the type specified in `.default`.
- If `.default` is `"auto"`, measurement level will be inferred:
  - Logical vectors will be treated as binary if there are no missing values or if `na.rm.bin = TRUE`.
  - Character vectors, factors, dates and datetimes, and logical vectors with missing values will be treated as nominal.
  - All other variables will be treated as continuous.
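The `"auto"` rules above can be restated as a small base R function. This is a sketch of the documented logic only: `guess_level()` is a hypothetical helper, not a function from the package, and the package's actual implementation may differ.

```r
# Hypothetical helper mirroring the documented "auto" rules.
guess_level <- function(x, na.rm.bin = FALSE) {
  if (is.logical(x)) {
    # Logical: binary only if complete, or if NAs will be dropped
    if (!anyNA(x) || na.rm.bin) "bin" else "nom"
  } else if (is.character(x) || is.factor(x) ||
             inherits(x, c("Date", "POSIXt"))) {
    # Character, factor, date, and datetime vectors are nominal
    "nom"
  } else {
    # Everything else is continuous
    "cont"
  }
}

guess_level(c(TRUE, FALSE))
guess_level(c(TRUE, NA))
guess_level(c(TRUE, NA), na.rm.bin = TRUE)
guess_level(c(0, 1, 1))
```

Under the rules as stated, a numeric vector of 0s and 1s falls through to the continuous case, so it would presumably need to be wrapped in `bin()` (or summarized with `.default = "bin"`) to be treated as binary.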
Support for binary variables
To be treated as binary, both of these must be true:
- The variable must be either a logical vector, or a binary numeric vector containing only 0s and 1s.
- The variable must not include any missing values, or `na.rm.bin` must be set to `TRUE`.
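The two conditions above can be checked with a few lines of base R. `can_be_binary()` below is a hypothetical helper written for illustration, not part of the package; it simply restates the documented requirements.

```r
# Hypothetical check restating the two documented conditions.
can_be_binary <- function(x, na.rm.bin = FALSE) {
  # Condition 1: logical, or numeric containing only 0s and 1s (NAs aside)
  right_type <- is.logical(x) ||
    (is.numeric(x) && all(x %in% c(0, 1, NA)))
  # Condition 2: no missing values, unless they will be dropped
  na_ok <- !anyNA(x) || na.rm.bin
  right_type && na_ok
}

can_be_binary(c(0, 1, 1))
can_be_binary(c(0, 1, 2))
can_be_binary(c(TRUE, NA))
can_be_binary(c(TRUE, NA), na.rm.bin = TRUE)
```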
Future extensions may allow handling of other dichotomous variables (e.g., `"Pregnant"` vs. `"Not pregnant"`), but this is not currently supported. Instead, consider converting these to a logical indicator, e.g., `Pregnant = PregnancyStatus == "Pregnant"`.
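A minimal base R sketch of that conversion follows. The data frame is made up for the example; only the column names `PregnancyStatus` and `Pregnant` come from the text above.

```r
# Hypothetical data using the column name from the example in the text.
df <- data.frame(
  PregnancyStatus = c("Pregnant", "Not pregnant", "Pregnant")
)

# Logical indicator: TRUE where the status is "Pregnant"
df$Pregnant <- df$PregnancyStatus == "Pregnant"
```

The new `Pregnant` column is a logical vector, so it meets the first condition for binary treatment. Any `NA` in `PregnancyStatus` carries over as `NA` in `Pregnant`, in which case the second condition (no missing values, or `na.rm.bin = TRUE`) still applies.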