Wrappers around base::pmin(), base::pmax(), lighthouse::psum(), and lighthouse::pmean() that accept tidyselect expressions.
Usage
psum_across(..., na.rm = FALSE)
pmean_across(..., na.rm = FALSE)
pmin_across(..., na.rm = FALSE)
pmax_across(..., na.rm = FALSE)
Arguments
- ...
  <tidy-select> One or more tidyselect expressions that capture numeric and/or logical columns.
- na.rm
  Should missing values (including NaN) be removed?
Details
Lighthouse includes two sets of functions for computing "parallel" or row-wise aggregates:
- psum() and pmean() (which complement base::pmin() and pmax())
- pmin_across(), pmax_across(), psum_across(), and pmean_across()
Both sets of functions differ from base::rowSums() and rowMeans() in that they:
- work in data-masking contexts (e.g., inside dplyr::mutate()) without needing helpers like dplyr::pick() or dplyr::across().
- accept multiple inputs via the ... argument.
- return NA when na.rm = TRUE and all values in a row are NA. This mirrors the behavior of base::pmin() and pmax(), but differs from rowSums(), which returns 0 in this situation.
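The all-NA case described above can be checked with base R alone. The snippet below is a small sketch that shows only the base functions being compared against (rowSums(), rowMeans(), pmin(), and pmax()); it does not call any lighthouse function, and the all_na matrix is made up for illustration.

```r
# One illustrative row in which every value is missing.
all_na <- matrix(NA_real_, nrow = 1, ncol = 3)

# rowSums() drops the NAs and sums what is left (nothing), giving 0.
rowSums(all_na, na.rm = TRUE)
#> [1] 0

# rowMeans() divides 0 by 0 in the same situation, giving NaN.
rowMeans(all_na, na.rm = TRUE)
#> [1] NaN

# pmin() and pmax() have nothing left to compare and return NA.
pmin(NA_real_, NA_real_, NA_real_, na.rm = TRUE)
#> [1] NA
pmax(NA_real_, NA_real_, NA_real_, na.rm = TRUE)
#> [1] NA
```

Returning NA rather than 0 keeps a row with no data distinguishable from a row whose values genuinely sum to zero.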
psum_across() and friends support tidyselect expressions; e.g.,
dat %>%
  mutate(
    IDScrTotal = psum_across(IDScr1:IDScr6),
    SDScrTotal = psum_across(starts_with("SDScr"))
  )
...but must be used inside a data-masking verb like dplyr::mutate(), group_by(), or filter(), and do not support implicit computations.
Conversely, psum() and friends do not support tidyselect expressions, but can be used both inside and outside a data-masking context:
# data-masking
dat %>%
  mutate(
    NumColors = psum(Red, Blue, Green)
  )

# non-data-masking
psum(1:10, 6:15, 11:20)
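They also support "on the fly" or "implicit" computations, i.e., arguments given as expressions (such as a comparison on a column) rather than as existing columns.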
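As a rough sketch of what an implicit computation might look like, the snippet below passes comparisons computed on the fly to lighthouse::psum(), so no helper columns are created first. The scores data frame and its column names are hypothetical, and the call assumes psum() accepts logical vectors in the same way the *_across() variants are documented to accept logical columns. The call is guarded so the snippet does nothing when lighthouse is not installed.

```r
# Hypothetical survey data; the column names are made up for illustration.
scores <- data.frame(
  q1 = c(5, 2, 4),
  q2 = c(4, 1, 5),
  q3 = c(1, 3, 5)
)

# Guarded so the snippet is a no-op when lighthouse is not installed.
if (requireNamespace("lighthouse", quietly = TRUE)) {
  # Each argument is a logical vector computed in place (an "implicit"
  # computation); the intent is a per-row count of items scoring above 3.
  n_above_3 <- lighthouse::psum(
    scores$q1 > 3,
    scores$q2 > 3,
    scores$q3 > 3
  )
  print(n_above_3)
}
```

The same expressions could presumably be written with bare column names inside dplyr::mutate(), since psum() is described as working in data-masking contexts as well.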
Examples
library(dplyr)       # provides %>%, starts_with(), and all_of()
library(lighthouse)  # provides the *_across() functions

dat <- tibble::tribble(
  ~product,    ~price1, ~price2, ~price3,
  "Product 1", 20,      25,      22,
  "Product 2", NA,      30,      29,
  "Product 3", 15,      NA,      NA,
  "Product 4", NA,      NA,      NA
)

# Column names stored as strings, selected below with all_of()
price_cols <- c("price1", "price2", "price3")

dat %>%
  dplyr::mutate(
    min = pmin_across(price1, price2, price3, na.rm = TRUE),
    max = pmax_across(price1:price3, na.rm = TRUE),
    sum = psum_across(starts_with("price"), na.rm = TRUE),
    mean = pmean_across(all_of(price_cols), na.rm = TRUE)
  )
#> # A tibble: 4 × 8
#>   product   price1 price2 price3   min   max   sum  mean
#>   <chr>      <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Product 1     20     25     22    20    25    67  22.3
#> 2 Product 2     NA     30     29    29    30    59  29.5
#> 3 Product 3     15     NA     NA    15    15    15  15
#> 4 Product 4     NA     NA     NA    NA    NA    NA  NA