Note that as of 2024, some of pivot_wider_alt()
's functionality is now
supported in tidyr::pivot_wider()
. In particular, the functionality of
names_value_first = TRUE
in pivot_wider_alt()
can now be achieved by
using names_vary = "slowest"
in pivot_wider()
. The functionality of
names_value_sep
can be achieved using the names_glue
argument in
pivot_wider()
.
A wrapper around tidyr::pivot_wider()
with additional
options for sorting and naming of output columns, with arguments
sort_by_col
, names_value_first
, and names_value_sep
These options are
only relevant when more than one input column is passed to values_from
.
Usage
pivot_wider_alt(
data,
id_cols = NULL,
names_from = name,
sort_by_col = TRUE,
names_value_first = TRUE,
names_value_sep = ".",
names_sep = "_",
names_prefix = "",
names_glue = NULL,
names_repair = "check_unique",
values_from = value,
values_fill = NULL,
values_fn = NULL
)
Arguments
- data
A data frame to pivot.
- id_cols
<
tidy-select
> A set of columns that uniquely identify each observation. Typically used when you have redundant variables, i.e. variables whose values are perfectly correlated with existing variables.Defaults to all columns in
data
except for the columns specified throughnames_from
andvalues_from
. If a tidyselect expression is supplied, it will be evaluated ondata
after removing the columns specified throughnames_from
andvalues_from
.- sort_by_col
If
TRUE
(default), output columns will be sorted bynames_from
, thenvalues_from
. (Differs fromtidyr::pivot_wider()
, which sorts byvalues_from
first, thennames_from
.)- names_value_first
If
FALSE
, output columns will be named using a{column}_{.value}
scheme. (Differs fromtidyr::pivot_wider()
, which uses a{.value}_{column}
scheme.)- names_value_sep, names_sep
If
names_from
orvalues_from
contain multiple variables, these will be used to join their values together into a single string to use as a column name.names_value_sep
will separate{.value}
from{column}
components, whilenames_sep
will separate{column}
components from one another ifnames_from
contains multiple variables. See Details and Examples.- names_prefix, names_glue, names_repair, values_from, values_fill, values_fn
See documentation at
tidyr::pivot_wider()
.
Details
#' The names_value_sep
argument allows output column names that use a
different separator between {.value}
and {column}
than between multiple
{columns}
s. Example:
Examples
data_ex <- ggplot2::diamonds %>%
dplyr::group_by(cut, color) %>%
dplyr::summarize(Min = min(price), Median = median(price), Max = max(price))
#> `summarise()` has grouped output by 'cut'. You can override using the `.groups`
#> argument.
# default pivot_wider() behavior
data_ex %>%
tidyr::pivot_wider(
id_cols = color,
names_from = cut,
values_from = Min:Max
)
#> # A tibble: 7 × 16
#> color Min_Fair Min_Good `Min_Very Good` Min_Premium Min_Ideal Median_Fair
#> <ord> <int> <int> <int> <int> <int> <dbl>
#> 1 D 536 361 357 367 367 3730
#> 2 E 337 327 352 326 326 2956
#> 3 F 496 357 357 342 408 3035
#> 4 G 369 394 354 382 361 3057
#> 5 H 659 368 337 368 357 3816
#> 6 I 735 351 336 334 348 3246
#> 7 J 416 335 336 363 340 3302
#> # ℹ 9 more variables: Median_Good <dbl>, `Median_Very Good` <dbl>,
#> # Median_Premium <dbl>, Median_Ideal <dbl>, Max_Fair <int>, Max_Good <int>,
#> # `Max_Very Good` <int>, Max_Premium <int>, Max_Ideal <int>
# pivot_wider_alt() behavior
data_ex %>%
pivot_wider_alt(
id_cols = color,
names_from = cut,
values_from = Min:Max
)
#> # A tibble: 7 × 16
#> color Min.Fair Median.Fair Max.Fair Min.Good Median.Good Max.Good
#> <ord> <int> <dbl> <int> <int> <dbl> <int>
#> 1 D 536 3730 16386 361 2728. 18468
#> 2 E 337 2956 15584 327 2420 18236
#> 3 F 496 3035 17995 357 2647 18686
#> 4 G 369 3057 18574 394 3340 18788
#> 5 H 659 3816 18565 368 3468. 18640
#> 6 I 735 3246 18242 351 3640. 18707
#> 7 J 416 3302 18531 335 3733 18325
#> # ℹ 9 more variables: `Min.Very Good` <int>, `Median.Very Good` <dbl>,
#> # `Max.Very Good` <int>, Min.Premium <int>, Median.Premium <dbl>,
#> # Max.Premium <int>, Min.Ideal <int>, Median.Ideal <dbl>, Max.Ideal <int>
# with `names_value_first` = FALSE
data_ex %>%
pivot_wider_alt(
id_cols = color,
names_from = cut,
values_from = Min:Max,
names_value_first = FALSE
)
#> # A tibble: 7 × 16
#> color Fair.Min Fair.Median Fair.Max Good.Min Good.Median Good.Max
#> <ord> <int> <dbl> <int> <int> <dbl> <int>
#> 1 D 536 3730 16386 361 2728. 18468
#> 2 E 337 2956 15584 327 2420 18236
#> 3 F 496 3035 17995 357 2647 18686
#> 4 G 369 3057 18574 394 3340 18788
#> 5 H 659 3816 18565 368 3468. 18640
#> 6 I 735 3246 18242 351 3640. 18707
#> 7 J 416 3302 18531 335 3733 18325
#> # ℹ 9 more variables: `Very Good.Min` <int>, `Very Good.Median` <dbl>,
#> # `Very Good.Max` <int>, Premium.Min <int>, Premium.Median <dbl>,
#> # Premium.Max <int>, Ideal.Min <int>, Ideal.Median <dbl>, Ideal.Max <int>
# multiple `names_from` vars, with different value vs. name separators
ggplot2::mpg %>%
dplyr::filter(class %in% c("compact", "subcompact", "midsize")) %>%
dplyr::group_by(
manufacturer,
trans = stringr::str_extract(trans, ".*(?=\\()"), year
) %>%
dplyr::summarize(across(c(cty, hwy), mean)) %>%
pivot_wider_alt(
names_from = trans:year,
values_from = cty:hwy,
names_sep = "_",
names_value_sep = ": "
)
#> `summarise()` has grouped output by 'manufacturer', 'trans'. You can override
#> using the `.groups` argument.
#> # A tibble: 10 × 9
#> # Groups: manufacturer [10]
#> manufacturer `cty: auto_1999` `hwy: auto_1999` `cty: auto_2008`
#> <chr> <dbl> <dbl> <dbl>
#> 1 audi 16 25.8 18
#> 2 chevrolet 18.5 26.5 19
#> 3 ford 16.5 23 15.5
#> 4 honda 24 32 24.5
#> 5 hyundai 18.3 26 19.2
#> 6 nissan 18.5 26.5 20.3
#> 7 pontiac 17 26.3 17
#> 8 subaru 20 26 20
#> 9 toyota 21 28.2 21.2
#> 10 volkswagen 19.4 28.1 20.2
#> # ℹ 5 more variables: `hwy: auto_2008` <dbl>, `cty: manual_1999` <dbl>,
#> # `hwy: manual_1999` <dbl>, `cty: manual_2008` <dbl>,
#> # `hwy: manual_2008` <dbl>