This variant of dplyr::count() counts duplicate observations across the
specified columns. For each number of occurrences, it returns the number of
unique values that occur that many times, as well as the total number of
observations they account for.
Value
A data frame with columns:
instances
: The number of times each unique value is duplicated
n_unique
: The number of unique values duplicated `instances` times
n_total
: The total number of observations duplicated `instances` times
Examples
df <- tibble::tibble(
  x = c(1, 1, 2, 3, 3),
  y = c('a', 'a', 'b', 'c', 'c')
)
count_duplicates(df)
#> # A tibble: 1 × 3
#>   instances n_unique n_total
#>       <int>    <int>   <int>
#> 1         5        1       5
count_duplicates(df, x)
#> # A tibble: 2 × 3
#>   instances n_unique n_total
#>       <int>    <int>   <int>
#> 1         1        1       1
#> 2         2        2       4
count_duplicates(df, y)
#> # A tibble: 2 × 3
#>   instances n_unique n_total
#>       <int>    <int>   <int>
#> 1         1        1       1
#> 2         2        2       4
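To see how the three columns relate, the result for x can be reproduced by hand with two calls to dplyr::count(). This is only an illustrative sketch: the name by_hand is hypothetical, and the pipeline is an assumption about equivalent behaviour, not necessarily how count_duplicates() is implemented.

```r
library(dplyr)

df <- tibble::tibble(
  x = c(1, 1, 2, 3, 3),
  y = c('a', 'a', 'b', 'c', 'c')
)

# Hypothetical re-creation of count_duplicates(df, x); an assumption,
# not the package's actual implementation.
by_hand <- df |>
  count(x, name = "instances") |>          # occurrences of each unique x
  count(instances, name = "n_unique") |>   # unique values sharing each count
  mutate(n_total = instances * n_unique)   # observations covered by each row

by_hand
#> # A tibble: 2 × 3
#>   instances n_unique n_total
#>       <int>    <int>   <int>
#> 1         1        1       1
#> 2         2        2       4
```

Here x = 2 occurs once (instances = 1), while x = 1 and x = 3 each occur twice, so two unique values cover four observations in the instances = 2 row.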