lighthouse 0.7.3
New functions
- ffyq() and sfyq_il() return the federal fiscal year and quarter or Illinois state fiscal year and quarter for a given date. Return format can be set using the type parameter, defaulting to numeric YYYY.Q format. These functions wrap lubridate::quarter() and complement the existing lighthouse functions ffy() and sfy_il().
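As a rough illustration of the fiscal year and quarter conventions involved, here is a base R sketch. It assumes a fiscal year named for the calendar year in which it ends, starting October 1 (federal) or July 1 (Illinois). The helper fiscal_yq_sketch() is hypothetical and is not the package's implementation, which wraps lubridate::quarter().

```r
# Hypothetical helper, NOT part of lighthouse: numeric YYYY.Q for a fiscal
# year that begins in `start_month` and is named for the year in which it ends.
fiscal_yq_sketch <- function(date, start_month) {
  lt <- as.POSIXlt(as.Date(date))
  year <- lt$year + 1900L
  month <- lt$mon + 1L
  # Months elapsed since the start of the fiscal year (0 to 11)
  offset <- (month - start_month) %% 12
  # Dates on or after the start month belong to the next named fiscal year
  fy <- year + (start_month > 1 & month >= start_month)
  fq <- offset %/% 3 + 1
  fy + fq / 10
}

ffy_q <- fiscal_yq_sketch("2024-10-01", start_month = 10)  # federal FY2025, Q1
sfy_q <- fiscal_yq_sketch("2024-06-30", start_month = 7)   # Illinois SFY2024, Q4
print(c(ffy_q, sfy_q))
```

The sketch is vectorized over date, so a vector of dates yields a vector of YYYY.Q values.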
New functionality
- summary_table() has new arguments .cols_group_glue and .cols_group_order to control column names and order when .cols_group_by is set. These are replacements for .cols_group_opts, which is deprecated and will be removed in a future release.
Bug fixes
- fiscal_year(), ffy(), and sfy_il() are now vectorized (fixes #23).
- summary_table() now accepts functions that do not have an na.rm or ... argument, which previously caused an error.
-
  - now returns results with default formatting when format is unspecified (fixes #21).
  - now supports the "%OSn" conversion specification. This returns seconds with the specified number of decimal places, up to 6; e.g., "%OS3" would return seconds with 3 decimal places.
-
  - no longer issues a deprecation warning related to using !!! on a single language object (fixes #19).
  - now returns consistent column types in output tibble (fixes #26).
  - setting .missing_label no longer throws errors in some situations (fixes #26).
  - now treats dates and datetimes as nominal by default, and will error on attempts to treat dates or datetimes as continuous or binary.
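For reference, base R's format() also understands "%OSn" for date-times. The sketch below shows what the conversion specification yields there. It uses only base R and a made-up timestamp, and makes no claim about how the lighthouse function handles it internally.

```r
# A made-up timestamp with half a second of fractional time (UTC)
x <- ISOdatetime(2024, 6, 7, 1, 2, 3.5, tz = "UTC")

secs_whole <- format(x, "%S")    # seconds with no fractional part
secs_1 <- format(x, "%OS1")      # seconds with 1 decimal place
secs_3 <- format(x, "%OS3")      # seconds with 3 decimal places
print(c(secs_whole, secs_1, secs_3))
```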
lighthouse 0.7.2
Bug fixes
- open_file(), open_location(), and in_excel() now work on macOS (fixes #17).
Changes to row-wise aggregation functions
- Added psum() and pmean(): These compute “parallel” or row-wise sums or means, analogous to base::pmax() and base::pmin(). psum() deprecates row_sums_spss(). (psum() is a clearer and more consistent name, as its behavior is closer to that of pmin()/pmax() than rowSums().) Note that psum() has na.rm = FALSE by default, whereas row_sums_spss() defaulted to na.rm = TRUE.
- Added psum_across() and pmean_across(): These are implementations of psum() and pmean() that take tidyselect expressions, complementing pmin_across() and pmax_across(). psum_across() replaces row_sums_across(), which was introduced in 0.7.0 but is now removed (closes #16).
- All p*_across() functions now accept tidyselect expressions via ... rather than cols. This makes it easier to include multiple tidyselect expressions, e.g., psum_across(var1:var9, starts_with("An")).
- Updated documentation for psum(), psum_across(), and friends. In particular, see the Details section of psum_across(), which contrasts use cases for psum() vs. psum_across().
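To make the “parallel” idea concrete, here is a minimal base R sketch of an element-wise sum across vectors, in the spirit of pmax(). psum_sketch() is a hypothetical stand-in, not the package's code. In particular, treating NA as zero when na.rm = TRUE is an assumption, and the real psum() may handle all-NA rows differently.

```r
# Hypothetical stand-in for a "parallel" sum; NOT the lighthouse implementation.
psum_sketch <- function(..., na.rm = FALSE) {
  cols <- list(...)
  if (na.rm) {
    # Assumption: a missing value contributes nothing to the row's sum
    cols <- lapply(cols, function(x) {
      x[is.na(x)] <- 0
      x
    })
  }
  # Add the vectors position by position, as pmax() compares them
  Reduce(`+`, cols)
}

a <- c(1, 2, NA)
b <- c(10, 20, 30)
strict <- psum_sketch(a, b)                  # NA propagates, as with pmax(a, b)
lenient <- psum_sketch(a, b, na.rm = TRUE)   # NA dropped from the third position
print(strict)
print(lenient)
```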
Lifecycle changes
- coerce_na_range() is deprecated in favor of na_if_range().
lighthouse 0.7.1
Bug fix
- strftime_no_lead() now removes leading zeroes only from specified components of date-times (fixes #14).
lighthouse 0.7.0
New functions
Summary functions
- summary_report() returns a summary of multiple variables, summarizing each variable based on its level of measurement.
- df_compare() is a utility for identifying differences between data frames. Given two data frames, it returns only rows and columns with differences.
Tools for missing values
- na_if_range() is a renamed, expanded, and bug-fixed version of coerce_na_range(). coerce_na_range() is retained as an alias for backward compatibility.
- drop_na_rows() drops rows where all columns or a specific subset of columns are all NA.
- first_valid(), last_valid(), and nth_valid() return the first, last, or nth non-missing value in a vector.
Tools for character vectors
- str_c_narm() is a variant of stringr::str_c() with alternative handling of NAs.
- str_c_tidy() is a variant of stringr::str_c() that accepts tidyselect expressions.
- str_ends_any() was added to complement str_starts_any() and str_detect_any().
Tools for dates
- ffy() and sfy_il() return the federal fiscal year or Illinois state fiscal year for a given date. They wrap fiscal_year(), which returns the fiscal year based on a specified starting month.
- strftime_no_lead() formats a date without leading zeroes (e.g., “6/7/2024” instead of “06/07/2024”).
- nth_bizday() is a generalization of next_bizday().
Tools for service cascades
- cascade_fill_bwd() and cascade_fill_fwd() impute values into service cascade data based on previous or subsequent cascade steps.
- cascade_summarize() returns a summary table for service cascade data.
- These functions are not yet fully documented.
Statistical functions
- se_mean() and se_prop() compute the standard error of the mean and of a proportion, respectively. se_prop() includes checks for unreliability due to low variance; see its “Details.” se_mean() replaces the ambiguously named se(), which is now deprecated.
- ci_sig() tests if a confidence interval indicates statistical significance.
- OR_to_p1() and OR_to_p2() convert odds ratios to probabilities. They complement p_to_OR().
- dunn_test() performs Dunn’s test, a pairwise post-hoc test for following up a Kruskal-Wallis test.
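The textbook formulas behind the two standard errors can be sketched in base R as follows. The function names and signatures here are hypothetical. The package's se_mean() and se_prop() may take different arguments, and the reliability checks mentioned for se_prop() are not shown.

```r
# Hypothetical sketches of the textbook formulas; NOT the lighthouse functions.

# Standard error of the mean: sample SD divided by sqrt(n), ignoring NAs
se_mean_sketch <- function(x) {
  n <- sum(!is.na(x))
  sd(x, na.rm = TRUE) / sqrt(n)
}

# Standard error of a proportion p observed over n cases
se_prop_sketch <- function(p, n) {
  sqrt(p * (1 - p) / n)
}

x <- c(2, 4, 4, NA, 5, 7)
se_x <- se_mean_sketch(x)
se_p <- se_prop_sketch(p = 0.5, n = 100)
print(c(se_x, se_p))
```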
Math functions
- row_sums_across() is a variant of base::rowSums() that accepts tidyselect expressions and has alternative NA handling.
- sum_if_any(), min_if_any(), and max_if_any() are variants of sum(), min(), and max() that remove NAs unless all values are NA. min_if_any() and max_if_any() were renamed from safe_min() and safe_max().
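The difference from sum(x, na.rm = TRUE) shows up when every value is missing. Below is a minimal sketch of the idea, with a hypothetical function name and assuming that the all-NA case returns NA. It is not the package's implementation.

```r
# Hypothetical sketch; NOT the lighthouse implementation.
sum_if_any_sketch <- function(x) {
  # Assumption: with nothing observed, report NA rather than a sum of zero
  if (all(is.na(x))) {
    return(NA_real_)
  }
  sum(x, na.rm = TRUE)
}

base_all_na <- sum(c(NA, NA), na.rm = TRUE)         # base R sums nothing
sketch_all_na <- sum_if_any_sketch(c(NA, NA))       # sketch keeps the NA
sketch_partial <- sum_if_any_sketch(c(1, NA, 3))    # NAs removed when some values exist
print(c(base_all_na, sketch_all_na, sketch_partial))
```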
Variable transformation
- fct_collapse_alt() is a variant of forcats::fct_collapse() with options to handle non-existent values and level ordering.
- fct_na_if() is a variant of dplyr::na_if() that also removes the specified value from a factor’s levels.
- swap() swaps values between two columns. It is an unconditional variant of swap_if().
Data restructuring
- add_rows_at_value() is similar to add_blank_rows(), but allows specifying position by column values rather than row numbers. Note there have been some changes in the function interface from the pre-release version; see the “Details” section of the documentation.
- pad_vectors() pads a list of vectors with NAs to a common length.
Exporting results
- add_plot_slide() is a helper for exporting plots to PowerPoint with easier control of size and positioning.
- write_xlsx_styled() writes to .xlsx with basic column formatting.
Data visualization
- add_crossings() is a helper for creating area charts with different fills for positive vs. negative values.
- after_opacity() and before_opacity() are utilities for color blending.
Other
- open_file() (alias file.open()) opens a file with its default application. open_folder() (alias dir.open()) opens a folder in the system file manager.
- Given two vectors, set_compare() returns labelled subsets of unique and shared elements.
- suppress_warnings_if() and suppress_messages_if() conditionally suppress warnings or messages based on their text.
- eq_shape() checks if two objects have the same number of dimensions and same length along each dimension.
New datasets
- gain_missing_codes is a quick reference for missing value labels used in GAIN datasets.
- state.terr.name and state.terr.abb are versions of state.name and state.abb that include US territories and the District of Columbia.
- state.terr.data is a data frame including names, abbreviations, and FIPS codes for US states, territories, and the District of Columbia.
Added functionality
- count_pct() and count_multiple() now support the .by argument for per-operation grouping. Integration of .by into other count_*() functions is planned for a future update.
- In summary_table(), the column of variable names can be dropped when only one variable is included by setting .var_col_name = NULL (#9).
- count_duplicates() now returns both the unique and total number of duplicated values (e.g., c(2, 2, 4, 4) has two unique and four total values).
- Added a missing argument to swap_if() with options for cases where the condition is missing.
- Added a warn_factor argument to try_numeric().
Bug fixes
- The .cols_group_by argument in summary_table() now produces separate columns by group (fixes #6).
- count_with_total() now produces totals for non-character columns (fixes #10).
- days_diff() now handles inputs of different types (e.g., a date and a datetime) with a warning (previously threw an error).
- Added General Election Day to holidays_il and arranged by date (fixes #1).
- Removed Inauguration Day from holidays_us.
Lifecycle changes
- rbool() has been un-deprecated. It was previously deprecated in favor of purrr::rbernoulli(), but purrr::rbernoulli() has since been deprecated itself.
- pivot_wider_alt() is defunct. Changes to tidyr::pivot_wider() made its most important functionality unnecessary. Further changes to tidyr broke it, and it was judged not worth the effort of fixing.
- na_like() and median_dbl() are deprecated. They are no longer needed given more flexible handling of mixed classes by dplyr::if_else() and dplyr::case_when() as of dplyr v1.1.0 (https://dplyr.tidyverse.org/news/index.html#vctrs-1-1-0). (Plus, na_like() was quite buggy and unreliable; resolves #2.)
- row_sums_spss() is deprecated in favor of row_sums_across().
- safe_min() and safe_max() are renamed to min_if_any() and max_if_any(); the old names are deprecated.
Other changes
- In asterisks(), changed the default for include_key from TRUE to FALSE.
lighthouse 0.6.0
New functions
- Grouping and summary functions:
- Statistical and math functions:
- Data restructuring:
- For working with missing values:
- For working with strings:
- For working with dates:
- Logical tests:
- Other:
Other changes
- Added datasets for federal (holidays_us), Illinois (holidays_il), and Chestnut Health Systems (holidays_chestnut) holidays (primarily for use with the next_bizday() function).
- Added strict argument to is_TRUE(), is_FALSE(), is_TRUE_or_NA(), and is_FALSE_or_NA()
- Improvements to set_ggplot_opts(), ggview(), and is_coercible_numeric()
- Bugfixes for in_excel(), count_na(), summary_table(), pivot_wider_alt(), print_all(), asterisks(), and coerce_na_range()
- Removed check for lighthouse updates on load
lighthouse 0.5.0
- Check if lighthouse update is available on load
- New infix operators: %all_in%, %any_in%
- Exported na_like()
lighthouse 0.4.1
- Bugfix for print_all()
lighthouse 0.4.0
- New logical tests: is_TRUE(), is_FALSE(), is_TRUE_or_NA(), is_FALSE_or_NA(), is_coercible_numeric()
- New count functions: crosstab(), count_na()
- New data transformations: scale_mad(), winsorize()
- New date functions: floor_month(), floor_week(), floor_days(), days_diff()
- Other new functions: asterisks(), print_n(), print_all(), na_to_null(), set_ggplot_opts()
- Added optional name argument to in_excel()
