feat: timezone conversion and assertion#170
feat: timezone conversion and assertion#170etiennebacher merged 19 commits intoetiennebacher:mainfrom atsyplenkov:lubridate
Conversation
|
Thanks, just so you know I won't be able to review this for a few days |
|
Hi @atsyplenkov, the PR looks good but I'd like the testing to be a bit more thorough. I have been adding more property-based testing to If you're not familiar with property-based testing in R, I have a blog post on this: https://www.etiennebacher.com/posts/2024-10-01-using-property-testing-in-r/ For example, for patrick::with_parameters_test_that(
"changing timezone works", {
for_all(
tests = 20,
datetime = posixct_(any_na = TRUE),
property = function(datetime) {
test_df <- data.frame(x1 = ymd_hms(datetime, tz = "UTC"))
test <- pl$DataFrame(x1 = ymd_hms(datetime, tz = "UTC"))
expect_equal_or_both_error(
mutate(
test,
x1 = force_tz(x1, tz)
),
mutate(
test_df,
x1 = force_tz(x1, tz)
),
tolerance = 1e-6
)
}
)
},
tz = list("Pacific/Auckland", "foobar", "", NA, c("Pacific/Auckland", "Africa/Abidjan"))
)Running this may already uncover some bugs in the current implementation. It's more tedious to write and to pass but when it does then we can be confident that Could you rework the tests to include this type of testing? Don't hesitate to ask if you run into some issues while doing this. Note that if you need to convert some polars expressions to R values for the |
|
Thanks for the review. I am not familiar with property-based testing, but your blog post looks promising. I'll revise the PR during this week and come back to you |
|
@etiennebacher I cannot yet write the property-based testing yet, because the current implementation of library(tidyverse)
devtools::load_all()
TimeZone <- base::OlsonNames()[100]
TimeZone
#> "America/Creston"
test_df <- data.frame(
dt_utc = c(
ymd_hms(
c(
"2012-03-26 12:00:00",
"2020-01-01 12:00:00",
"2023-12-14 12:00:00",
NA
),
tz = "UTC"
)
)
)
as_polars_df(test_df) |>
mutate(
dt_utc = with_tz(dt_utc, "America/Creston")
)
#> shape: (4, 1)
#> ┌───────────────────────────────┐
#> │ dt_utc │
#> │ --- │
#> │ datetime[ms, America/Creston] │
#> ╞═══════════════════════════════╡
#> │ 2012-03-26 05:00:00 MST │
#> │ 2020-01-01 05:00:00 MST │
#> │ 2023-12-14 05:00:00 MST │
#> │ null │
#> └───────────────────────────────┘
as_polars_df(test_df) |>
mutate(
dt_utc = with_tz(dt_utc, TimeZone)
)
#> Error in `mutate()`:
#> ! Error while running function `with_tz()` in Polars.
#> ✖ Invalid 'x' type in 'x && y'
#> Hide Traceback
#> ▆
#> 1. ├─dplyr::mutate(...)
#> 2. └─tidypolars:::mutate.RPolarsDataFrame(as_polars_df(test_df), dt_utc = with_tz(dt_utc, TimeZone))
#> 3. └─tidypolars:::translate_dots(...) at tidypolars/R/mutate.R:117:3
#> 4. └─base::lapply(...) at tidypolars/R/utils-expr.R:36:3
#> 5. └─tidypolars (local) FUN(X[[i]], ...)
#> 6. └─tidypolars:::translate_expr(...) at tidypolars/R/utils-expr.R:46:5
#> 7. └─tidypolars:::translate(...) at tidypolars/R/utils-expr.R:145:5
#> 8. └─base::tryCatch(...) at tidypolars/R/utils-expr.R:622:7
#> 9. └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#> 10. └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#> 11. └─value[[3L]](cond)
#> 12. └─rlang::abort(...) at tidypolars/R/utils-expr.R:638:13
I cannot wrap my head around on how to solve this issue. |
|
The issue is that when we handle I also just realized that This now works: patrick::with_parameters_test_that(
"changing timezone works", {
for_all(
tests = 20,
datetime = posixct_(any_na = TRUE),
property = function(datetime) {
test_df <- data.frame(x1 = ymd_hms(datetime, tz = "UTC"))
test <- pl$DataFrame(x1 = ymd_hms(datetime, tz = "UTC"))
expect_equal_or_both_error(
mutate(
test,
x1 = force_tz(x1, tz)
),
mutate(
test_df,
x1 = force_tz(x1, tz)
),
tolerance = 1e-6
)
}
)
},
tz = list("Pacific/Auckland", "foobar", "", NA)
)Can you also:
? |
|
Thanks! I'll do it. |
|
@atsyplenkov just FYI I plan on doing a new release before mid-March so let me know if you need some help to finish this PR before that. |
|
@etiennebacher Hey, sorry for taking so long. I have a very tight deadline at work and couldn't allocate enough time to finish this PR. I should have some time after the 25th of February, and I can submit my revisions then. |
|
No worries, I'm not in a rush at all, I just wanted to inform you in case those functions are important in some of your project . Take your time, good luck with work 🙂 |
|
Hey, just a quick update—I haven't forgotten about this PR, I'm just still extremely busy. |
|
No worries, finish when you can. I'll make a new release since there are interesting fixes and features I think. |
|
Gee, what a rabbit hole of the unrecognized timezone handling. TBH, I don't know what to do. The problem is that Therefore, I am keen to throw an early error when an unrecognized timezone is passed. This will be reflected in the number of property-based tests we can implement. See my comment there Apart from that (sorry for the useless commit name, I accidentally hit enter too early):
|
| @@ -1,3 +1,6 @@ | |||
| # TODO: | |||
| # Sometimes this test fails, e.g. try seed == 1 | |||
| set.seed(123) | |||
There was a problem hiding this comment.
On my machine, the "extracting date components works" test is failing with set.seed(1). Can you check if it the case with you as well?
There was a problem hiding this comment.
I think this is an error in polars:
date <- as.POSIXct("2493-05-15 00:00:00 CEST")
data.frame(x1 = date)
#> x1
#> 1 2493-05-15
polars::pl$DataFrame(x1 = date)
#> shape: (1, 1)
#> ┌─────────────────────┐
#> │ x1 │
#> │ --- │
#> │ datetime[ms] │
#> ╞═════════════════════╡
#> │ 2493-05-14 23:00:00 │ # <<<<<<<<< shows 2493-05-14 and not 2493-05-15
#> └─────────────────────┘There was a problem hiding this comment.
Good to know, you've got an eagle eye!
tests/testthat/test-funs-date-lazy.R
Outdated
| # TODO: see https://github.com/tidyverse/lubridate/issues/1186 | ||
| # If lubridate changed the behavior to error, then we can add "foo" and "" | ||
| tz = list("Pacific/Auckland", "US/Alaska", NA) | ||
| tz = list("Pacific/Auckland", "Europe/Rome", NA) |
There was a problem hiding this comment.
Looks like "US/Alaska" in not in OlsonNames()
Thanks for opening the upstream issue, I think this should error so I'll wrap up the PR, many thanks for the contribution! |
|
Awesome, thanks for your guidance and help! Just to let you know, I started wrapping some other |
|
Great, looking forward to seeing them! |
Hey @etiennebacher 👋
I decided to start by adding more support to
lubridatepackage. I can see that it is a bit less structured than the other code in your package, so I am unsure how to best write it. If you are happy with the following approach, I will continue fillingtidypolarswithlubridatefunctions.P.S. Note that I have added the
check_timezonefunction, which throws an error earlier if the timezone code is not supported bypolars. TIL that somepolarsfunctions allow a NULL timezone, while others do not.