Releases: ropensci/targets
Improved crew integration
targets 1.2.0
crew integration
- Do not assume S3 classes when validating crew controllers.
- Suggest a crew controller in the _targets.R file from use_targets().
- Make tar_crew() compatible with crew >= 0.3.0.
- Rename argument terminate to terminate_controller in tar_make().
- Add argument use_crew in tar_make() and add an option in tar_config_set() to make it configurable.
- Write progress data and metadata in target_prepare().
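As a rough illustration of the crew integration described above, the sketch below writes a small pipeline to a temporary directory, registers a crew controller, and runs it with tar_make(use_crew = TRUE). The local controller with 2 workers and the targets x and y are assumptions made for the example, not part of these notes, and the crew package must be installed.

```r
library(targets)

tar_dir({ # Work in a temporary directory so the example leaves no files behind.
  tar_script({
    # Assumption: a local controller with 2 workers; any crew controller could go here.
    tar_option_set(controller = crew::crew_controller_local(workers = 2))
    list(
      tar_target(x, seq_len(4)),
      tar_target(y, x * 2, pattern = map(x)) # One dynamic branch per element of x.
    )
  }, ask = FALSE)
  # use_crew = TRUE asks tar_make() to use the controller set above.
  tar_make(use_crew = TRUE)
  result <- as.numeric(tar_read(y))
})
```

Setting use_crew = FALSE in the same call should run the pipeline without the controller, which can help when checking whether a problem comes from the pipeline or from the workers.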
Other improvements
CRAN patch 3
targets 1.1.3
- Decide on nanonext usage in time_seconds_local() at runtime and not installation time. That way, if nanonext is removed after targets is installed, functions in targets still work. Fixes the CRAN issues seen in tarchetypes, jagstargets, and gittargets.
Remarks
R CMD check shows a NOTE with messages such as "#STDOFF 2:05:08.9". This is caused by an issue in the arrow package (apache/arrow#35594), which is in "Suggests:" in the DESCRIPTION file of targets. The NOTE will go away on its own when the next arrow is released to CRAN.
CRAN patch 2
targets 1.1.2
- Remove crew-related startup messages.
Remarks
R CMD check shows a NOTE with messages such as "#STDOFF 2:05:08.9". This is caused by an issue in the arrow package (apache/arrow#35594), which is in "Suggests:" in the DESCRIPTION file of targets. The NOTE will go away on its own when the next arrow is released to CRAN.
CRAN patch
targets 1.1.1
- Pre-compute cli colors and bullets to improve performance in RStudio.
- Use packageStartupMessage() for package startup messages.
Remarks
R CMD check shows a NOTE with messages such as "#STDOFF 2:05:08.9". This is caused by apache/arrow#35594 because the arrow package is in "Suggests:" in the DESCRIPTION file of targets. The NOTE will go away on its own when the next arrow is released to CRAN.
Major improvements to robustness, speed, and {crew} integration
targets 1.1.0
Bug fixes
- Send targets to the appropriate controller in a controller group when crew is used.
General improvements
- Call gc() more appropriately when garbage_collection is TRUE in tar_target().
- Add garbage_collection arguments to tar_make(), tar_make_clustermq(), and tar_make_future() to add optional garbage collection before targets are sent to workers. This is different and independent from the garbage_collection argument of tar_target(). In high-performance computing scenarios, the former controls what happens on the main controlling process, whereas the latter controls what happens on the worker.
- Add garbage_collection and seconds_interval arguments to tar_make(), tar_make_clustermq(), tar_make_future(), and tar_config_set().
- Downsize the tar_runtime object.
- Remove the 100 Kb file size cutoff for determining whether to trust the file timestamp or recompute the hash when checking if a file is up to date (#1062). Instate the "file_fast" format and the trust_object_timestamps option in tar_option_set() as safer alternatives.
- Consolidate store constructors.
- Allow crew controller groups (#1065, @mglev1n).
- Expose more exponential backoff configuration parameters through tar_backoff(). The backoff argument of tar_option_set() now accepts output from tar_backoff(), and supplying a numeric is deprecated.
- Fix the exponential backoff rules in the crew scheduling algorithm.
- Implement tar_resources_network() to configure retries and timeouts for internal HTTP/HTTPS requests in specialized targets with format = "url", repository = "aws", and repository = "gcp". Also applies to syncing target files across network file systems in the case of storage = "worker" or format = "file", which previously had a hard-coded seconds_interval = 0.1 and seconds_timeout = 60.
- Deprecate seconds_interval and seconds_timeout in tar_resources_url() in favor of the new equivalent arguments of tar_resources_network().
- Safely withhold a target from its crew controller when the controller is saturated (#1074, @mglev1n).
- Use exponential backoff when appending a target back to the queue in the case of a saturated crew controller.
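The backoff, network, and garbage collection settings above might be combined roughly as follows. This is a sketch under stated assumptions: the numeric values are illustrative rather than recommended, and the target fit and its command are hypothetical.

```r
library(targets)

# Exponential backoff bounds (seconds) and growth rate; values are illustrative.
backoff <- tar_backoff(min = 0.001, max = 0.1, rate = 1.5)

# Retry interval and timeout (seconds) for network requests and file syncing.
network <- tar_resources_network(seconds_interval = 0.25, seconds_timeout = 30)

tar_option_set(
  backoff = backoff, # Pass tar_backoff() output; a bare numeric is deprecated.
  resources = tar_resources(network = network)
)

# Worker-side garbage collection is requested per target (hypothetical target).
fit <- tar_target(fit, lm(mpg ~ wt, data = mtcars), garbage_collection = TRUE)

# Main-process garbage collection is requested separately when the pipeline
# runs, e.g. tar_make(garbage_collection = TRUE) as of this release.
```

Keeping the two garbage collection switches separate mirrors the distinction drawn in the notes: one acts on the main controlling process, the other on the worker that builds the target.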
Speedups
- Cache info about all of _targets/objects/ in tar_callr_inner_try() and update the cache as targets are saved to _targets/objects/ to avoid the overhead of repeated calls to file.exists() and file.info() (#1056).
- Trust the timestamps by default when checking whether files in _targets/objects/ are up to date (#1062). tar_option_set(trust_object_timestamps = FALSE) ignores the timestamps and recomputes the hashes.
- Write to _targets/meta/meta and _targets/meta/progress in timed batches instead of line by line (#1055).
- Reporters now print progress messages in timed batches instead of line by line (#1055).
- The summary and forecast reporters are much faster because they avoid going through data frames.
- Avoid tempfile() when working with the scratch directory.
- Use nanonext::mclock() instead of proc.time() when there is no risk of forked processes.
- Replace withr with slightly faster/leaner base R alternatives.
- Efficiently catch changes to the working directory instead of overburdening the pipeline with calls to setwd() (#1057).
- Invoke tar_options methods in the internals instead of tar_option_get().
- Avoid gsub() in store_init().
- Avoid repeated calls to meta$get_record() in builder_should_run().
- Mock the store object when creating a record from a metadata row.
- Avoid cli::col_none() to reduce the number of ANSI characters printed to the R console.
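For readers who prefer hash checks over the timestamp shortcut mentioned in the speedups, a minimal sketch of the two settings named in this release could look like the following. The option and format names are taken from these notes as written for this version, and the target raw_file with its path "data/raw.csv" is a hypothetical example.

```r
library(targets)

# Opt out of trusting timestamps for files in _targets/objects/ so that
# hashes are recomputed when checking whether targets are up to date.
tar_option_set(trust_object_timestamps = FALSE)

# Hypothetical target that tracks an input file with the "file_fast" format.
# The path is illustrative and does not need to exist to define the target.
raw_file <- tar_target(raw_file, "data/raw.csv", format = "file_fast")
```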
Integration with {crew}
targets 1.0.0
targets is moving to version 1.0.0 because it is significantly more mature than previous versions. Specifically,
- tar_make() now integrates with crew, which will significantly improve the way targets does high-performance computing going forward.
- All other functionality in targets has stabilized. There is still room for smaller new features, but none as large as crew integration, and none that will fundamentally change how the package operates.
Major improvements
- Support distributed computing through the crew package in tar_make() (#753). crew itself is still in its early stages and currently lacks the launcher plugins to match the clustermq and future backends, but long-term, crew will be the predominant high-performance computing backend.
Minor improvements
- Add a new store_copy_object() to the store class to enable "fst_dt" and other formats to make deep copies when needed (#1041, @MilesMcBain).
- Add a new copy argument to allow tar_format() formats to set the store_copy_object() method (#1041, @MilesMcBain).
- Shorten the output string returned by tar_format() when default methods are used.
- Add a change_directory argument to tar_source() (#1040, @dipterix).
- In format = "url" targets, implement retries and timeouts when connecting to URLs. The default timeout is 10 seconds, and the default retry interval is 1 second. Both are configurable via tar_resources_url() (#1048).
- Use parallelly::freePort() in tar_random_port().
- Rename a target and a function in the tar_script() example pipeline (#1033, @b-rodrigues).
- Edit the description.
CRAN patch
targets 0.14.3
- Handle encoding errors while trying to process error and warning messages (#1019, @adrian-quintario).
- Fix S3 generic/method consistency.
Error handling tweaks
targets 0.14.2
- Forward user-level custom error conditions to the top of the pipeline (#997, @alexverse).
- Link to the help page of the manual.
Bug fixes
targets 0.14.1
- Fix the command inserted for debug mode (#975).
- Set empty chunk options to ensure Target Markdown compatibility with the special "setup" chunk (#973, @KaiAragaki).
- Only store the first 50 warnings in the metadata, and cap the text of the warning messages at 2048 characters (#983, @thejokenott).
- Enhance the tar_destroy() help file (#988, @Sage0614).
- Implement destroy = "user" in tar_destroy().
Extend RNG seed functionality
targets 0.14.0
- Move #!/bin/sh line to the top of SLURM clustermq template file (#944, #955, @GiuseppeTT).
- Add new function tar_path_script().
- Rename tar_store() to tar_path_store() with deprecation.
- Rename tar_path() to tar_path_target() with deprecation.
- Add new function tar_path_script_support().
- Make Target Markdown target scripts dynamically locate their support scripts so the appropriate scripts can be found even when they are generated from one directory and sourced from another (#953, #957, @TylerGrantSmith).
- Allow user-side control of the seeds at the pipeline level. tar_option_set() now supports a seed argument, and target-specific seeds are determined by tar_option_get("seed") and the target name. tar_option_set(seed = NA) disables seed-setting behavior but forcibly invalidates all the affected targets except when seed is FALSE in the target's tar_cue() (#882, @sworland-thyme, @joelnitta).
- Implement a seed argument in tar_cue() to control whether targets update in response to changing or NA seeds (#882, @sworland-thyme, @joelnitta).
- Reduce the number of per-target AWS/GCP storage API calls. Previously there were 3 API calls per target, including 2 HEAD requests. Now there is just 1 for a typical target (unless dependencies have to be downloaded). Relies on S3 strong read-after-write consistency (#958).
- Update the tar_github_actions() workflow file to use @v2 (#960, @kulinar).
- Print helpful hints while debugging a target interactively (#961).
- Only attempt to debug a target when callr_function is NULL (#961).
- Make formats "feather", "parquet", "file", and "url" work with error = "null" (#969).
- Declare formats "keras" and "torch" superseded by tar_format(). Documented in the tar_target() help file.
- Declare formats "keras" and "torch" incompatible with error = "null". Documented in the tar_target() help file and in a warning thrown by tar_target() via tar_target_raw().
- Add a convert argument to tar_format() to allow custom store_convert_object() methods (#970).
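The seed controls described in this release can be sketched as below. The seed value 2024 and the target named simulation are assumptions chosen for illustration; only tar_option_set(), tar_option_get(), tar_cue(), and tar_target() come from the notes.

```r
library(targets)

# Pipeline-level seed: per the notes, each target's seed is determined by
# this value together with the target name.
tar_option_set(seed = 2024)
pipeline_seed <- tar_option_get("seed")

# A cue that keeps a target from rerunning only because its seed changed or is NA.
relaxed_cue <- tar_cue(seed = FALSE)
simulation <- tar_target(simulation, rnorm(10), cue = relaxed_cue)

# Disable seed setting. Per the notes, this invalidates the affected targets
# unless their cue has seed = FALSE, as the hypothetical target above does.
tar_option_set(seed = NA)
seed_disabled <- is.na(tar_option_get("seed"))

# Restore default options so later code in the session is unaffected.
tar_option_reset()
```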