Defaults for the storage, retrieval, and memory settings #1426

@wlandau

Description

Discussed in #1372

Originally posted by wlandau November 10, 2024

In highly parallel workloads, I almost always recommend storage = "worker" and retrieval = "worker" so that the parallel workers save and load the data themselves instead of putting that burden on the main process. The default for both is "main" only because I originally envisioned cloud-based workloads where data would need to travel over the network back to the main process. In practice, cloud pipelines almost always have some kind of object storage or database storage, such as an S3 bucket, which workers can reach directly. So I think "worker" should become the default for both. Before making the change, I would like to check with those of you who follow the discussions: would this have any unintended consequences for your pipelines?
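
For anyone who wants to try the proposed behavior before any default changes, here is a minimal sketch of opting in explicitly in `_targets.R`. The target names and commands are hypothetical placeholders, and the sketch assumes a pipeline that would normally also configure parallel workers (omitted here); only the `storage` and `retrieval` arguments of `tar_option_set()` and `tar_target()` are the point.

```r
# _targets.R (illustrative sketch; target names and commands are made up)
library(targets)

# Opt in pipeline-wide: workers save and load target data themselves
# instead of routing it through the main process.
tar_option_set(
  storage = "worker",
  retrieval = "worker"
)

list(
  # Hypothetical targets that inherit the pipeline-wide settings above.
  tar_target(raw_data, data.frame(x = seq_len(10))),
  tar_target(model_input, raw_data[raw_data$x > 5, , drop = FALSE]),
  # A single target can still override the settings, for example to keep
  # the current "main" behavior for one small result.
  tar_target(
    summary_stat,
    mean(model_input$x),
    storage = "main",
    retrieval = "main"
  )
)
```

If the defaults do change, pipelines that set these options explicitly, as above, should behave the same before and after. Pipelines that rely on the current defaults are the ones worth checking.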
