How to efficiently schedule per-user Dagster jobs? #32843
Unanswered
Riahiamirreza asked this question in Q&A
Replies: 1 comment
Hey @Riahiamirreza,
I’m building a Dagster orchestration pipeline to process heart-rate (time-series) data for my users. Each user’s data is processed via a separate job. However, many users have little or no new data between ticks, so I don’t want to launch unnecessary runs.
Here’s my current challenge:
- I use a `ScheduleDefinition` with a cron schedule (every 10 minutes).
- Its `execution_fn` runs heavy logic: for each user, it checks whether there is new data since the last run. This check is expensive because there are many users and each one requires its own database query; it takes more than 60 seconds, so the schedule evaluation always times out.

My question:
Is this an anti-pattern in Dagster (doing heavy domain logic inside `execution_fn`)? What is the recommended best practice:

- Keep the check inside the schedule's `execution_fn` and increase the timeout (if possible)?
- Launch a run for every user on each tick and do the new-data check inside the job itself?

The second approach is much simpler, and I initially implemented it, but it soon failed. The number of jobs was too large, and after a couple of days it produced a huge backlog of queued runs (about 500k) that stalled the system: every time Dagster wanted to launch a run, it fetched ALL the queued runs from the database and sorted them by priority, and with that many queued runs there was a long delay before new runs could start.
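To make the cost concrete, here is roughly the difference between the per-user check I run today and a batched version (a plain-Python/SQLite stand-in, not my actual code; the `heart_rate` table and its columns are hypothetical):

```python
import sqlite3

def users_with_new_data_naive(conn, user_ids, since):
    # Current approach: one query per user -- O(users) round trips,
    # which is what blows past the 60 s schedule-evaluation timeout.
    fresh = []
    for uid in user_ids:
        row = conn.execute(
            "SELECT 1 FROM heart_rate WHERE user_id = ? AND recorded_at > ? LIMIT 1",
            (uid, since),
        ).fetchone()
        if row:
            fresh.append(uid)
    return fresh

def users_with_new_data_batched(conn, since):
    # Batched alternative: a single query returns exactly the users
    # that have at least one sample newer than `since`.
    rows = conn.execute(
        "SELECT DISTINCT user_id FROM heart_rate WHERE recorded_at > ?",
        (since,),
    ).fetchall()
    return sorted(r[0] for r in rows)
```

With the batched form the tick does one round trip regardless of the number of users, so the evaluation time stops scaling with the user count.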
I found `DAGSTER_SENSOR_GRPC_TIMEOUT_SECONDS`, but I'm not sure whether that applies to schedules as well.
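If the evaluation itself can be made cheap, what I'm imagining is a cursor plus stable run keys so each tick is idempotent and never re-queues work it has already requested. A plain-Python sketch of that tick logic (hypothetical names, not the Dagster API):

```python
def evaluate_tick(cursor_ts, samples):
    """Given the last cursor timestamp and (user_id, timestamp) samples,
    return run requests only for users with data newer than the cursor,
    plus the advanced cursor. A stable run key per (user, cursor window)
    lets the scheduler skip duplicates instead of queueing them again.
    Timestamps are ISO-8601 strings, so string comparison orders them."""
    new = [(u, t) for (u, t) in samples if t > cursor_ts]
    if not new:
        return [], cursor_ts  # nothing to do this tick
    new_cursor = max(t for _, t in new)
    requests = [
        {"run_key": f"{u}@{new_cursor}", "user_id": u}
        for u in sorted({u for u, _ in new})
    ]
    return requests, new_cursor
```

Because a repeated tick over the same data advances nothing and emits no requests, the queue can never accumulate duplicate runs for idle users.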