Conversation

sh-rp (Collaborator) commented Sep 4, 2025

Description

This PR

  • Allows the attach command to sync a pipeline from the destination if it is not found locally (see the sketch below).
  • Allows setting the port and host when launching the dashboard programmatically.
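A minimal usage sketch of both features; the pipeline name and the host/port parameter names are assumptions (only sync_if_missing and run_dashboard appear in the diff below):

    import dlt

    # attach to a pipeline; if no local working dir is found, fall back to
    # syncing the state from the destination (the new sync_if_missing flag)
    p = dlt.attach(
        pipeline_name="my_pipeline",  # hypothetical name
        destination="duckdb",         # the fallback sync needs to know where to look
        sync_if_missing=True,
    )

    # launching the dashboard programmatically on a custom host/port
    # (host/port parameter names are assumed from the PR description)
    # run_dashboard(pipelines_dir=None, edit=False, host="127.0.0.1", port=8080)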

netlify bot commented Sep 4, 2025

Deploy Preview for dlt-hub-docs canceled.

🔨 Latest commit: 1a6724f
🔍 Latest deploy log: https://app.netlify.com/projects/dlt-hub-docs/deploys/68c82a9ae603fd00085edc69

        dataset_name: str = None,
        sync_if_missing: bool = False,
        **injection_kwargs: Any,
    ) -> Pipeline:

sh-rp (Collaborator, Author) commented:

I'm not sure about the changes to def attach. Maybe we should have something like attach_remote with a reduced arg set that does this, and leave attach unchanged. A rough sketch of that alternative follows.
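A hypothetical sketch of that attach_remote alternative; the name, signature, and docstring are assumptions, not part of this PR:

    import dlt

    def attach_remote(
        pipeline_name: str,
        destination=None,
        dataset_name: str = None,
        pipelines_dir: str = None,
    ) -> dlt.Pipeline:
        """Attach to pipeline_name; if no local working dir exists, sync state from the destination."""
        ...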

    if extended_info:
        d_t_node = call_args.arguments.get("destination")
        if d_t_node:
            destination = evaluate_node_literal(d_t_node)

sh-rp (Collaborator, Author) commented:

This does not work for any destination that is not a string literal.
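A hypothetical script illustrating the limitation; AST literal evaluation can resolve the first call but not the second:

    import dlt

    # resolvable: destination is a plain string literal
    p1 = dlt.pipeline("my_pipeline", destination="duckdb")

    # not resolvable from the AST: destination is a factory call, not a literal
    p2 = dlt.pipeline("my_pipeline", destination=dlt.destinations.duckdb("data.db"))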

Collaborator commented:

Here we could also parse the destination factory, but IMO we should not invest too much in AST parsing right now.

    pipeline_name = pipeline_info["pipeline_name"]
    pipelines_dir = pipeline_info["pipelines_dir"]

    dlt.attach(

sh-rp (Collaborator, Author) commented:

What can happen here is that a user wants to open the pipeline as defined in the script, but state from some other pipeline with the same name already exists on the local machine, and that one is opened instead.


    @utils.track_command("dashboard", True)
    def dashboard_command_wrapper(pipelines_dir: Optional[str], edit: bool) -> None:
    def dashboard_command_wrapper(

sh-rp (Collaborator, Author) commented Sep 4, 2025:

Alternatively to changing this top-level dashboard command, we could also allow the pipeline command to take a script file instead of a pipeline name. But that is probably quite confusing.

    if d_t_node:
        destination = evaluate_node_literal(d_t_node)
        if destination is None:
            raise CliCommandInnerException(

Collaborator commented:

Just warn here, to stay backward compatible. For example:
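A minimal sketch of the suggested backward-compatible branch (the import path and warning text are assumptions):

    from dlt.cli import echo as fmt  # assumed import path for dlt's CLI echo helpers

    destination = None  # stand-in for the evaluate_node_literal result
    if destination is None:
        # warn instead of raising CliCommandInnerException, so existing scripts keep working
        fmt.warning("Could not evaluate the destination argument: it is not a string literal")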


    run_dashboard(pipelines_dir=pipelines_dir, edit=edit)
    # if a pipeline script path is provided, we need to parse out pipeline info from the script and sync it
    pipeline_name: str = None

Collaborator commented:

I get what you do here, but this code should be executed by the deploy command (or not at all, because the deploy command has access to pipeline state and trace). I was possibly not specific when we discussed this:

  1. The workspace dashboard and the report notebook should attach in the same way; dlt.attach(pipeline_name) should be enough in both cases.
  2. It is the task of the deployment script to generate the additional information (from AST/state/trace) and add it to the job package. Here it may just emit env variables:

     PIPELINES__<pipeline_name>__DESTINATION_TYPE=...
     PIPELINES__<pipeline_name>__DESTINATION_NAME=...
     PIPELINES__<pipeline_name>__DATASET_NAME=...
     ...

attach will see them automatically, even without those parameters being passed; look at the code. A sketch of this mechanism follows below.

Now the big question is how we gather these parameters. If you are against runtime information, i.e. using state or trace, then we'll invest in AST parsing, but IMO it will never be as good.
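A minimal sketch of the suggested mechanism, assuming a hypothetical pipeline named my_pipeline (the variable layout is quoted from the comment above):

    import os
    import dlt

    # a deployment script would emit these into the job package environment
    os.environ["PIPELINES__MY_PIPELINE__DESTINATION_TYPE"] = "duckdb"
    os.environ["PIPELINES__MY_PIPELINE__DATASET_NAME"] = "my_dataset"

    # attach then resolves destination and dataset from config,
    # without those parameters being passed explicitly
    p = dlt.attach(pipeline_name="my_pipeline")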

        destination_name=injection_kwargs.get("staging_name", None),
    )

    pipeline_kwargs = {

Collaborator commented:

I think there's a cleaner way; I will leave a comment separately.

sh-rp force-pushed the feat/improved_synching branch from f1d5a8f to 46edfa0 on September 9, 2025 15:28
sh-rp force-pushed the feat/improved_synching branch 2 times, most recently from c9ddb93 to 3e8a486 on September 11, 2025 14:29
sh-rp force-pushed the feat/improved_synching branch from 3e8a486 to 2c8ed76 on September 11, 2025 14:29

rudolfix (Collaborator) left a comment:

To me it looks pretty OK. See my comments:

  1. We need to test the edge cases mentioned in my comments.
  2. Look at where attach is used. AFAIK we use it in the CLI to restore a pipeline:
    try:
        if verbosity > 0:
            fmt.echo("Attaching to pipeline %s" % fmt.bold(pipeline_name))
        p = dlt.attach(pipeline_name=pipeline_name, pipelines_dir=pipelines_dir)
    except CannotRestorePipelineException as e:
        if operation not in {"sync", "drop"}:
            raise
        fmt.warning(str(e))
        if not fmt.confirm(
            "Do you want to attempt to restore the pipeline state from destination?",
            default=False,
        ):
            return
        destination = destination or fmt.text_input(
            f"Enter destination name for pipeline {fmt.bold(pipeline_name)}"
        )
        dataset_name = dataset_name or fmt.text_input(
            f"Enter dataset name for pipeline {fmt.bold(pipeline_name)}"
        )
        p = dlt.pipeline(
            pipeline_name,
            pipelines_dir,
            destination=destination,
            dataset_name=dataset_name,
        )
        p.sync_destination()
        if p.first_run:
            # remote state was not found
            p._wipe_working_folder()
            fmt.error(
                f"Pipeline {pipeline_name} was not found in dataset {dataset_name} in {destination}"
            )
            return
        if operation == "sync":
            return  # No need to sync again
which looks like what you already implemented :)

    # set it as current pipeline
    p.activate()
    return p
    try:

Collaborator commented:

Please allow for an explicit dataset name in the args.

        return p
    except CannotRestorePipelineException:
        # we can try to sync a pipeline with the given name
        p = pipeline(pipeline_name, pipelines_dir, destination=destination, staging=staging)

Collaborator commented:

We can attempt a destination sync only if destination is set; otherwise re-raise the exception.

    except CannotRestorePipelineException:
        # we can try to sync a pipeline with the given name
        p = pipeline(pipeline_name, pipelines_dir, destination=destination, staging=staging)
        p.sync_destination()

Collaborator commented:

Note: this can raise PipelineStepFailed if the destination state is broken for some reason. That is OK.

Collaborator commented:

If p.first_run is True, it means there is no remote state. In that case you should wipe the pipeline working dir that was created by dlt.pipeline and re-raise the original exception. A sketch combining the last few comments follows below.
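Putting the last few review comments together, a hedged sketch of the fallback path (not the final implementation; _wipe_working_folder is the private helper used in the pipeline_command snippet above):

    import dlt
    from dlt.pipeline.exceptions import CannotRestorePipelineException

    pipeline_name, pipelines_dir = "my_pipeline", None  # hypothetical values
    destination, staging = "duckdb", None

    try:
        p = dlt.attach(pipeline_name=pipeline_name, pipelines_dir=pipelines_dir)
    except CannotRestorePipelineException:
        # attempt the remote sync only when a destination is known; otherwise re-raise
        if destination is None:
            raise
        p = dlt.pipeline(pipeline_name, pipelines_dir, destination=destination, staging=staging)
        p.sync_destination()  # may raise PipelineStepFailed if the remote state is broken
        if p.first_run:
            # no remote state was found: wipe the working dir dlt.pipeline just created
            # and surface the original exception
            p._wipe_working_folder()
            raise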

Collaborator commented:

See my review message; it looks like this is already implemented in pipeline_command.

sh-rp changed the title from "PoC: Improved dashboard launching and database synching" to "Improved pipeline attach command and Dashboard launcher extensions" on Sep 15, 2025
sh-rp requested a review from rudolfix on September 15, 2025 11:09
sh-rp self-assigned this on Sep 15, 2025

rudolfix (Collaborator) left a comment:

LGTM!

rudolfix marked this pull request as ready for review on September 15, 2025 17:50
sh-rp merged commit e0c6d20 into devel on Sep 16, 2025 (67 checks passed)
sh-rp deleted the feat/improved_synching branch on September 16, 2025 11:28