RFC: Add features to reduce dependencies on core crate #15907

Open
@timsaucer

Is your feature request related to a problem or challenge?

Currently when you add the datafusion crate, it pulls in many dependencies that are not needed for all use cases. We have two specific projects in mind:

  • In Comet, the work is done at the level of the physical plan. It would be convenient not to have to pull in SQL parsing or the logical plan.
  • One customer has a use case where they will not be doing any SQL parsing and do not need any SQL support. Specifically, they are trying to build for WebAssembly, and the full set of imports significantly bloats the generated binaries.

The purpose of this issue is to discuss other use cases and where we might put these flags in.

Describe the solution you'd like

Add a few feature flags so that the dependency graph can be greatly reduced. From early discussions in the DataFusion community meeting, these might be sql and logical_plan, but you might imagine others.

Describe alternatives you've considered

No response

Additional context

During the community meeting, @mbutrovich suggested that query processing in a database typically breaks down into these general steps:

  • Parsing SQL
  • Building logical plans
  • Optimizing logical plans
  • Building physical plans
  • Optimizing physical plans
  • Execution

I hope I captured this correctly, but he suggested we could create two feature flags: one for the SQL parsing stage and one for building and optimizing logical plans. Since these stages run in order, anyone who opts in to sql would likely need everything that follows.
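
To make the ordering argument concrete, here is a minimal sketch that models the six stages and shows which ones each proposed flag would pull in. Everything in it is hypothetical: the Stage and Feature types and the stages_compiled function are illustrative names only, not DataFusion API, and the assumption that a build with neither flag starts at physical planning is my reading of the Comet use case rather than anything decided.

```rust
// Hypothetical model of the pipeline stages; none of this is DataFusion API.
// Variant order matters: the derived Ord follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Stage {
    ParseSql,
    BuildLogicalPlan,
    OptimizeLogicalPlan,
    BuildPhysicalPlan,
    OptimizePhysicalPlan,
    Execute,
}

const ALL_STAGES: [Stage; 6] = [
    Stage::ParseSql,
    Stage::BuildLogicalPlan,
    Stage::OptimizeLogicalPlan,
    Stage::BuildPhysicalPlan,
    Stage::OptimizePhysicalPlan,
    Stage::Execute,
];

// The two flags floated in the meeting (names are placeholders).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Feature {
    Sql,
    LogicalPlan,
}

// Earliest stage a build would need for a given flag.
// Assumption: with no flag, the build starts at physical planning.
fn entry_stage(feature: Option<Feature>) -> Stage {
    match feature {
        Some(Feature::Sql) => Stage::ParseSql,
        Some(Feature::LogicalPlan) => Stage::BuildLogicalPlan,
        None => Stage::BuildPhysicalPlan,
    }
}

// Because the stages run in order, opting in to an earlier stage
// implies every stage after it.
fn stages_compiled(feature: Option<Feature>) -> Vec<Stage> {
    let first = entry_stage(feature);
    ALL_STAGES.iter().copied().filter(|s| *s >= first).collect()
}

fn main() {
    for feature in [None, Some(Feature::LogicalPlan), Some(Feature::Sql)] {
        println!("{:?} -> {:?}", feature, stages_compiled(feature));
    }
}
```

Under this model, sql would imply logical_plan, so in a Cargo manifest the sql feature would presumably list logical_plan among the features it enables.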

These are the notes I tried to capture from the meeting.
