Question: why is the Visitor trait limited to statements, relations & expressions? #934
Comments
The core reasons are keeping this crate easy to extend with new SQL syntax by casual contributors, and limited maintenance bandwidth. The current visitor pattern / macros require a single new In terms of your proposed implementation, it seems reasonable. I would be interested in knowing how it would work when a contributor added a new node, like #897. How would we ensure that these new nodes were added to the perhaps @lovasoa and @tustvold have some other perspectives. Basically I think a general purpose rewriter / visitor would be a great addition, as long as we don't make it too hard to add new code to this repo |
@freshtonic AST visiting in sqlparser-rs has a long history. It took a long time for a design to be agreed on and merged. I had proposed an alternative, more general approach that would have allowed you to do what you want: #601. But it was rejected, for fear it would take too much work to maintain. My proposal to bring on new maintainers for the feature (myself and @nikis05) was rejected. sqlparser-rs is at the very core of SQLPage, so I can reiterate my proposal: I wouldn't mind being added as a maintainer here :) |
Indeed -- I am looking for help maintaining it for sure. I think you can do all the time-consuming parts of maintenance (PR reviews, answering questions / tickets, etc.) without having write access to the repository. I am waiting for someone to actually show the initiative to do this work (there are plenty of people willing to write code). I had a promising candidate in @AugustoFKL (see #808) but sadly there were some circumstances that required them to stop working here. I will write up a discussion explaining the current state of affairs. Update: https://github.com/sqlparser-rs/sqlparser-rs/discussions/940 |
That's a great question. Assuming the parent node of the new node has a derived Visitor instance then compilation would fail due to the derived code not being able to find a matching Ideally, the |
You don't need to create a huge enum. You can use the built in Any trait. Have a look at my PR ;) |
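As a rough illustration of the `Any`-based idea mentioned above, here is a minimal sketch. The node types (`Ident`, `Select`), the `AnyVisitor` trait, and the hand-written `walk_select` traversal are all hypothetical stand-ins made up for this example; they are not the sqlparser-rs types, nor the code from the PR referred to in the comment.

```rust
use std::any::Any;
use std::ops::ControlFlow;

// Hypothetical stand-ins for AST node types (not the real sqlparser-rs types).
#[derive(Debug)]
struct Ident(String);

#[derive(Debug)]
struct Select {
    projection: Vec<Ident>,
    from: Ident,
}

// A single-method visitor that receives every node as `&dyn Any`,
// so the trait does not change when node types are added or removed.
trait AnyVisitor {
    type Break;
    fn pre_visit(&mut self, node: &dyn Any) -> ControlFlow<Self::Break>;
}

// Hand-written traversal for the sketch; a derive macro could generate this.
fn walk_select<V: AnyVisitor>(select: &Select, v: &mut V) -> ControlFlow<V::Break> {
    v.pre_visit(select)?;
    for ident in &select.projection {
        v.pre_visit(ident)?;
    }
    v.pre_visit(&select.from)
}

// Example visitor: collects identifier names, ignoring every other node type.
struct IdentCollector(Vec<String>);

impl AnyVisitor for IdentCollector {
    type Break = ();

    fn pre_visit(&mut self, node: &dyn Any) -> ControlFlow<()> {
        // Downcasting is how an implementation recovers the concrete type.
        if let Some(Ident(name)) = node.downcast_ref::<Ident>() {
            self.0.push(name.clone());
        }
        ControlFlow::Continue(())
    }
}

fn collect_idents(select: &Select) -> Vec<String> {
    let mut collector = IdentCollector(Vec::new());
    let _ = walk_select(select, &mut collector);
    collector.0
}

fn main() {
    let select = Select {
        projection: vec![Ident("a".to_string()), Ident("b".to_string())],
        from: Ident("t".to_string()),
    };
    println!("{:?}", collect_idents(&select));
}
```

The trade-off in this style is that every visitor implementation has to downcast before it can do anything useful with a node.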
@lovasoa depends on if you want to |
FWIW #951 added visiting TableFactor as well |
Update: I've been experimenting with various implementation strategies to generalise the AST visitor. Nothing ready to show yet, but this is business critical for my company so it's a high priority to figure out something that works well. I'll update this issue with a link to a draft PR, hopefully in a matter of days. |
Progress update - I have a working generalised visitor implementation. I'll put up a PR later this week when I've done some more testing. |
Is this still happening? Generalised visitor implementation would be appreciated. |
@osaton @alamb yes this is still happening. Apologies for the lack of comms. TL;DR I ended up taking quite a different approach to what I originally planned, due to our (the company I work for) evolving needs. The comprehensive Visitor implementation will be initially released as its own crate. My original intention was to release a PR against Summary of features so far:
In order to pull that off, I needed to write a I need to write up some docs but I would love some feedback on the code and general approach before I publicly publish the repo. I can share it privately if either of you are interested in taking a look. |
Three features that I really wanted from
It's possible for multiple AST nodes of the same type, at different places in the tree, to have the same hash and compare as equal. This seems counterintuitive, but two syntactically identical nodes can be semantically different. Not having a robust (and cheap) means of uniquely identifying a node makes building HashMaps of derived metadata keyed by node particularly tricky.
This would be super useful for debugging and reporting errors nicely (like
It would be nice to not have to match an enum in order to pluck out a variant. Instead I'd like to be able to implement 1 & 2 could be solved by wrapping every struct field or enum variant field in a |
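To make the node-identity problem from the first point concrete, here is a small sketch. `Ident` is a hypothetical stand-in node type. Keying by address is only one possible workaround, chosen for illustration; it is not the wrapper approach suggested in the comment, and it is only valid while the AST is neither moved nor mutated.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for an AST node type that derives Eq + Hash.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Ident(String);

// Keying metadata by node value: syntactically identical nodes collapse
// into a single entry, even though they sit at different tree positions.
fn entries_keyed_by_value(nodes: &[Ident]) -> usize {
    let mut metadata: HashMap<&Ident, usize> = HashMap::new();
    for (position, node) in nodes.iter().enumerate() {
        metadata.insert(node, position);
    }
    metadata.len()
}

// Keying by address instead: each node gets its own entry, but the keys
// are only meaningful while the nodes stay where they are in memory.
fn entries_keyed_by_address(nodes: &[Ident]) -> usize {
    let mut metadata: HashMap<*const Ident, usize> = HashMap::new();
    for (position, node) in nodes.iter().enumerate() {
        metadata.insert(node as *const Ident, position);
    }
    metadata.len()
}

fn main() {
    // Two occurrences of the same identifier, e.g. in different subqueries.
    let nodes = vec![Ident("id".to_string()), Ident("id".to_string())];
    println!("by value:   {}", entries_keyed_by_value(&nodes));
    println!("by address: {}", entries_keyed_by_address(&nodes));
}
```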
@freshtonic Your visitor crate sounds perfect for my use case. I would love to try it but I'm not sure if I'm able to give any constructive criticism about the code as I'm fairly new to Rust and systems programming languages in general.
I don't know if this can be done without breaking changes, but having both 1 and 2 would be amazing. |
+1 to this – would be curious to try out your crate @freshtonic if/when it gets to a publishable state :) |
@alamb In the meantime, would you be open to a PR that basically just adds |
@ryb73 I built Note that the docs build currently fails on docs.rs due to a very clunky build step to work around being unable to derive traits for foreign types in Rust. The workaround I'm considering is to fork We're using it in production at CipherStash so it's "production ready" in that sense but we'd appreciate some community feedback and PRs to fix/improve it :) |
Sorry for the late reply -- Option 2 above seems like a reasonable idea to me. @iffyio does that sound reasonable to you? |
Yeah I think option 2 above sounds reasonable! |
FWIW I quickly implemented option 1 here in the meantime, if you want to take a look. That's indeed a large chunk of boilerplate code. Does anyone want to work on option 2? Otherwise I may be able to land something next week, but I don't want to guarantee anything. |
@ramnes anything I can do to help you make your Visitor changes into a PR? @alamb is there a realistic chance of @ramnes's work being merged? If it lands, it potentially means |
I can open a PR for this commit, but is this the direction we want to go in? I thought folks here wanted option 2 (which I didn't have the time to work on so far.) |
@ramnes I just realised it's not 100% clear to me what you mean by "Option 2" - are you referring to my comment about baked in span information in all nodes? Because that's something that's being gradually rolled out already. |
I meant option 2 here. Isn't it what @alamb and @iffyio meant as well? 😅 |
@ramnes thank you for clarifying! I'm 100% in favour of any style of
Option 1 is (subjectively) ugly but gets the job done. Option 2 has a cleaner trait but implementations must downcast the node to do anything useful with it. Option 3 is just as clean as 2 and downcasting is no longer required in every implementation, but The approach to I have been thinking about an alternative approach to avoid all of the explicit casting and just generate the code. Here's the playground Some background. 20 months ago(!) when I created this issue I was tasked with creating a type-inferencer for the CRUD portion of the SQL grammar as part of my job at CipherStash. What I didn't know then (but I know now) is that full AST coverage was not necessary for our particular problem. Currently the only AST nodes that we need to visit are: But I can see a need for access to more AST nodes as we extend our product into other areas of SQL analysis. There would always be workarounds: just implement a |
@ramnes oh and one more thing that's very important. For workloads such as analysing an AST (in my case, deriving type information for a subset of node types) a The problem is that That means the |
What is the reason for that particular design decision versus providing a more general `Visitor` implementation?

Two options for a generalised Visitor trait come to mind:

`pre_visit` + `post_visit`) with signatures like `fn pre_visit(&mut self, node: &AstNode) -> ControlFlow<Self::Break>`, where `AstNode` is an enum with a wrapper variant for every AST node type found in `src/ast/mod.rs` and can be `match`ed against.

Would the maintainers be interested in a PR that implements one of the above two approaches?
My preference would be for option 2 because it would not break the trait when node types are added/removed.
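For illustration, the `AstNode` enum approach described above could look roughly like the sketch below. Only the `pre_visit` signature comes from the issue text. The node types (`Ident`, `Expr`), the two-variant `AstNode`, and the hand-written `walk_expr` traversal are assumptions made up for this example; a real implementation would need a variant for every type in `src/ast/mod.rs` and would presumably generate the traversal with the derive macro.

```rust
use std::ops::ControlFlow;

// Hypothetical stand-ins for AST node types.
struct Ident(String);
struct Expr {
    column: Ident,
}

// One wrapper variant per AST node type; visitors `match` on it.
enum AstNode<'a> {
    Ident(&'a Ident),
    Expr(&'a Expr),
}

// Two methods cover every node type, so adding a node adds an enum
// variant rather than a new trait method.
trait Visitor {
    type Break;

    fn pre_visit(&mut self, _node: &AstNode<'_>) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }

    fn post_visit(&mut self, _node: &AstNode<'_>) -> ControlFlow<Self::Break> {
        ControlFlow::Continue(())
    }
}

// Hand-written traversal for the sketch.
fn walk_expr<V: Visitor>(expr: &Expr, v: &mut V) -> ControlFlow<V::Break> {
    v.pre_visit(&AstNode::Expr(expr))?;
    v.pre_visit(&AstNode::Ident(&expr.column))?;
    v.post_visit(&AstNode::Ident(&expr.column))?;
    v.post_visit(&AstNode::Expr(expr))
}

// Example visitor: records the nodes it enters, in order.
struct Trace(Vec<String>);

impl Visitor for Trace {
    type Break = ();

    fn pre_visit(&mut self, node: &AstNode<'_>) -> ControlFlow<()> {
        match node {
            AstNode::Ident(Ident(name)) => self.0.push(format!("ident {}", name)),
            AstNode::Expr(_) => self.0.push("expr".to_string()),
        }
        ControlFlow::Continue(())
    }
}

fn trace(expr: &Expr) -> Vec<String> {
    let mut visitor = Trace(Vec::new());
    let _ = walk_expr(expr, &mut visitor);
    visitor.0
}

fn main() {
    let expr = Expr { column: Ident("a".to_string()) };
    println!("{:?}", trace(&expr));
}
```

With this shape, a visitor whose `match` uses a wildcard arm keeps compiling when a new variant is added, while an exhaustive `match` turns the addition into a compile error at each visitor.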
Suggested approach:
`RawVisitor` trait (and `RawVisitorMut` trait) like this:
`RawVisitorAdapter`?) that accepts a `V: Visitor` generic argument and implements `RawVisitor` & `RawVisitorMut`, which calls the appropriate method on `V` (or none at all)
`Visit` derivation macros to generate code in terms of `RawVisitor` & `RawVisitorMut` instead of `Visitor`, like this: